AudioCodes and the Opus Transcoding Challenge

By Tsahi Levent-Levi

October 18, 2013

Is AudioCodes’ route of skipping voice transcoding to Opus the right path?

VoIP vendors complain regularly about WebRTC. Most of the complaints revolve around the exact specifications and codecs used by WebRTC, and how their differences from other VoIP solutions out there cause too many interoperability headaches.

Guess what?

They are correct.

Guess what else?

No one cares. Or more like – WebRTC isn’t meant for these guys – it was meant for the web, so any interoperability or legacy considerations get second place at best.

Most times, when processing power is concerned, we (myself included) tend to focus on VP8 and H.264 – obviously, video processing is more complex than voice, so it is the bigger pain. Recently, though, I was made aware of the fact that Opus has its challenges.

As it seems, Opus requires substantial horsepower – to the point of making it hard to run on mobile alongside video when neither has specialized hardware assistance. Simply put – one of the reasons WebRTC on mobile is lagging behind is Opus’ performance requirements.

Opus 1.1 brings some good news in that regard. In its beta release, there’s the following text:

[…] it’s worth mentioning it’s not just a little bit faster. For example, 64kbit/sec stereo decode on ARM processors is currently 74% faster (42% less time) and encode is 27% faster (21% less time).

This would definitely help, and if someone knows how that compares against other voice codecs, I’d be very interested to hear.
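A side note on the quoted numbers: "faster" and "less time" are two views of the same measurement. The sketch below assumes that "N% faster" means throughput rose by N%, so the time per frame is divided by (1 + N/100). The function name `time_saved` is made up for illustration and is not part of Opus.

```python
def time_saved(speedup_percent: float) -> float:
    """Convert 'N% faster' (a throughput gain) into 'percent less time'."""
    ratio = 1 + speedup_percent / 100      # e.g. 74% faster -> 1.74x throughput
    return (1 - 1 / ratio) * 100           # share of the original time that is saved

for label, faster in (("decode", 74), ("encode", 27)):
    print(f"{label}: {faster}% faster -> {time_saved(faster):.1f}% less time")
```

Under that assumption, the results come out at about 42.5% and 21.3% less time, which is in line with the 42% and 21% in the release text.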

What this means is that interworking with SIP deployments comes with a cost – transcoding – and not a small one. That is why AudioCodes’ recent announcement is interesting. They have shown a call from an IP Phone directly to a WebRTC-supporting browser that needs no transcoding – Opus voice end-to-end. As far as I know, all previous interoperability efforts used G.711, so this one is a first.

Putting Opus on the IP Phone itself – in the end client – and not in an interworking function along the way reduces the overall cost of a system – and it is also where I think it makes the most sense. Sure – there are many legacy deployments where this won’t work, but it is an indication that SIP vendors need to wake up and start aligning themselves with the RFC and codec selection made by WebRTC, and they should be doing it in the clients and not only via a gateway.
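To make the trade-off concrete, here is a minimal sketch of how codec selection decides whether a gateway has to transcode. The codec lists are hypothetical and simplified. Real endpoints advertise their codecs in SDP offers and answers, and the function names `pick_codec` and `describe` are made up for this illustration.

```python
# Hypothetical, simplified codec lists in preference order (assumptions, not real SDP).
BROWSER_CODECS = ["opus", "PCMU", "PCMA"]        # WebRTC browser: Opus plus G.711
LEGACY_PHONE_CODECS = ["G729", "PCMU", "PCMA"]   # older SIP phone with G.711 support
OPUS_PHONE_CODECS = ["opus", "G722", "PCMU"]     # IP phone with Opus on board
G729_ONLY_CODECS = ["G729"]                      # endpoint with nothing in common

def pick_codec(offer, answer):
    """Return the offerer's most preferred codec that both sides support, or None."""
    supported = {codec.lower() for codec in answer}
    for codec in offer:
        if codec.lower() in supported:
            return codec
    return None

def describe(offer, answer):
    """Say whether the call can run end-to-end or needs a transcoding gateway."""
    codec = pick_codec(offer, answer)
    if codec is None:
        return "no common codec: a gateway must transcode"
    return f"{codec} end-to-end, no transcoding"

print(describe(BROWSER_CODECS, OPUS_PHONE_CODECS))    # opus end-to-end, no transcoding
print(describe(BROWSER_CODECS, LEGACY_PHONE_CODECS))  # PCMU end-to-end, no transcoding
print(describe(BROWSER_CODECS, G729_ONLY_CODECS))     # no common codec: a gateway must transcode
```

The middle case is the G.711 fallback that earlier interoperability demos relied on: no transcoding, but also none of Opus' wideband quality. Only the first case gives Opus on both ends.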

If you are a “legacy” VoIP vendor – stop complaining about compatibility with WebRTC – no one cares about you anyway. You have two routes:

Interworking via gateway, and paying the price of transcoding
Thinking of WebRTC’s capabilities across your deployment and architecture and not just as an access point

I think the second option makes more sense, and AudioCodes is showing how it should be done.


Billy Chia says:

October 18, 2013 at 8:38 pm

Nice post! I agree that vendors will need to get on board. Not sure I’d say, “no one cares” but it’s interesting to think about who cares and who doesn’t.
Just curious, I haven’t heard a lot of interoperability complaints – can you link to some? Thanks!

1. Tsahi Levent-Levi says:
  
  October 20, 2013 at 7:36 pm
  
  The complaints are verbal – wherever I go and explain about WebRTC, people start fussing about Opus and VP8 – how they will need transcoding – how they don’t exist in this or that system – etc
  
Michael Graves says:

October 18, 2013 at 10:34 pm

Hmmm…the press release is not very specific or detailed. Did they implement the Opus codec? Or something more? Signalling is still an issue. Might some kind of gateway be required even if the media stream doesn’t need to be transcoded?

1. Yossi Zadah says:
  
  October 21, 2013 at 5:45 pm
  
  Hi Michael,
  
  Thanks for your comment.
  Opus was implemented on the AudioCodes IP phone. For signaling we are using SIP. The server is handling all the required ICE functionality as well as bridging with WebSockets.
  
  Yossi Zadah (Solution Marketing, AudioCodes)
  
Lawrence Byrd says:

October 19, 2013 at 12:31 am

I am not yet fully convinced that your “no one cares about you anyway” and “who the F&#k cares about signaling” realism will be immediately persuasive to all the traditional organizations out there seeking direction :). But I’m probably wrong and I am sure it will!

Nonetheless, I agree with you and think that high-value use-cases like customer service and online customer sales and interaction are likely to be enough to justify the performance, low latency and (most importantly) the wideband audio quality of simply putting Opus on the agent/enterprise end as well. If your distributed agent interface is now embedded in a browser, then why not, since Opus is arguably the best codec in the browser? Even if you want to use physical phones, these are starting to support Opus as well, as we see from AudioCodes and, I am sure, others in the future. So I think certain use-cases will drive this and I like AudioCodes’ approach. For less valuable calls that need to end up on someone’s legacy SIP/H.323/Digital/Analog deskphone, then yes, there will need to be conversion, so I think you will see both approaches mixed, plus there’s also PSTN for that…

1. Tsahi Levent-Levi says:
  
  October 20, 2013 at 7:35 pm
  
  Lawrence,
  
  While you are correct, I am not someone who is known for being too polite or subtle about what I have to say. In my old age, I don’t think it is curable any longer.
  
  1. Lawrence Byrd says:
    
    October 20, 2013 at 11:20 pm
    
    Well someone in this new disruptive age has to be the noisy and unsubtle voice in the wind, so it’s good for all of us that you have graciously volunteered, Tsahi!! Own it.
    
Greg Cording says:

October 20, 2013 at 2:14 pm

Whilst I get the line that incumbent PBX / VoIP / UC vendors are somewhat irrelevant in that WebRTC was meant for the web, establishing a direct peer-to-peer session is not the complete picture in providing an end to end call or session flow.

Incumbents have many years of experience in developing Contact Center process flow / algorithms for agent selection and handling callers in queue etc.

Whilst WebRTC will undoubtedly dissolve the borders between web/internet initiated contact and the Enterprise Contact Center, elemental Call/Session processing will still be required for Queuing, Agent Selection based on skills etc.

So whilst from a developmental RFC viewpoint they might seem irrelevant, it is they that will ultimately adopt the RFC for their Contact Center solutions, and as a major stakeholder they are quite entitled to have a moan about getting WebRTC to a place where products can be built to leverage their 50-60 years’ worth of Contact Center experience.

1. Tsahi Levent-Levi says:
  
  October 20, 2013 at 7:31 pm
  
  Never said that ignoring incumbents is good for all, just that it is going to be the case for most.
  
  We people in the VoIP/UC industries see something that looks like voice traffic and we immediately tend to put SIP or IMS on it. Somehow, that isn’t correct most of the time now.
  
2. Lawrence Byrd says:
  
  October 20, 2013 at 11:58 pm
  
  My view is that Tsahi is making an extreme point in order to highlight the upcoming value shift. In any value shift the old world does not disappear; we are still running mainframes. In real life every SBC vendor will provide WebRTC gateways with all types of transcoding as needed and every existing contact center and UC vendor will provide WebRTC clients and integrations. So no one needs to feel left out :). But what will the nearly 20 million web-style developers be building in the cloud? What’s next in social media beyond Facebook and Twitter? What will it mean as contact centers move to cloud providers, like LiveOps, Five9, Zendesk and many more to come, supporting distributed-anywhere agents building around WebRTC standards since they are there, good and license free? Where will enterprises spend their money when every enterprise-social tool they use from Salesforce/Chatter, Jive, SocialCast, Box or whoever eventually embeds tightly integrated collaborative communications in new ways? There will be H.323/Skinny or SIP phones sitting on desks for as long as we are alive, but how much investment will be going into enhancing or even touching these? And how valuable will gateways back to these be? They had better be cheap! These are value shift questions – questions about how future enterprise investments may be different and how different vendors may be able to take advantage of the disruption.
  
Diego says:

November 22, 2013 at 6:36 am

For the last decade we saw endpoint vendors adding different codecs to their offerings:
G.711, G.729, G.723, iLBC. Then HD VoIP with G.722, AMR-WB, iSAC, SILK.
I don’t see why IP phone vendors will not follow the same pattern and now implement Opus natively. Most of the OTT players out there are using Opus already!
I think AudioCodes’ move is very smart and shows the path for the other “legacy” endpoint vendors to follow.

Joe Mauk says:

January 18, 2014 at 3:35 am

AudioCodes talks a good game (re: Opus codec). However, the fact remains that they have yet to release a phone that has the Opus codec. The press release regarding the 420HD and Opus is vaporware. Even more than 6 months after the announcement, there is no Opus codec loaded in any AudioCodes IP phone.
