AudioCodes and the Opus Transcoding Challenge

October 18, 2013
Is AudioCodes' route of skipping voice transcoding to Opus the right path?

VoIP vendors complain regularly about WebRTC. Most of the complaints revolve around the exact specifications and codecs WebRTC uses, and how their differences from other VoIP solutions out there cause too many interoperability headaches. Guess what? They are correct. Guess what else? No one cares. Or more like – WebRTC isn't meant for these guys – it was meant for the web, so any interoperability or legacy considerations get second place at best.

Most times, when processing power is concerned, we (myself included) tend to focus on VP8 and H.264 – obviously, video processing is more complex than voice, so it is the bigger pain. Recently, though, I was made aware that Opus has its own challenges. As it seems, Opus requires substantial horsepower – to the point of making it hard to run on mobile alongside video when neither has specialized hardware assistance. Simply put – one of the reasons WebRTC on mobile is lagging behind is Opus' performance requirements.

Opus 1.1 brings some good news in that regard. Its beta release comes with the following text:
[…] it's worth mentioning it's not just a little bit faster. For example, 64kbit/sec stereo decode on ARM processors is currently 74% faster (42% less time) and encode is 27% faster (21% less time).
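The two ways of stating each figure in that quote ("X% faster" and "Y% less time") are related by simple arithmetic: if throughput goes up by X%, the time per frame drops to 1 / (1 + X/100) of what it was. Here is a minimal sketch of that conversion. The function name `time_saved` is made up for illustration and is not part of Opus or its tooling.

```python
def time_saved(speedup_percent: float) -> float:
    """Convert "X% faster" (throughput gain) into "% less time" for the same work."""
    speed_ratio = 1.0 + speedup_percent / 100.0   # e.g. 74% faster -> 1.74x throughput
    return (1.0 - 1.0 / speed_ratio) * 100.0      # share of the original time no longer needed


# Figures taken from the Opus 1.1 beta quote above.
for label, faster in (("decode", 74.0), ("encode", 27.0)):
    print(f"{label}: {faster:.0f}% faster -> {time_saved(faster):.1f}% less time")
```

This gives about 42.5% less time for decode and 21.3% for encode, which is consistent with the roughly 42% and 21% in the quote.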
This would definitely help, and if someone knows how that compares against other voice codecs, I'd be very interested to hear.

What this means is that interworking with SIP deployments comes with a cost – transcoding – and not a simple one. It is why AudioCodes' recent announcement is interesting. They have shown a call from an IP Phone directly to a WebRTC-supporting browser that needs no transcoding – Opus voice end-to-end. All previous interoperability endeavors were done using G.711 as far as I know, and this one is a first.

Putting Opus on the IP Phone itself – in the end client – and not in an interworking function along the way reduces the overall cost of a system – and it is also where I think it makes the most sense. Sure – there are many legacy deployments where this won't work, but it is an indication that SIP vendors need to wake up and start aligning themselves with the RFCs and codec selection made by WebRTC, and they should be doing it in the clients and not only via a gateway.

If you are a "legacy" VoIP vendor – stop complaining about compatibility with WebRTC – no one cares about you anyway. You have two routes:
  1. Interworking via gateway, and paying the price of transcoding
  2. Thinking of WebRTC's capabilities across your deployment and architecture and not just as an access point
I think the second option makes more sense, and AudioCodes is showing how it should be done.
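To make the two routes concrete, here is a toy sketch of codec selection between a browser and an IP phone. It is not a real SDP stack and does not reflect AudioCodes' implementation. The function `pick_codec` and the three phone profiles are hypothetical, and the browser list assumes the usual WebRTC audio set of Opus plus G.711 (PCMU).

```python
from typing import Optional, Sequence


def pick_codec(offered: Sequence[str], supported: Sequence[str]) -> Optional[str]:
    """Return the first codec, in the offerer's preference order, that the other side supports."""
    supported_set = {name.lower() for name in supported}
    for name in offered:
        if name.lower() in supported_set:
            return name
    return None  # no common codec: a gateway has to transcode


# Assumed browser offer, in preference order.
BROWSER = ["opus", "PCMU"]

# Hypothetical phone profiles, for illustration only.
PHONES = {
    "opus-capable phone": ["opus", "PCMU", "G729"],
    "G.711 phone": ["PCMU", "G729"],
    "G.729-only phone": ["G729"],
}

for phone, codecs in PHONES.items():
    chosen = pick_codec(BROWSER, codecs)
    if chosen is None:
        print(f"{phone}: no common codec, gateway transcoding required")
    else:
        print(f"{phone}: {chosen} end-to-end, no transcoding")
```

Only the first profile gets Opus end-to-end. The second falls back to G.711, and the third pays the transcoding price in a gateway.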
