AV1 is coming to WebRTC sooner rather than later. Apparently so is HEVC. It is an AV1 vs HEVC game now, but sadly, these codecs are unavailable to the “rest of us”.
WebRTC codec wars are something we’ve seen in the past. During the early days of WebRTC there were ongoing discussions on whether the mandatory video codec in WebRTC should be VP8 or H.264. The outcome was to have both of them mandatory to implement in browsers.
Fast forward to today, and life is simple. We have ubiquity and support across all browsers that have WebRTC in them, which is great.
We are now gearing up for the next fight. This one isn’t going to be between VP9 and HEVC, but rather between AV1 and HEVC.
If you are looking to learn more about the differences between the WebRTC codecs, be sure to read this article as well: 🎲 Which video codec to use in your WebRTC application? 🎲
Table of contents
- Why now?
- A brief history of WebRTC video codecs
- Where in the world is WebRTC VP9 video call?
- Apple’s appetite for HEVC in WebRTC
- WebRTC AV1 support in Google Duo
- A quick FAQ on the latest WebRTC video codecs
- WebRTC and the future of video codecs
- WebRTC differentiation: the next battlefield lines are being drawn
Why now?
COVID-19 is causing all communication vendors to fast forward and accelerate their roadmaps by 6-18 months. Those that don’t are going to be left behind on the other side of this pandemic.
This isn’t an attempt to scare anyone or to FUD people into doing things. It is just the way things are.
If you want to see how serious things are, just check what’s going on around you:
- Zoom is rolling out new features on almost a daily basis. They are plugging in their security gaps faster than most vendors can plan their roadmap, let alone develop anything
- Google is releasing features on Duo and Meet to drastically improve them. A lot of it hinges on machine learning but also on the latest coding technologies (more on that later, when we get to AV1 again)
- Most UCaaS vendors have launched their own video meeting service in the past 6 months. A lot of them in the last month. Many are now offering it for free
- Many vendors in the video space from all verticals are seeing a 10x or more increase in use
- There’s a race towards filling different gaps when comparing these meeting services versus Zoom
The AV1 vs HEVC angles here are VERY interesting.
HEVC requires royalties and is a licensing mess.
AV1 is so new it hasn’t even had an opportunity to cool down a bit after being taken out of the oven. Frankly? It is still half baked and requires a bit more cookin’ – and yet… it is now being rolled out in Google Duo.
The thing is, that 6 months back, video was nice to have. A feature that needs to be ticked in a long requirements list.
Today? Video first. All the rest comes later.
Zoom’s stock price and market cap are the best indicators of that change.
A brief history of WebRTC video codecs
In less than 10 years, we’ve witnessed 3 codec generations in WebRTC:
- VP8 / H.264
- VP9 / HEVC
- AV1
With each generation of codec introduced, CPU and memory requirements grow along with the complexity of the codec and the resulting quality for a given bitrate increases.
I’ve been working with H.264 since 200x. Probably somewhere in 2005. It was brand new at the time and was about to replace H.263 and all of its extensions.
Fast forward to around 2010, when you started seeing it deployed in almost all video conferencing room systems.
VP8 came to our lives along with WebRTC, in around 2012. It is comparable to H.264.
There are reasons to pick H.264 over VP8. And while hardware acceleration is more readily available in H.264 than VP8, it does pose challenges.
Both are probably at their peak right now when it comes to video calling:
- They are ubiquitous
- Readily available
- Understood and known, with vibrant ecosystems
- They can run on most CPUs
This is the tipping point, where a new video codec is being sought after.
👉 If you are using it today, you should be just fine. If you seriously want to be at the forefront of technology, right on the bleeding edge (and you will bleed – time, money and blood), then read on to your next alternatives.
And if you need to decide between VP8 and H.264, check out this free video course: H.264 or VP8?
It should have been a VP9 vs HEVC thing and not an AV1 vs HEVC thing.
The next best thing in video codecs was supposed to be VP9. VP9 is the replacement to VP8. HEVC is what comes next after H.264, and the intent was always for VP9 to be the alternative to HEVC.
As things go, VP9’s traits are just what you’d expect in a new codec generation:
- Better compression efficiency
- Higher complexity
- Scarcity of hardware acceleration (an issue still)
What VP9 was supposed to bring to the world is SVC – scalability. While VP8 supports only temporal scalability, VP9 was touted as a codec that would bring temporal, spatial and SNR scalability. With VP9 SVC we were supposed to improve resiliency of video as well as the ability to scale large group video calls better than ever before. This never really came to be, as some of these improvements have been left out of the official WebRTC APIs to this day.
👉 Need a boost and have a very good grasp of who is in a call before everyone joins? VP9 might be a good alternative for you.
- AV1 is the new kid on the block. An impossible dream coming true: vendors working together in a new Alliance of Open Media, working on a royalty free video codec. Something that was never heard of a few years ago and now feels like the new norm
- Starting with 7 founding members, this changed the dynamics of the WebRTC codec wars. Instead of having Google with VP9 on one side of the ring and the rest of the world on the other side with HEVC, it brought a team of large players to the royalty free side, standing behind the AV1 video codec
- Today, the alliance includes 48 members, including all browser vendors and most chipset vendors
- The focus in terms of comparisons is now AV1 vs HEVC
I’ve written at length about AV1 when the specification got released. You can learn about AV1 there.
There are those who believe AV1 is ready and has been ready for quite some time. Reality says otherwise. It isn’t for the faint of heart at this point. More on that – below.
👉 Adventurous? Go AV1!
Where in the world is WebRTC VP9 video call?
VP9 shipped in Chrome 48 for WebRTC. That was January 2016. 4 years later and it is safe to say that not many are using VP9 in WebRTC.
The two main places where VP9 is making sense?
- Google. Google Meet for example has been using VP9 for quite some time in its own calls
- Peer-to-peer calls. Just because it is easy to achieve
Once AV1 was announced, the debate began on whether one should even try to adopt VP9 or wait for AV1 instead. The majority are waiting for AV1. Laziness at its best (and what I would have selected as well if you’re wondering).
The other reason for delaying and skipping a generation is investment in VP9. Since everyone’s looking at AV1, VP9 is left with fewer eyeballs and developers improving it. Add to that the slow release of SVC support for it in Chrome and the fact that Safari still doesn’t support VP9 and you can understand the reluctance of going this route.
Apple’s appetite for HEVC in WebRTC
The big Apple is insatiable. Apple has been banking on HEVC for many years now, and where HEVC & WebRTC fit at Apple has been a topic here in the past as well.
In Apple’s release notes for Safari Technology Preview 104 there’s a bullet point that shows where things are headed:
Added initial support for WebRTC HEVC
I wonder whatever for?
- Apple is a founding member of the Alliance of Open Media, so it is banking on AV1 as the future video codec
- In iOS 11 (2017), Apple introduced HEVC to its devices. That was done with the addition of hardware acceleration
- Android devices usually don’t have HEVC hardware support, and licensing being as tough and expensive as it is, this is a continued differentiator for Apple
- Google will be reluctant to add HEVC to Chrome. So would Mozilla. Not sure what Microsoft’s stance would be on this one
- Apple isn’t playing an AV1 vs HEVC game, but rather an AV1 and HEVC game, and they are alone in that at the moment
- Apple isn’t especially strong or dominant in the WebRTC space. Safari is the worst browser these days in terms of WebRTC support, with users already used to switching to Chrome on Mac. What would adding HEVC to WebRTC Safari add? Especially when there are so many other, more basic things to fix and improve in Safari WebRTC support…
To me, this is the biggest conundrum at the moment. A piece of this puzzle is missing. What would make developers use HEVC if it is only available in Safari and nowhere else? This isn’t the app store. It is the web.
Time will tell.
WebRTC AV1 support in Google Duo
I said it before and I’ll reiterate it. AV1 is too new. Too early to be adopted in WebRTC or real time communications. And yet… Google just announced supporting AV1 in Google Duo:
[…] in the coming week, we’re rolling out a new video codec technology to improve video call quality and reliability, even on very low bandwidth connections.
They made sure to add a nice moving GIF so you can see the difference between “a video codec” and AV1 at the same bitrate.
Is that other codec VP8? VP9? H.264? HEVC? Maybe H.261…
Are they using it for all Duo calls? In all devices? In all network conditions?
The only thing I could find is that this rolls out to Android with iOS 2 weeks behind in the roll out. There are more things left unsaid.
Some thoughts here
- AV1 doesn’t have hardware acceleration on smartphones. Maybe on 1 or 2 very new ones (I doubt it), and even then, the hardware would still be buggy as hell – especially for real time video, which is different than just camera recording or playing YouTube videos
- This means that going to HD resolutions with AV1 on smartphones is going to be brutal to CPU, battery life and device temperature. This isn’t where AV1 support in Duo is going
- This leaves us with the low bitrate scenario – probably VGA or lower. Maybe even a quarter of that (QVGA)
- It is where AV1 is going to shine in 2020 and into 2021
We’re all stuck at home burning the networks. The large streaming vendors are lowering resolutions (and bitrates) for their default players in certain countries. This reduces the CPU load, making room for improving quality on lower bitrates. And that leads to the ability (and need) of better video codecs.
Why not VP9?
Google Duo most probably already makes use of VP9. Maybe even HEVC on iOS devices due to hardware acceleration benefits. When it comes to 1:1 sessions, there’s no real reason to stick to a single video codec for all sessions.
With Apple working publicly now on HEVC in WebRTC, it put pressure on Google, and getting AV1 into Duo in order to bolster their side in the AV1 vs HEVC debate became a pressing matter. Google Duo’s 1:1 call scenarios were the most suitable candidate for Google to make that stand.
When a new video codec generation was introduced, the thinking was simple: “we are expecting it to support a higher resolution, at a higher bitrate, with a higher CPU consumption”
- Higher resolution, because let’s face it – QVGA sucked in 1995 and we were still using it in 2000 in video conferencing. So each generation had to get 4 times the pixels the previous one was capable of dealing with
- Higher bitrate, because at 4 times the pixels we couldn’t really get 25% the size, so there was an expectation of needing more bandwidth for the content we wanted to use
- Higher CPU consumption, because we were adding more work to the encoder and decoder
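To put rough numbers on that old line of thinking, here is a small sketch. The 50% savings figure and the idea that bitrate scales linearly with pixel count are simplifying assumptions of mine for illustration, not measurements, and the helper names are hypothetical.

```typescript
interface Resolution {
  width: number;
  height: number;
}

const QVGA: Resolution = { width: 320, height: 240 };
const VGA: Resolution = { width: 640, height: 480 };

// Total number of pixels in a single frame.
function pixels(r: Resolution): number {
  return r.width * r.height;
}

// Bitrate of the new generation relative to the old one, ASSUMING bitrate
// grows linearly with pixel count and the new codec saves `codecSavings`
// (0..1) at equal resolution and quality.
function relativeBitrate(pixelRatio: number, codecSavings: number): number {
  return pixelRatio * (1 - codecSavings);
}

const ratio = pixels(VGA) / pixels(QVGA);
console.log(ratio);                       // 4 (VGA has 4 times the pixels of QVGA)
console.log(relativeBitrate(ratio, 0.5)); // 2 (a 50% saving still doubles the bitrate)
```

Under these assumptions a codec would have to save 75% just to break even at 4 times the pixels, which is why each generation came with the expectation of more bandwidth.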
In 2020, things are changing.
Bigger is no longer better with video codecs
I have 4K resolution on my desktop and laptop. 1080p on my phone and TV. I am happy with 720p content most of the time. I hate fonts on a 4K screen that aren’t enlarged (the damn characters are just too small to read).
What is the value of higher resolution? HDR content? 8K? 360? VR? If all I need is just plain video, no higher resolution is required. We’re all content most of the time with 720p resolutions for business meetings anyway.
Resolution requirements for most content types and use cases are not going to get higher any time soon.
We are probably at peak resolution already.
So we are free to think of next gen video codecs as ones that help consume lower bitrates.
There’s a distinction here. While any new video codec generation consumes lower bitrates for the same resolution/quality, the main purpose of these new video codecs was almost always in increasing the resolution as well.
👉 AV1 on mobile makes perfect sense here. Especially for low resolutions – since we can have some CPU to spare for that scenario.
A quick FAQ on the latest WebRTC video codecs
Does WebRTC support HEVC?
No. Not officially.
Apple is adding support for it in Safari, but no other browser has added support for it or indicated plans to add support for it.
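If you want to check this for a given browser yourself, one way is to look at the codec list the browser reports. The sketch below assumes a list shaped like the `codecs` array returned by `RTCRtpSender.getCapabilities("video")`, and assumes HEVC shows up under the `video/H265` MIME type. The `supportsCodec` helper and the sample list are hypothetical.

```typescript
// Only the field this sketch needs from a browser codec capability entry.
interface CodecEntry {
  mimeType: string; // e.g. "video/VP8"
}

// Case-insensitive check for a MIME type in a capability list.
function supportsCodec(codecs: CodecEntry[], mimeType: string): boolean {
  const wanted = mimeType.toLowerCase();
  return codecs.some((c) => c.mimeType.toLowerCase() === wanted);
}

// In a browser you would feed it the real list, for example:
//   const caps = RTCRtpSender.getCapabilities("video");
//   const hasHevc = supportsCodec(caps ? caps.codecs : [], "video/H265");
// Here a hand-written list mimics a browser without HEVC.
const sampleCodecs: CodecEntry[] = [
  { mimeType: "video/VP8" },
  { mimeType: "video/H264" },
  { mimeType: "video/VP9" },
];

console.log(supportsCodec(sampleCodecs, "video/H265")); // false
console.log(supportsCodec(sampleCodecs, "video/vp9"));  // true
```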
Can I use HEVC with WebRTC?
Yes, but not in browsers.
Apple will introduce HEVC in Safari, but no other vendor will. If you build your own native application for either PC or mobile you can add HEVC as another supported codec and use it in your application.
Should I use AV1 in my WebRTC application?
That depends. If you want to add AV1, you need to make sure it fits your use case well, as well as the devices you expect your users to have.
You will also need to put a considerable investment of time and money to make it happen.
My suggestion for most vendors would be to wait with AV1 support.
Why isn’t VP9 widely used in WebRTC?
That is a good question with no good answer.
I believe it is a matter of timing. When the time came to adopt VP9, AV1 was already announced and on its way, so vendors preferred to wait and jump directly to AV1 instead of going for VP9.
VP9 doesn’t enjoy much hardware acceleration, which also makes it CPU intensive, requiring companies to tweak, fine tune and optimize their systems to use it. That kind of work is something many prefer not to do.
WebRTC and the future of video codecs
We’re at war again. The video codec war of WebRTC. And this time, each vendor needs to pick a strategy to play.
We’ve got multiple codecs in our warchest: VP8, H.264, VP9, AV1 and sometimes even HEVC now.
Which one will we be using?
Which ones will we be using?
Here, scenarios matter. Different scenarios will call for totally different video codec selection to optimize for quality, CPU use, performance, bitrate, cost, etc.
In 1:1 sessions, you may want to keep your options open – use the best one dynamically just by making a decision as the session is set up.
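As a minimal sketch of what “deciding at session setup” could look like: take the codec list the endpoint reports and reorder it by your own preference. The preference order and the `orderByPreference` name are assumptions made for this example. In a browser, the reordered list is the kind of thing you would hand to `RTCRtpTransceiver.setCodecPreferences()` before creating the offer.

```typescript
// Subset of the fields found in a browser codec capability entry.
interface CodecCapability {
  mimeType: string; // e.g. "video/VP8"
  clockRate: number;
  sdpFmtpLine?: string;
}

// Hypothetical preference order, newest codec first.
const PREFERRED_ORDER: string[] = ["video/AV1", "video/VP9", "video/H264", "video/VP8"];

// Returns a reordered copy: preferred codecs first, everything else
// (rtx, red, ulpfec, ...) kept at the end in its original order.
// Relies on Array.prototype.sort being stable (guaranteed since ES2019).
function orderByPreference(
  codecs: CodecCapability[],
  preferred: string[] = PREFERRED_ORDER
): CodecCapability[] {
  const wanted = preferred.map((p) => p.toLowerCase());
  const rank = (c: CodecCapability): number => {
    const i = wanted.indexOf(c.mimeType.toLowerCase());
    return i === -1 ? wanted.length : i;
  };
  return codecs.slice().sort((a, b) => rank(a) - rank(b));
}
```

Since the decision is made per session, the preference list itself can change from call to call, for example dropping AV1 from the list on devices you know will struggle with it.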
For group calls, will you be using a single, static video codec? Or allow for multiple ones? Will you have multiple codecs in a single group session? Are you going to have an SFU tweaked and tuned for that? Will you pick the best video codec for a session and then dynamically switch over as the nature of the session changes (=someone who has certain limitations joins or leaves)?
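One of these strategies, a single codec per session that gets re-evaluated as people join and leave, can be sketched as picking the most preferred codec that every participant supports. The `Participant` shape and `pickGroupCodec` name are hypothetical, and a real service would also need to weigh CPU, hardware acceleration and SFU support.

```typescript
interface Participant {
  id: string;
  codecs: string[]; // MIME types this participant can both send and receive
}

// Returns the first codec in `preferred` that all participants support,
// or undefined when there is no common codec.
// With an empty participant list, the first preference is returned.
function pickGroupCodec(participants: Participant[], preferred: string[]): string | undefined {
  for (const codec of preferred) {
    const wanted = codec.toLowerCase();
    const everyoneHasIt = participants.every((p) =>
      p.codecs.some((c) => c.toLowerCase() === wanted)
    );
    if (everyoneHasIt) {
      return codec;
    }
  }
  return undefined;
}

const preference = ["video/AV1", "video/VP9", "video/VP8", "video/H264"];
const room: Participant[] = [
  { id: "alice", codecs: ["video/AV1", "video/VP9", "video/VP8", "video/H264"] },
  { id: "bob", codecs: ["video/VP9", "video/VP8", "video/H264"] },
];
console.log(pickGroupCodec(room, preference)); // video/VP9

// Someone with more limited support joins, so the session falls back.
room.push({ id: "carol", codecs: ["video/VP8", "video/H264"] });
console.log(pickGroupCodec(room, preference)); // video/VP8
```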
What about consumers? What kind of video codec selection strategies are going to be prevalent there? How are they going to be different than the ones we see in enterprise solutions? What will be the difference for mobile first or application based versus web based solutions?
WebRTC differentiation: the next battlefield lines are being drawn
We live in interesting times.
Codec selection has never been more interesting or important.
While WebRTC offers 2 codecs (H.264 & VP8), most browsers support VP9 and now we’re seeing browser vendors either adding HEVC or using AV1 in their own apps. Audio now faces a similar challenge, with both Microsoft and Google introducing AI-powered voice codecs.
If media quality is at the core of your service (think carefully about your answer to this question), then rethinking your video codec selection strategy might be in order.
It is going to require research and investment. But this is where the future lies for video codecs in WebRTC.
Interesting article, brings back some memories… I think there is a small typo:
“VP8 came to our lives along with WebRTC, in around 2012. It is comparable to VP8.”
You probably meant “It is comparable to H.264”.
Fixed – thanks for catching this one.
What is the difference between codec in app versus codec on web
A codec on the web is defined and implemented by the browser vendors. You are limited to what they decided you can use.
Inside an app you are free to use whatever you want as you have full access and control over the application’s code.
720p ought to be enough for anybody
Maybe. There are those who need “more”. I am guessing that on 65″ displays or larger, 1080p would be better.
Sorry, but HEVC is dead. It isn’t in Firefox, it isn’t in Chrome. MPEG are rushing VVC in a vain attempt to salvage the situation before H264 goes out of patent in a couple of years.
It doesn’t matter if it’s in Safari. Apple doesn’t really care about video codecs: they only included HEVC to be up to date for their professional video users at a time when it was thought that HEVC would take off.
In reality H264 is good enough for 99% of use-cases so, just like there are better image formats than JPEG, there isn’t a non-geeky impetus to move forward. Well… unless you are YouTube or Netflix… and they picked VP9/AV1.
Now I know why my phone is struggling to play a 60fps 1080p AV1 video.
I think you’re missing the forest for the trees. Steve Jobs once said: “I’ve always been attracted to the more revolutionary changes. I don’t know why. Because they’re harder.”
AV1 was designed to integrate with the next wave of WebRTC video innovation: e2e encryption, SVC and codec-independent forwarding. So it’s not about the video codec, but rather the next generation architecture.
1. With WebRTC now incorporating e2e encryption via Insertable Streams (and SFrame), and NSA now recommending e2e security, conferencing systems need an RTP header extension to forward packets since the payload may be opaque. So if a browser and codec don’t support Insertable Streams or a forwarding header extension integrated with the next generation codec, they will not meet NSA requirements, and conferencing vendors won’t be able to provide full functionality.
2. SVC support is important for conferencing. AV1 has SVC built-in; in HEVC it is an extension. The Dependency Descriptor (defined in the AV1 RTP payload specification) is superior to the Framemarking RTP header extension for spatial scalability modes. If a browser (and next generation codec) doesn’t support SVC along with a forwarding header extension, it won’t be competitive.
3. AV1 includes screen coding tools as a basic feature, not an extension as in HEVC. This is a major competitive advantage for conferencing.
Bernard, thanks for this.
I haven’t even gone into technical superiority as I think it doesn’t matter at this point.
AV1 hardware acceleration exists in 0.01% of the compute devices that WebRTC is available on. HEVC probably in 5%?
Both codecs aren’t really ready for widespread use when it comes to real time communications outside of specific walled gardens and restricted scenarios. Both codecs aren’t implemented yet in web browsers (they have some kind of limited availability but not more).
Until AV1 is available everywhere (=a lot more than where you can find VP9 today) it will be fighting against HEVC.
So what to choose for WebRTC? We went with H.264, but some phones with MTK CPUs lack support 🙁
Is the general recommendation to put up an array of servers to transcode to VP8/VP9?
Most vendors use VP8 end-to-end today.
Thanks Tsahi, are there any USB cameras that can produce VP8 natively?
Not that I am aware of.
Hi. I just want to clarify that a LOT of android devices support HEVC. They have for several generations (flagships for about 7 years in decode albeit only 8bit in the beginning) to the point where it has trickled down throughout almost all of Qualcomm’s portfolio (even all smart displays on the market support 8bit HEVC). And I remember 5 years ago I had a cheap mediatek powered tablet (about $150) which also had HEVC support (10bit if I’m not mistaken).
And most support it in encode and decode now. In fact more Android devices support HEVC than VP9 (VP9 hardware support generally came some years after HEVC so more older devices lack it compared to HEVC).
Of course, whether communications programs employ it is another matter and there I have no idea (it would behoove them IMHO due to the numbers vs VP9 for compatibility’s sake, but who knows when it comes to royalties). Media playback is great though =)
Decoding alone isn’t enough and anything shy of coverage of enough smartphones isn’t good when you factor in royalties. This makes the use of something like HEVC unrealistic for most use cases and developers.