Here are the WebRTC trends and predictions you should expect in 2024. They are a continuation of what we’ve seen in 2023 with a few variations.
Time to look at what we’ve accomplished in 2023 and think about what’s ahead of us in 2024 when it comes to WebRTC.
When we look ahead, there are several notable things that glare at us immediately:
- WebRTC is here to stay. But for some use cases, the focus is shifting towards WebTransport+WebCodecs+WebAssembly
- The recession is here and it isn’t going anywhere, so expect a continuation of what we saw a year ago
- Generative AI is getting all the love and attention out there. It is also finding its way slowly into WebRTC services
Last year, I became CPO at Spearline. This year, Spearline got acquired by Cyara and I am now Senior Director of Product Management there. I am still delving into WebRTC and CPaaS. Still consulting a bit here and there on these subjects when it makes sense.
If you are interested, you can read my last year’s WebRTC predictions for 2023 😀
Let’s get started here…
Table of contents
- The video version
- The era of differentiation in WebRTC
- What does WebRTC use look like?
- WebRTC, open source and XaaS
- How did I do with my 2023 WebRTC predictions?
- WebRTC predictions for 2024
- 2024, here we come
The video version
This year, I took the liberty of also sharing my predictions in a video form. It holds the essence of my WebRTC predictions for 2024, in a short form.
Read on below to get into the details.
The era of differentiation in WebRTC
We are well into the era of differentiation:
I’ve had this slide done somewhere in 2020, modifying it a bit to fit the pandemic.
It is as relevant today as it was last year:
- We started off with WebRTC in an exploratory fashion, asking ourselves should we even use this technology?
- Then we saw a growth spurt, where it was obvious WebRTC is here to stay. The question changed to how do we use it
- That got us right into the age of differentiation, where services from different companies look so alike, using the same WebRTC interface and capabilities, that we now ask ourselves how do we compete
The answer to how we compete varies on a yearly basis. Now, it obviously revolves around generative AI and LLMs. That’s the easy answer. The truth is a lot more complicated and nuanced. It requires understanding where investments are currently made – both at Google and in the ecosystem around WebRTC and its use.
What does WebRTC use look like?
Last year I predicted usage would be 3 times higher than pre-pandemic. That meant lowering the use at the beginning of 2023 from 4 times to 3 times pre-pandemic. The end result? We stayed at around 4 times pre-pandemic usage.
From here, it can only go up, though slowly and linearly, and likely only after 2024:
- New use cases are unlikely to cause people to start doing more video calls
- Growth ahead will come from shifting on premise solutions to cloud ones and at the same time, migrating to WebRTC use
WebRTC, open source and XaaS
I am not going to touch the topic of open source here. I’ve done that in my article two weeks ago writing about the top WebRTC open source media servers on github.
XaaS requires a few words of explanation, and I am likely to cover them in the coming months in further detail in a separate article.
For me, XaaS is IaaS, CPaaS and SaaS. In all cases, it is a matter of looking at them from the prism of WebRTC APIs 👉 CPaaS.
The landscape is changing in the CPaaS domain. A few years back, the leading vendors for WebRTC APIs were Vonage, Twilio and Agora. Probably in this order.
Here’s what I had to say in last year’s predictions article:
The perceived leaders in WebRTC CPaaS are still Twilio, Vonage and Agora. I have a feeling that by the end of 2023 this will change.
Little did I know this would be spot on…
Twilio just announced in December that it is exiting the video business altogether. They still have and use WebRTC for their voice capabilities, mainly with a focus on call centers. But other than that? They just became irrelevant to many developers.
Most vendors are now likely to want to compare themselves to Vonage and Amazon Chime SDK. Agora probably as well.
From a perspective of innovation or specific market niches, other vendors come to mind as solid alternatives here. Companies such as Daily and Dolby for example (there are others – sorry for not mentioning everyone). Or LiveKit with its open source alternative.
- Twilio all but left the market a year ago, shifting focus to voice and text contact centers and CDPs. In December 2023 they announced the sunsetting of the Twilio Programmable Video service
- Vonage has been working on integrating machine learning pipelines into their SDKs, which is great
- Dolby doubled down on low latency streaming and high end audio requirements
- Daily leads in lowcode efforts and has been putting a lot of attention in the past year towards AI and partnerships
- Agora has just released a signaling SDK and introduced VP9 support
That change at Twilio places more strain on developers who need to choose who to use, with the added new risk of the level of commitment they see in the CPaaS vendor they choose. When someone like Twilio throws you under the bus, what can you expect from other vendors?
SaaS vendors are gravitating towards CPaaS, assuming for some unknown reason that there’s money to be had from developers.
There are a few that are taking this route.
The problem that I see here is the fact that Twilio decided this isn’t interesting enough. While they have the APIs – they don’t invest in them any further. Meaning it isn’t a big enough market for Twilio. In such an atmosphere, how would it be big enough for SaaS vendors? And how will they see an explosion in use of their infrastructure that they likely haven’t seen in SaaS?
Some of them may yet succeed, but the path here isn’t an obvious or a simple one.
Amazon, Microsoft, Google… and… Cloudflare.
- Amazon has Amazon Chime SDK
- Microsoft has Azure Communication Services
- Google has… nothing
- Cloudflare introduced WebRTC services throughout 2023
Let’s see where that takes us.
Amazon is investing in Chime SDK. Especially when it comes to audio quality and capabilities. In many ways, Amazon is shifting the attention of developers from CPaaS to their Chime SDK as a solid alternative. This is a trend that should be watched by CPaaS vendors and developers alike.
Microsoft seems content with their current offering of Azure Communication Services. There were no new or interesting announcements around it in 2023, which begs the question – is it important enough for Microsoft and a viable solution for developers?
Google announced APIs for Google Meet. Ones that integrate with it, but not ones that use its infrastructure for me to build my own video experiences. So no luck there for a CPaaS play. Time will tell if this changes. It is unlikely to happen in 2024.
Cloudflare entered the market with much fanfare. I covered them in 2023’s predictions. Since then, there have been no material announcements. Is that good? Bad? I just don’t know.
How did I do with my 2023 WebRTC predictions?
I spent quite a lot of time on my predictions in 2023. Let’s see how well I did.
#1 – libWebRTC (and the future of WebRTC)
I’ve made the prediction that Google’s WebRTC library will focus on house cleaning, optimizing and polishing collaboration. It did all that this year. We see this on an ongoing basis in our WebRTC Insights service.
What was interesting to note is a slight shift towards requirements coming from outside of Google Meet. There’s work being done to include H.265 support in libWebRTC, wherever H.265 is available in a hardware implementation form (i.e. someone is already paying the patent royalties bill).
Is that because Google was benevolent and nice? Is it because they wanted to show they aren’t a monopoly in Chrome? Is it because of some other deal with Intel (the ones pushing H.265 into WebRTC)? Or is it simply because they might end up using it in Google Meet in all-Apple devices meetings? Time will tell.
#2 – Machine learning and media processing
I assumed that WebAssembly would continue to be used with WebRTC for media processing in things like background replacement, noise suppression and proprietary codecs implementations.
Some of it was done in WebAssembly and at the browser level. A lot of it was relegated to the cloud or kept in native applications. What I found interesting is that some vendors chose to announce and release such solutions across all platforms at once, rather than start from native and move towards the web later.
Most interesting (and obvious) change here? A lot of this use is now being remarketed as generative AI – doesn’t matter if it is generative or not.
#3 – Voice before video (Lyra first, AV1 later)
The results are… inconclusive.
Webex did come out with a new Webex AI audio codec, with little explanation about it.
AV1 is starting to make real noises of almost-maturity, with Apple supporting AV1 hardware acceleration (for decoding only at the moment) and Google fiddling around with AV1 in Google Meet.
We didn’t hear much this year about Google’s Lyra or Microsoft’s Satin codecs. Just this new announcement of the new Webex AI codec. So I am not sure if voice happened before video or not.
#4 – Observability
Yes. There is more interest in observability. I know that by looking at our numbers in testRTC. There is no specific market or industry where it happens more. What I can say is that many contact centers are starting to take note. Probably due to their increased reliance on WebRTC and the fact that many contact center agents are working from home now.
#5 – M&As and shutdowns
We had a few interesting shutdowns and M&As. The most notable ones?
- Omegle shutdown
- Verizon closing Bluejeans
- Hopin got split, selling “Hopin” to RingCentral, keeping StreamYard
- Twilio shutting down Twilio Programmable Video – and then Jeff Lawson becoming Twilio ex-CEO 🤯
- Spearline was acquired by Cyara. Not necessarily because of WebRTC, but still
A lot of WebRTC engineers found themselves a new home. Either because their startups shut down, their company downsized or they saw no future where they were.
Good talent is there to be had if you look hard enough.
WebRTC predictions for 2024
Enough about 2023. That’s old news. Let’s see what’s going to happen with WebRTC in 2024 😎
#1 – libWebRTC (and the future of WebRTC)
I’ll start with the most important piece of our technology puzzle – libWebRTC, maintained by Google.
This year will be a continuation of last year. Mostly maintenance releases, with a few minor improvements. The places where we will see the most focus by Google in libWebRTC:
- Access to media frames, raw and encoded, via Insertable Streams. This will include optimizations and a bit more flexibility. The purpose of it all is to promote and push forward AI capabilities
- Collaboration. A continuation of last year. Some of it via Insertable Streams. Others through polishing of media control APIs in the browser to enhance the user experience
- Accommodating AV1. I believe by the end of 2024, we will finally see Google Meet using AV1 – we’ve just seen a glimpse of that. In some limited scenarios, on select device types. There’s also work being done to allow for VP9 simulcast with hardware acceleration instead of using VP9 SVC
- Voice AI. Google will put Lyra or similar into Google Meet itself. Either as a standalone or by somehow plugging it into Opus or similar. Maybe it will do so via Insertable Streams, but I doubt this will be the route they will take here
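To make the first bullet a bit more concrete, here is a minimal sketch of the kind of transform that can be plugged into Insertable Streams (WebRTC Encoded Transform). It only counts encoded frames and bytes and passes every frame through untouched, which is roughly where any analysis or AI step would hook in. The names `createFrameMeter`, `EncodedFrameLike` and `ControllerLike` are made up for illustration, and the wiring in the trailing comment assumes a browser that supports `RTCRtpScriptTransform`; it is not executed here.

```typescript
// Minimal stand-ins for the browser types, so the sketch has no DOM dependency.
// Encoded audio and video frames both expose their payload as `data`.
interface EncodedFrameLike {
  data: ArrayBuffer;
}

interface ControllerLike<T> {
  enqueue(chunk: T): void;
}

interface FrameMeter<T extends EncodedFrameLike> {
  transform(frame: T, controller: ControllerLike<T>): void;
  stats(): { frames: number; bytes: number };
}

// Hypothetical helper: builds a transformer-shaped object that counts
// frames and payload bytes, then forwards each frame unchanged.
function createFrameMeter<T extends EncodedFrameLike>(): FrameMeter<T> {
  let frames = 0;
  let bytes = 0;
  return {
    transform(frame, controller) {
      frames += 1;
      bytes += frame.data.byteLength;
      controller.enqueue(frame); // pass through untouched
    },
    stats: () => ({ frames, bytes }),
  };
}

// Assumed wiring inside a worker, in browsers with RTCRtpScriptTransform:
//   onrtctransform = (event) => {
//     const { readable, writable } = event.transformer;
//     readable.pipeThrough(new TransformStream(createFrameMeter())).pipeTo(writable);
//   };
```

Keeping the per-frame logic in a plain object like this makes it easy to exercise outside the browser, with the stream plumbing left to the platform.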
By the end of 2024, we will find ourselves similar to where we are at the beginning of it:
- Google will be the main and virtually sole contributor to libWebRTC. The total commit numbers have been dwindling and this will continue. Will we see it stabilize in 2024?
- Here and there, external contributions will happen. Most of them are likely to come from Philipp Hancke. But here as well, we’ve probably seen the peak of individual contributions already…
#2 – Machine learning and media processing
WebAssembly is where we see innovation and differentiation in WebRTC. 2024 will be no different.
It will be incorporated in the “same old places” of media processing.
What we will also see is a lot more machine learning on the server side, and a lot of it will be leaning towards generative AI and LLM technologies. This isn’t really a prediction, just stating the obvious. Coming from someone who uses Midjourney for the imagery in many of his recent articles, that shouldn’t come as a surprise to you.
#3 – The year of Lyra and AV1
Time to take a huge risk.
I mentioned this in the libWebRTC prediction, but it deserves a section of its own as well.
Each year I say AV1 is years away. I think it is still going to take time until it becomes commonplace. That said, I believe this year we will see AV1 in one or more commercial WebRTC services, including Google Meet. It will be used judiciously and in very specific use cases and scenarios – call this testing the water.
On the audio side, we will see an AI audio codec being used in production in web browsers. Likely from Google. I believe Lyra will find its way into Google Meet. How exactly is where I am uncertain.
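On the video side, using AV1 "judiciously" in a WebRTC app usually comes down to reordering codec preferences only for the sessions where you want it. The sketch below is one possible way to do that: `preferCodec` and `CodecLike` are hypothetical names, and the commented usage assumes a browser that lists `video/AV1` in its capabilities and supports `setCodecPreferences` on the transceiver.

```typescript
// Minimal stand-in for an RTP codec capability entry (only the field we need).
interface CodecLike {
  mimeType: string;
}

// Hypothetical helper: move every entry matching `mimeType` to the front,
// keeping the relative order of everything else (including rtx/red/ulpfec).
function preferCodec<T extends CodecLike>(codecs: readonly T[], mimeType: string): T[] {
  const wanted = mimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...preferred, ...rest];
}

// Assumed browser usage (not executed here):
//   const caps = RTCRtpReceiver.getCapabilities("video");
//   if (caps) {
//     transceiver.setCodecPreferences(preferCodec(caps.codecs, "video/AV1"));
//   }
```

If AV1 is not in the list, the order is left as the browser reported it, so the call degrades to the default negotiation.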
#4 – WebTransport as a real alternative
WebTransport started life somewhere in 2020. We’re now at the beginning of 2024.
It still isn’t available in all browsers – Safari is still missing support for it. It is available elsewhere, but far from being commonly used or in the mainstream’s mindset.
This year we’ve seen a few more experiments and proofs of concept with WebTransport that incorporate low latency media delivery. Mostly in the domain of streaming. There are reasons for that. I’ve written about that when discussing WHIP and WHEP.
Here’s what I think is going to happen: in 2024, we will see the first production ready low latency streaming solution that makes use of WebTransport instead of WebRTC or other technologies. This will be for one-way large scale broadcast use cases, where 1-2 seconds of latency are fine.
There will be those that will use WebTransport for bidirectional media delivery, similar to what Zoom is doing in web browsers, though that will stay the exception to the rule and more of an experimentation.
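Since WebTransport still isn’t available in every browser, any service adopting it needs a fallback path. A minimal feature-detection sketch might look like the following. `pickTransport` and the `Transport` type are made up for illustration, and preferring WebTransport over WebRTC whenever both exist is an assumption that only fits the one-way streaming scenario described above.

```typescript
type Transport = "webtransport" | "webrtc" | "none";

// Hypothetical helper: inspect a global-like object for the constructors
// a page would need, preferring WebTransport when it is present.
function pickTransport(scope: Record<string, unknown>): Transport {
  if (typeof scope["WebTransport"] === "function") {
    return "webtransport";
  }
  if (typeof scope["RTCPeerConnection"] === "function") {
    return "webrtc";
  }
  return "none";
}

// Assumed browser usage (not executed here):
//   const transport = pickTransport(globalThis as unknown as Record<string, unknown>);
```

Taking the scope as a parameter instead of reading `globalThis` directly keeps the check easy to test with fake objects.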
#5 – M&As and shutdowns
This was easy in 2023 and will remain easy in 2024.
The recession is here. It is likely to stay throughout 2024, with no real end in sight. At least not yet.
More vendors relying on WebRTC will shut down. Small startups will run out of steam. Large vendors may decide to exit this market and focus on other avenues where they conduct business.
Shutting down may mean getting acqui-hired, or acquired for peanuts. It might also mean selling chunks of the business to another company.
Vendors who stick to this market are likely to slow down their efforts throughout the year in an attempt to survive and weather this ongoing storm.
2024, here we come
Lots to do in 2024, but with limited resources:
- Slowdown at the same time we see technology shifts and the need to differentiate
- Generative AI, and AI in general and trying to figure out where it fits in WebRTC use cases
- Polishing collaboration and sharing capabilities in WebRTC and getting that implemented in apps
- Introducing next generation audio and video codecs
- Researching new transport technologies
All that while trying to satiate users and customers with new features and releases.