Choosing the best WebRTC signaling protocol for your application

By Tsahi Levent-Levi

June 23, 2025

Deciding on WebRTC signaling? Explore standardized and proprietary protocols to find the best fit for your needs.

WebRTC comes without a signaling protocol, which means you need to choose one for your application. You can go with a standardized protocol, maybe SIP or XMPP or something else. Or you could go for something proprietary, tailored to your specific needs.

Which signaling protocol is best for your WebRTC application? It depends. And it is what we’re going to try and find out today.

Table of contents

TL;DR – when to use what?
The video version
WebRTC Signaling 101
- What is signaling and why do we need it for WebRTC?
- There’s signaling and there’s transport…
Standard signaling: SIP over WebSocket
Standard signaling: XMPP
Standard signaling: MQTT
Standard signaling: Matrix
Standard signaling: WHIP and WHEP
Proprietary signaling protocol
Still confused?

TL;DR – when to use what?

Let’s start with a quick answer to satisfy curiosity. Here’s my own set of rules on how to make such a decision:

If your application already uses a “chat” protocol to send messages between users for some communications, then just extend that solution to include WebRTC signaling
- This can be a VoIP product that uses SIP (then you’ll need SIP over WebSockets to get to browsers and WebRTC with it)
- It can be XMPP if you’re more into messaging or MQTT if that’s more of an IoT type application
- Or it can be some other signaling protocol that I am just not aware of. It happens
The application has some kind of a messaging bus that is used for communication with users or between users? Use it
- This can be a simple WebSocket or REST or HTTP protocol (a proprietary one) that has been used before. I always give as an example a dating app that already has a way for people to schedule their blind date
- It can also be a managed cloud messaging service such as Ably, PubNub, Pusher or others
- Here, you’ll need to introduce new types of messages and have your WebRTC SDP and control logic piggyback on that same signaling solution
Using media servers, most probably an SFU? These come with their own client SDKs and reference apps
- Sometimes, it is easy and better to just adopt these and be done with it
- You will need to extend them as your application evolves, but they do give a simple starting point
Do you send only or receive only? Try using WHIP or WHEP
None of the above? Just create a proprietary signaling protocol to exactly fit your needs

The video version

I’ve also shared the gist of this as a video, if that’s your preference.

WebRTC Signaling 101

WebRTC is a modern and powerful media engine. The thing is, you need to direct it in the right way to get it started.

I have a couple of questions for you:

How exactly do users register to a service?
How do they indicate that they are available?
How can one user search for the status of another?
How can one user reach out and dial another? Or alternatively, how does one join a virtual meeting room? Or an online live stream?

These questions aren’t answered by WebRTC. They are answered by a signaling protocol.

What is signaling and why do we need it for WebRTC?

A signaling protocol is there to answer the questions above.

It does so in a standardized way (hopefully, written down and well documented so it is easy to follow and can be implemented by others as well).

You’d think it makes sense to have a signaling protocol in WebRTC, and you’d be correct!

But there isn’t…

Here’s what I wrote over 10 years ago about the death of signaling:

The decision not to add signaling to WebRTC might have been an innocent one – I can envision engineers sitting around a table in a Google facility some two years ago, having an interesting conversation:

“Guys, let’s add SIP to what we’re doing with WebRTC”

“But we don’t have anything we developed. We will need to use some of that open source stuff”

“And besides – why not pack XMPP with it? Our own GTalk uses XMPP”

“Go for it. Let’s do XMPP. We’ve got that libjingle lying around here somewhere”

“Never did like it, and there are other XMPP libraries floating around – you remember the one we used for that project back in the day? It is way better than libjingle”

“Hmm… thinking about it, it doesn’t seem like we’re ready for signaling. And besides, what we’re trying to do is open source a media engine for the web – we already have JavaScript XMPP – no need to package it now – it will just slow us down”

WebRTC was “rushed”. Google had an implementation ready to be baked into the browser. Figuring out signaling and making a decision by committee at the standardization organizations would have pushed back the actual adoption and use by at least 5 years (and I am optimistic here).

So the result was a decision to use something that already existed, SDP, as the API interface layer (because they already had it in the implementation, mind you), and to let developers figure out how to send these messages over the network.

Is SDP good? Yes. It works.

Is it perfect? Hell no. It is horrible.

But it is what we have and it is what we use.

👉 While we’re talking about SDP, there are plans to get rid of SDP munging as an interface in WebRTC. The question isn’t if this will happen but when. Make sure you are ready for it.

Our WebRTC Insights clients already received an action plan to rid themselves of SDP munging in a controlled way. If you want to be ahead of the curve in everything WebRTC, then you may want to check out our service.

There’s signaling and there’s transport…

You can’t just send your signaling message over TCP or UDP. I mean you can – but not if you want this to occur in a web browser. There is no programmable interface that enables that.

What you do is either use HTTPS or a secure WebSocket. Because that’s what’s available in web browsers for you to use. With HTTPS, there’s REST, XHR and SSE – all mechanisms that transform HTTPS from a page fetching mechanism to something that can do “messaging”.

On top of these transport mechanisms, we can place our signaling protocol.

Why the distinction? I am not sure, but here are a couple of reasons that come to mind:

The transport protocol is always standardized, while the signaling protocol can be proprietary
You can use different transport protocols for a signaling protocol. For example, SIP can work over UDP, TCP, TLS and WebSocket
Because with networking, we like thinking in layers
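To make the layering a bit more concrete, here is a minimal TypeScript sketch of a signaling layer sitting on top of an interchangeable transport. All of the names here (`Transport`, `SignalingChannel`, `LoopbackTransport`, the message shape) are made up for this illustration; they are not part of WebRTC or of any library. In a browser, the transport would be a thin wrapper around a secure WebSocket or HTTPS requests.

```typescript
// Transport layer: moves opaque strings from A to B and knows nothing else.
interface Transport {
  send(data: string): void;
  onReceive(handler: (data: string) => void): void;
}

// Signaling layer: gives those strings a meaning. The shape is hypothetical.
interface SignalingMessage {
  type: string; // e.g. "offer", "answer", "candidate"
  payload: unknown;
}

class SignalingChannel {
  private readonly transport: Transport;

  constructor(transport: Transport) {
    this.transport = transport;
  }

  send(message: SignalingMessage): void {
    this.transport.send(JSON.stringify(message));
  }

  onMessage(handler: (message: SignalingMessage) => void): void {
    this.transport.onReceive((data) => {
      handler(JSON.parse(data) as SignalingMessage);
    });
  }
}

// In-memory stand-in for a real transport. It echoes whatever is sent back
// to its own listeners, which is enough to exercise the signaling layer.
class LoopbackTransport implements Transport {
  private handlers: Array<(data: string) => void> = [];

  send(data: string): void {
    for (const handler of this.handlers) handler(data);
  }

  onReceive(handler: (data: string) => void): void {
    this.handlers.push(handler);
  }
}
```

Swapping `LoopbackTransport` for a WebSocket-backed class should not require touching `SignalingChannel` at all, which is the whole point of keeping the two layers apart.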

Standard signaling: SIP over WebSocket

One of the most common signaling protocols we have for VoIP is SIP.

Most of the backbone of the telephony companies is based on SIP or a variant of it. For the most part, I regard that world as PSTN – making a phone call to a phone number not using a specific app.

Incidentally, it also uses SDP. Well, not really incidentally: it was on purpose, just in the opposite direction. The media engine originally used as the baseline of Google’s WebRTC implementation had an SDP interface because it was meant to play nice with SIP.

To make sure SIP can work in web browsers, it needed a few minor changes. RFC 7118 is the standard that was created for that purpose: it defines WebSocket as a transport for SIP, which then lets you pair SIP with WebRTC as its media engine.

The end result? You can use SIP over WebSocket as your signaling in a WebRTC application.
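To give a feel for what this looks like on the wire, here is a rough TypeScript sketch that builds a SIP REGISTER request the way a browser client might send it over a secure WebSocket. It leans on a few details from RFC 7118 as I understand them: the WebSocket subprotocol is named `sip`, the Via transport is `WSS`, and since a browser has no reachable host name, a random token under `.invalid` is used in its place. The function name and parameter object are my own assumptions, and authentication is left out entirely.

```typescript
// WebSocket subprotocol name used for SIP (RFC 7118).
const SIP_SUBPROTOCOL = "sip";

// Hypothetical parameter object for this sketch.
interface RegisterParams {
  domain: string; // SIP domain, e.g. "example.com"
  user: string; // user part of the address of record
  hostToken: string; // random token standing in for a host name
  callId: string;
  branch: string; // should start with "z9hG4bK" (RFC 3261 magic cookie)
  fromTag: string;
  cseq: number;
}

function buildRegister(p: RegisterParams): string {
  const addressOfRecord = `sip:${p.user}@${p.domain}`;
  const lines = [
    `REGISTER sip:${p.domain} SIP/2.0`,
    `Via: SIP/2.0/WSS ${p.hostToken}.invalid;branch=${p.branch}`,
    `Max-Forwards: 70`,
    `From: <${addressOfRecord}>;tag=${p.fromTag}`,
    `To: <${addressOfRecord}>`,
    `Call-ID: ${p.callId}`,
    `CSeq: ${p.cseq} REGISTER`,
    `Contact: <sip:${p.user}@${p.hostToken}.invalid;transport=ws>`,
    `Content-Length: 0`,
  ];
  // SIP uses CRLF line endings; an empty line terminates the headers.
  return lines.join("\r\n") + "\r\n\r\n";
}
```

In a browser you would open the connection with `new WebSocket(url, SIP_SUBPROTOCOL)` and pass the string to `send()`. In practice, most teams use an existing SIP over WebSocket JavaScript library instead of hand-building messages like this.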

When to use it?

Your app is SIP based and you just need to enable some of the users to connect to your existing network from web browsers
You know and love SIP. And you feel confident in being able to use it in web browsers using JavaScript (this one is less likely)

When NOT to use it?

Your app doesn’t have any connectivity to SIP or PSTN networks. And you’re not a SIP expert
You have connectivity to SIP or PSTN but that’s marginal and not the main focus of your application (if you’re doing a contact center that has standard phones on one end and web browsers on the other, then SIP is most likely for you)

Standard signaling: XMPP

XMPP is the standard originally used for presence and messaging. It was also what Google used for Google Hangouts back in the day before it was rebranded as Google Meet and before WebRTC was even announced.

It is quite the common protocol, so making use of it with WebRTC makes sense. Especially if you want to add voice and video communications to your app.

When to use it?

Similar to SIP, I’d use it if XMPP is already at the core of my application. There’s no point in using yet another signaling protocol next to it
If you know XMPP well, you might as well use it. Assuming you’re comfortable with that decision

When NOT to use it?

If you don’t use XMPP already and don’t know it, I’d skip it
Your application doesn’t have a lot of messaging beyond just the pure signaling needed to get WebRTC sessions started

Standard signaling: MQTT

Then there’s MQTT. This is a lightweight publish/subscribe messaging protocol designed first and foremost for the Internet of Things, where its typical purpose is to collect telemetry from devices.

Why mention it here? Because Facebook Messenger uses MQTT as its signaling protocol. And Messenger is one of the biggest WebRTC applications out there by usage.

When to use it?

If your application already makes use of MQTT for its messaging
Like XMPP, if you know MQTT, you might as well use it. Assuming you’re comfortable with that decision

When NOT to use it?

In all other cases
I simply don’t know how commonplace this protocol is in our industry, and I’d rather use a well known solution or one I built myself than something that has been around for years, but wasn’t adopted widely by my industry. Not because it isn’t good – but because other solutions seem good enough and more well known

Standard signaling: Matrix

I think it is time I recognize Matrix as a standard signaling solution…

Matrix is rather new and was introduced and built with federated decentralized communications in mind. Big words. I am not going to explain them here.

It comes with an open source implementation of both server and client in multiple programming languages, and a managed service on top for those who need it: Element.

And yes. It can be used for WebRTC as its signaling protocol.

When to use it?

Think of it as all or nothing. If you use Matrix and its client and server side code for the benefits they offer (messaging, decentralization, etc), then choose it

When NOT to use it?

Don’t pick and choose pieces of it to form a signaling protocol

What I am trying and failing to say here is that you should pick Matrix if the open source app it comes with is very close to your own intended application behavior.

Standard signaling: WHIP and WHEP

Then there are WHIP and WHEP. These ARE WebRTC signaling protocols in the sense that they were designed and defined specifically for WebRTC – they aren’t used for anything else.

They are simple and limited in scope and capability.
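How simple? The core of WHIP is a single HTTP exchange: POST your SDP offer with a `Content-Type` of `application/sdp`, get back `201 Created` with the SDP answer in the body and a `Location` header pointing at the session resource (which you later `DELETE` to hang up). Here is a minimal TypeScript sketch of that exchange. The function and type names are my own, and the HTTP client is passed in as a parameter so the code is not tied to a specific runtime; the browser’s `fetch` should fit that shape.

```typescript
// Minimal shapes for an HTTP client (assumptions made for this sketch).
interface HttpResponse {
  status: number;
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}

type HttpFetch = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<HttpResponse>;

interface WhipSession {
  answerSdp: string;
  // Taken from the Location header; may be relative to the endpoint URL.
  sessionUrl: string | null;
}

// Build the request separately so it is easy to inspect.
function buildOfferRequest(offerSdp: string, bearerToken?: string) {
  const headers: Record<string, string> = { "Content-Type": "application/sdp" };
  if (bearerToken) headers["Authorization"] = `Bearer ${bearerToken}`;
  return { method: "POST", headers, body: offerSdp };
}

async function startSession(
  httpFetch: HttpFetch,
  endpoint: string,
  offerSdp: string,
  bearerToken?: string,
): Promise<WhipSession> {
  const response = await httpFetch(endpoint, buildOfferRequest(offerSdp, bearerToken));
  if (response.status !== 201) {
    throw new Error(`Expected 201 Created, got ${response.status}`);
  }
  return {
    answerSdp: await response.text(),
    sessionUrl: response.headers.get("Location"),
  };
}
```

On the WebRTC side, you would create the offer on your `RTCPeerConnection`, set it as the local description, pass its SDP to `startSession()`, and then apply the returned answer with `setRemoteDescription()`. WHEP follows the same pattern for the receiving direction.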

When to use it?

For unidirectional streaming, check if WHIP and WHEP are for you
If you plan on having third party devices stream into your service (think about OBS as an example), then WHIP. If you want to support some future generic players, then WHEP (future because it is still too early for that)

When NOT to use it?

What you’re doing is bidirectional in nature
You don’t care about an ecosystem or third parties and adding WHIP or WHEP only complicates things even if only a bit

Proprietary signaling protocol

You decide what you want here.

Sit down and write what type of messages you need to be able to pass. What information these messages convey. Decide on their structure and method of parsing (JSON anyone? Maybe protobuf? Something else?). Figure out what transport you want to use. Document and implement.

Be sure to make it a wee bit extensible, with support for versioning.
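As a starting point, here is one way such a protocol could look in TypeScript: a JSON envelope with a version field and a `type` that selects the payload. Treat it as a sketch under my own assumptions. The message types, the field names and the rule of rejecting versions newer than the one you know are all invented for this example, and your application will surely need different ones.

```typescript
// Highest protocol version this implementation understands (hypothetical).
const PROTOCOL_VERSION = 1;

// Every message shares one envelope; `type` selects the payload shape.
type AppMessage =
  | { v: number; type: "join" | "leave"; room: string }
  | { v: number; type: "offer" | "answer"; to: string; sdp: string }
  | { v: number; type: "candidate"; to: string; candidate: string };

type ParseResult =
  | { ok: true; message: AppMessage }
  | { ok: false; reason: string };

function encode(message: AppMessage): string {
  return JSON.stringify(message);
}

function parse(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, reason: "not an object" };
  }
  const fields = data as Record<string, unknown>;
  const version = fields["v"];
  // Refuse messages from a newer protocol version than we understand.
  if (typeof version !== "number" || version > PROTOCOL_VERSION) {
    return { ok: false, reason: "unsupported version" };
  }
  const isString = (key: string): boolean => typeof fields[key] === "string";
  let valid = false;
  switch (fields["type"]) {
    case "join":
    case "leave":
      valid = isString("room");
      break;
    case "offer":
    case "answer":
      valid = isString("to") && isString("sdp");
      break;
    case "candidate":
      valid = isString("to") && isString("candidate");
      break;
    default:
      return { ok: false, reason: `unknown type: ${String(fields["type"])}` };
  }
  return valid
    ? { ok: true, message: fields as unknown as AppMessage }
    : { ok: false, reason: "missing or invalid fields" };
}
```

Returning a reason instead of throwing lets the receiving side log and ignore messages it doesn’t understand, which tends to help when old and new clients are in the field at the same time.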

When to use it?

If you already have something that can be viewed as signaling in your service. Then you just extend it this way
When you don’t find any reason to use any of the standardized signaling protocols

When NOT to use it?

Only if you lean into a standardized protocol due to reasons I’ve given in the previous sections

For me? A proprietary signaling protocol is likely going to be the way to go for a lot of the use cases that come my way.

Still confused?

I hear you.

Making a decision isn’t always simple, and choosing a solid WebRTC signaling protocol for your application is one of those times.

Here’s what I can suggest:

If you picked the proprietary route, then our WebRTC: The Missing Codelab course has just switched from being a paid course to a free course. Enroll to learn more about this as part of that course.

If you want assistance in making the decision, just contact me.
