What is the Difference Between a Signaling Protocol and a Transport Protocol?

By Tsahi Levent-Levi

December 4, 2014  

Time for me to sort this one out, as I am the one assisting in spreading this misunderstanding.

I adopted a slide in the past and used it on many occasions to explain the different signaling options available to developers. Here’s one of its variants:

WebRTC signaling options

The problem, which Justin Uberti once pointed out, is that I have made a mistake by combining signaling and transport protocols together. Which I shouldn’t. And I agree with Justin here. Consider this my attempt at rectifying this mistake, and the reason is a question on Quora I bumped into lately. A person was asking about the difference between XMPP and BOSH.

The easy answer to this question is that XMPP is a signaling protocol and BOSH is a transport protocol. Here. We’re done.

We have low level transport protocols already. They call them TCP and UDP. What these protocols do is allow sending arbitrary data from one point in the network to another. Not many assumptions are made about the data being sent, and it is assumed that some application on top will try to make sense of that data. This part is out of scope.
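To make "arbitrary data" concrete, here is a minimal sketch in Python. It assumes a local `socket.socketpair()` is a fair stand-in for two endpoints joined by a TCP-like stream; the payload is made up. The transport hands the bytes over untouched and has no idea what they mean.

```python
import socket

# socketpair() returns two connected stream sockets. They stand in for two
# network endpoints joined by a TCP-like transport (an assumption made so the
# example runs without a real network).
left, right = socket.socketpair()

# The payload is deliberately meaningless: the transport does not inspect it.
payload = bytes([0x00, 0xFF, 0x10, 0x80]) + b"any bytes at all"
left.sendall(payload)

# A stream does not preserve message boundaries, so keep reading until the
# expected number of bytes has arrived.
received = b""
while len(received) < len(payload):
    chunk = right.recv(1024)
    if not chunk:
        break
    received += chunk

left.close()
right.close()
```

Whatever went in comes out the other side, byte for byte. Making sense of it is left to whoever sits on top.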

In our browsers, transport protocols that allow sending arbitrary data from both the browser to the web server and vice versa include XHR, SSE and Websocket. If what you are trying to achieve is sending arbitrary data then you’d pick one of these transport protocols.

Signaling protocols go one step higher. I have this need. I want to be able to express some mechanism – a way to tell the other end something. In our case it can be the need to open a call, my availability, my identification. To that end, I can either invent a protocol to do that or use a predefined protocol – something that people have already agreed upon in the past. This protocol is a signaling protocol.
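As a rough illustration of that extra step, the sketch below gives meaning to bytes that a transport would carry blindly. Everything in it is hypothetical: the `SignalingMessage` shape, the `invite` and `presence` kinds and the JSON encoding are invented for this example and are not taken from SIP, XMPP or H.323.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SignalingMessage:
    kind: str    # hypothetical message kinds: "invite", "presence"
    sender: str  # who is talking (identification)
    body: dict   # kind-specific details


def encode(message: SignalingMessage) -> bytes:
    """Turn a message into bytes that any transport can carry."""
    return json.dumps(asdict(message)).encode("utf-8")


def decode(raw: bytes) -> SignalingMessage:
    """Turn bytes received from a transport back into a message."""
    return SignalingMessage(**json.loads(raw.decode("utf-8")))


def describe(message: SignalingMessage) -> str:
    """The signaling layer is where the bytes acquire meaning."""
    if message.kind == "invite":
        return f"{message.sender} wants to open a call with {message.body['to']}"
    if message.kind == "presence":
        return f"{message.sender} is now {message.body['status']}"
    return f"unknown message kind: {message.kind}"


invite = SignalingMessage(kind="invite", sender="alice", body={"to": "bob"})
wire_bytes = encode(invite)
print(describe(decode(wire_bytes)))
```

The agreement on what `kind`, `sender` and `body` mean is the signaling protocol. How `wire_bytes` travels is a separate decision.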

The predefined ones? H.323, SIP and XMPP. There are more, but these are the main ones used in VoIP and instant messaging.

The SIP signaling protocol uses TCP or UDP for its transport. For WebRTC, there is an adaptation of SIP over Websocket.

XMPP uses BOSH most of the time as its transport when used inside web browsers. BOSH typically works over 2 separate HTTP connections to a server: one request is held open until the server has incoming messages to deliver, and the other is used for outgoing messages in the meantime. XMPP can also use Websocket as transport.

If you don’t want to learn, don’t care about or have no real need for SIP and XMPP, you can forgo them altogether and invent your own proprietary protocol. You can run it on whatever transport you see fit. It can even be a combination of several transport protocols.
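A small sketch of that freedom, under heavy assumptions: the "transports" here are plain Python lists standing in for something like a Websocket connection and an HTTP long-polling connection, and `SignalingChannel` is a made-up name. The point is only that the proprietary signaling code does not change when the transport underneath does.

```python
import json
from typing import Callable, List

# A transport, for this sketch, is anything that accepts bytes to send.
Transport = Callable[[bytes], None]


class SignalingChannel:
    """Hypothetical proprietary signaling layer, independent of transport."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def send(self, kind: str, **fields: str) -> None:
        # The message format is our own invention; the transport never sees
        # anything but opaque bytes.
        message = {"kind": kind, **fields}
        self._transport(json.dumps(message, sort_keys=True).encode("utf-8"))


# Two stand-in transports. In a real service these might wrap a Websocket
# and an HTTP long-polling connection.
websocket_like: List[bytes] = []
long_poll_like: List[bytes] = []

for outbox in (websocket_like, long_poll_like):
    SignalingChannel(outbox.append).send("invite", to="bob")
```

Both stand-ins end up holding identical bytes, which is what lets a service mix transports or swap one for another without touching its signaling logic.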

Why is this important?

In the not so distant past, we were led to believe that there must be a standardized signaling protocol that everyone uses.

This concept has been broken. While it has its value for many use cases, it holds no value for some use cases and in many cases, there is no business value in adopting a standardized signaling protocol.

There are many business reasons why this came to be. The technical reasons that enabled that?

  1. The adoption of modern transport protocols in web browsers (mainly Websocket) and their wide use on the web
  2. The adoption of a media engine with a standardized API (call it WebRTC)

Together these two made it easier than ever to just use whatever protocol necessary, relegating the whole idea of communications from a standalone service to a feature in another service with its own signaling needs.



  1. “In the not so distant past, we were led to believe that there must be a standardized signaling protocol that everyone uses.
    This concept has been broken. While it has its value for many use cases, it holds no value for some use cases and in many cases, there is no business value in adopting a standardized signaling protocol.”

    I strongly disagree when you say standardized signaling protocols (I’m assuming SIP or XMPP) many times hold no value. Even if the purpose of a given WebRTC application is not to communicate with other devices that speak SIP or XMPP, I’d highly recommend using a standardized protocol unless it’s one of the most trivial of applications.

    Here are some reasons why:
    1. There is no need to reinvent the wheel. There are lots of edge cases for starting, stopping, bridging, and ending sessions as well as media negotiation. Make use of the knowledge of the MANY who have encountered these problems before in many diverse systems.
    2. The protocols are well worn and well tested. There have been RFCs in existence for years.
    3. Interoperability and network federation. The world doesn’t need another walled-off communications network. Be a bridge, not a wall.
    4. Most importantly, the existing open source and proprietary ecosystems are huge. There are highly scalable client-side and server-side implementations in every language imaginable, tons of books and documentation, and lots of experienced engineers for hire.

    1. Keith, thanks for taking the time to write this comment. I love a good argument 🙂

      I should be the first to defend signaling protocols. Been developing and marketing them for most of my adult life. I do believe that they are becoming less important to developers, especially when I have my own protocol that is used across the service already. Think of a dating site for example. It has its own way of “messaging” between people. A way to handle discovery, schedule blind dates, status updates, etc. It would be far easier to just hook up that last bit of making a real time online session (I won’t call it a call) into the existing protocol than it would be to try and (brutally) force SIP into that scenario.

      I see WebRTC as a way for services out there to add communications without the reliance on the old guard of telecom players – be it large carriers or VoIP vendors. Once they do, often times going for the “tried and true” of telecom that is SIP (or XMPP) makes little sense.

      1. There is nobody who dislikes the backward telecom industry more than myself. However, SIP does not have to be coupled with telecom or even VoIP in any way. SIP was a protocol that was designed to help legacy telecom out of the dark ages but many don’t realize it can work for any type of session setup and management. Any custom WebRTC signalling protocol that is fully featured, runs in a large scale system, and is around long enough will begin to look a lot like SIP or XMPP (if they’re smart they’ll scrap it first).

        SIP can work completely independently from any service provider. Two SIP clients, if configured correctly, can communicate P2P. However, for a slightly more scalable system but without using a VoIP vendor I’d use http://oversip.net (websocket backend) and http://sipjs.com (javascript client). Using those you’d instantly gain all of the features, reliability, interoperability, and scalability that it would take an individual a lifetime to achieve.

  2. It shouldn’t matter what signalling protocol one uses as long as its complications are abstracted away. There should be no reason a developer should know, care or even view SIP. When you have something so ugly it’s best kept covered up 🙂

    I agree there should be a modern (without the pre 2010 accumulated baggage) and easy way to signal, but what would be the cost of reinventing something?

    Perhaps a compromise would be to simplify what we have and unload some of that excess baggage accumulated along the way.
