Who the F&#k Cares About Signaling?

August 19, 2013

Signaling is now a second class citizen.

Sorry for going down this same path yet again, but I think it is an important notion.

Signaling isn’t important any longer.


By that I don’t mean that we don’t need it – I just mean that the struggle for a silver bullet, a ubiquitous solution that can be used by all and for everything – that is what isn’t important.

You need to understand – I am coming with a very rich signaling background – from a business unit whose bread and butter was signaling – so for me to say that signaling isn’t important is to throw over a decade of my life down the virtual drain.

Chris takes it farther still – when we had a chat the other day, he said that we need to tear down what we have developed and start fresh when it comes to WebRTC. That can be rather hard for many vendors, and for some it might not even make sense – but I do urge you to rethink your business model in light of WebRTC – that might need a serious polish.

Why is signaling not important? Because it isn’t interesting. It takes so little to build stuff up today with WebRTC, that just thinking of adding a full-fledged signaling protocol can increase cost and effort in ways not proportional to its contribution.

It is nice to think about it though.

I started “life” in VoIP in the good old days of H.323. It was all binary then – strict messages encoded to save on space.

SIP won over H.323, doing so because it was written in “text” that allowed people to “debug” what they “saw” going on the line. Honestly? SIP and H.323 are quite the same. The differences are like the ones between Gala and McIntosh (hint: they are both apples).

Then came all the cool XMPP guys trying to displace SIP. It was fun. Whenever I wanted loads of comments on a post I only had to talk about SIP versus XMPP. And again – the differences were rather marginal.

XMPP’s “innovation” was mainly its ability to work over HTTP. So we moved from text to web.

And now with WebRTC? It is any of the above, with this new option of “proprietary”, which usually boils down to JavaScript code implementing *something* over either WebSockets or just HTTP requests.
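To make the “proprietary” option concrete, here is a minimal sketch of what such a home-grown protocol might look like: a JSON envelope plus a pair of encode/decode helpers. Everything in it is an assumption made for illustration, including the message types, the field names (`type`, `from`, `to`, `payload`) and the function names `encodeSignal` and `decodeSignal`. It is not taken from any particular product or standard.

```typescript
// Hypothetical envelope for a home-grown signaling "protocol".
// The message types and field names are illustrative assumptions.
type SignalType = "offer" | "answer" | "candidate" | "bye";

interface SignalMessage {
  type: SignalType;
  from: string;      // sender identifier (assumed to be an opaque string)
  to: string;        // recipient identifier
  payload: unknown;  // e.g. an SDP blob or an ICE candidate, passed through untouched
}

const SIGNAL_TYPES: string[] = ["offer", "answer", "candidate", "bye"];

// Serialize a message to the text that goes on the wire.
function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

// Parse wire text and reject anything that does not match the envelope.
function decodeSignal(raw: string): SignalMessage {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("signal must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  if (typeof obj.type !== "string" || SIGNAL_TYPES.indexOf(obj.type) === -1) {
    throw new Error("unknown signal type");
  }
  if (typeof obj.from !== "string" || typeof obj.to !== "string") {
    throw new Error("signal needs string 'from' and 'to' fields");
  }
  return {
    type: obj.type as SignalType,
    from: obj.from,
    to: obj.to,
    payload: obj.payload,
  };
}
```

The encoded string is transport-agnostic: it could be handed to `WebSocket.send()` or used as the body of an HTTP POST, which is roughly why the choice of transport matters so little here.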

We’ve been dumbing down (or simplifying?) the protocols we use for signaling, and we reached a point that it just doesn’t matter what signaling protocol you use in your WebRTC implementation.

What to choose then? Whatever works for you, based on experience, knowledge, ecosystem, cost or any other factor you need to deal with.


  1. Hi Tsahi,

    I’m in total agreement. When developing XirSys, we decided to make the signalling aspect a second class citizen, too. Quite frankly, if you have your own signalling, then why would you want to use ours? However, at the same time, it is important for us to ensure that we are providing a “one stop shop” for WebRTC, and this includes signalling. For instance, if a developer wishes to augment an application already utilising her own signalling preference, but does not want to maintain a signalling proxy of her own, then that is an area that, in our area of business, we are failing. So what, then, are our options? Is it preferable to support all common formats or should we enforce our own preference?

    Currently, we have opted for a JSON based WebSocket protocol, which is flexible enough to handle some new up-coming technologies we are developing. This, I feel, works well enough. It’s supported by the same browsers that support WebRTC, it’s fast and flexible and it’s a popular choice, but we’re not going to be pleasing everyone. We may have to provide some serious thought into this before long.

    1. Lee,

      JSON+WebSocket is a very sensible choice. I’d beef it up with some SDK/demo/sample/package that accesses this “protocol” from PHP, Python, Java, .Net, etc. – this will give the widest possible flexibility for your audience – choose the languages to support based on the initial customer requests that you get and grow it from there.

    2. I think there is a question here of the level of access/control depending on your position in the toolset to platform to application spectrum. If you are providing high level identity, session or meet-me abstractions then no-one needs to know how it’s done internally (cf Twitter). If developers are at a lower level and expect all sorts of control then you start getting pushed towards existing signaling standards if you are not careful, although as Tsahi says there will be a premium on how simple yet powerful you can make what you provide.

  2. “””
    Then came all the cool XMPP guys trying to displace SIP. It was fun. Whenever I wanted loads of comments on a post I only had to talk about SIP versus XMPP. And again – the differences were rather marginal.

    XMPP’s “innovation” was mainly its ability to work over HTTP. So we moved from text to web.
    “””

    I’m not sure I understand this:

    1) XMPP is an IM protocol; it was never (and still isn’t) trying to displace SIP. SIMPLE might be trying to displace XMPP, or Jingle might be trying to displace SIP, but even those aren’t really true.

    2) XMPP doesn’t run over HTTP by default; it uses plain TCP. It can layer on top of POST requests, but that’s not the normal mode of operation, and if you’re doing voice on the web, then you need to use a bridge from Jingle to WebRTC anyway (which you can do).

    3) We are, however, terribly cool. Positively hip, we are – I’ll give you that one.

    Signalling, like most things we do in protocol design, is about interoperability – it doesn’t really matter what we use, as long as the people you’re trying to communicate with do the same thing. Like all interoperable standards, it shouldn’t be interesting. Really, it should be dull as dishwater, taken for granted by almost everyone almost all the time, like the air you breathe and the socks you wear. I think Emil Ivov said that signalling choices were about as important as residential voltages – he’s right, of course, what matters is not that the voltage in use has to be the same everywhere, just that whatever kit you plug in can handle it.

    1. One minor difference between voltage and signaling – today, I can question the need for a ubiquitous protocol (and give Twitter, Facebook, Skype, Viber, Tango as examples) – but I can’t do the same about a ubiquitous single voltage…

  3. Hi Tsahi

    normally I agree with your opinions, but I have to disagree with this one.

    You’re right, signaling protocols don’t matter; they have to be there and do their work. But how do you define what “their work” is? That’s not the easiest thing considering all necessary use cases, scenarios etc. In my opinion “Just do your own JSON + Websockets thingy” can’t be the right answer. In reality things aren’t that easy (IMO). It’s ok to build your techdemo or showcase or whatever, but for professional solutions with real customers involved you will soon come to the point where you have to extend your protocol to meet all your customers’ needs.

    And in the end after you have gone a long and rocky road (The same all protocols before yours have gone) your protocol will have nearly the same features as all (standard) signaling protocols. As you write yourself “SIP and H.323 are quite the same.” and “(..) I only had to talk about SIP versus XMPP. And again – the differences were rather marginal.”. Yes because the use cases are the same.

    For me it’s always the same. Don’t do things yourself if you can pick and extend an existing approach. If not, you will just do it differently than the ones before you.

    As Newton said: “Standing on the shoulders of giants”

    Looking forward to your comments.


    1. David,

      That’s the issue – the use cases ARE different. H.323 and SIP are trying to be all-encompassing and for most that is too much complexity for the need at hand – just check the interviews I do with people and what they end up using. Mostly it is proprietary. It is usually SIP if it was there before. XMPP from time to time. All the new guys? Proprietary. They just don’t need all the features that SIP has to offer.

      1. The feature they don’t need is “interoperable”.

        If you do need that one, then you need a SIP, Jingle, H323, or something else that other people are able and willing to speak.

          1. I don’t understand why you are willing to sacrifice interoperability from the beginning. We are talking about communication, I think sometimes people forget about that.

            The value one gets from a communication network is directly correlated to the number of network nodes (network effect). So is it useful to waive interoperability? From an economic point of view I would say no.

          2. I am not sacrificing it in any way.

            WebRTC allows more endpoints to be connected (simply because of the deployment of the browser). Once this is in place, the ability to reach out to me on whatever service I prefer is open to anyone – you get the openness and ubiquity without the interoperability aspect.

          3. So I can reach you on the service of your liking. That’s great and hopefully you have a consistent user experience.

            However, as calling party I would like to note the fact that I wanted to reach you (and some other things like call duration) in a system of my own liking. And sorry for being selfish, but I would like to have a consistent user experience for me, no matter who I am calling and what service they prefer.

          4. Selfish is a great trait. Only problem is that you tried to reach me – if you need me so much then do it on my terms please. Same as you would come to my office.

          5. I think I am slowly getting the argument here. I have a personal URL (https://go.estos.de/[email protected]) where anyone can reach me with a browser. So your argument is: by using that mechanism anyone is reachable in his preferred network by simply using a browser and a URL? So there is no need for interoperability anymore. That’s undoubtedly true.

            But what about the user experience? You force your communication partners to use the network of YOUR choice. If your network were interoperable with your partner’s, both sides could stick to their preferred user experience. To me it sounds kinda strange to have to go to a website first to see if someone is available and to start a conversation (text, audio, video etc.). These problems are solved with federation concepts. Why not use them?

          6. You came here to my blog. Found your way and probably you are ok with the experience I am providing here, which is similar to, but different from, millions of other blogs…

          7. Hmm, I normally use feedly to read your blog. I like to have all the content in one place and reducing transaction costs is always welcome. So it is really kind of you to offer this interoperable RSS feed.

      2. Tsahi

        I agree today’s signaling protocols are complex and need specialists to implement them correctly. And yes, to build a WebRTC demo you don’t need to use full-fledged SIP or JINGLE etc. But as services grow and become professional in some way, the number of features and use cases will grow too. So you will have to extend your proprietary protocol to a certain extent and so on. And in the end your protocol is as complex as the existing ones. Maybe not in the short run, but in the long run.

        In my opinion for most companies it is better to use standards based protocols and concentrate on building great services with supreme user experience than struggling with their own proprietary protocols, which no customer is willing to pay for.


        1. I generally agree with Tsahi on most of this post, notwithstanding the fact that the most interesting WebRTC apps will probably come from people who have no idea of what signalling is at all. They’ll use whatever intermediate platform/API achieves their goals, and they won’t care if it’s ultimately based on SIP, JSON+WS, or a hamster on a wheel.

          “In my opinion for most companies it is better to use standards based protocols and concentrate on building great services with supreme user experience than struggling with their own proprietary protocols, which no customer is willing to pay for.”

          Depends on the specific use-case. In certain instances, there will definitely be value in proprietary protocols. Although if you change the sentence to read “de-jure OR de-facto standards” it’s closer. I imagine we’ll see a few “non-standard standards” emerge here, that are good for certain tasks or certain ecosystems.

          Dean Bubley

        2. David, but how many radical new communication and collaboration use cases really will end up looking like SIP/Jingle/H.323!/whatever at scale? Twitter, Facebook, Instagram, LinkedIn, Pinterest, Chatter, Jive, Box are all doing global “signaling” of some sort at some level of “real time” and their internal global connectivity architectures look nothing like SIP. They also have all sorts of other streaming, searching, liking, following and other semantics that traditional signaling has never dreamt of, and run at a scale that would completely boggle an average enterprise framework. The future will be different from the past.

          1. Lawrence, I do not want to say: “Stick to the traditional approaches till doom”.

            There are two arguments I want to point out:
            1. “Dwarfs standing on the shoulders of giants”. So use the things already there and extend, change, modify them to your needs. If not, everyone will reinvent the wheel in slightly different ways.
            2. Do what your customers pay for. If your mission is creating the best web communication platform in the world, then try to build USPs, which create value for the customer. Signaling is not the thing they pay for, it is just a hygiene factor.

            If new open standards evolve and make the older ones obsolete I am really fine. But advising all developers to build signaling on their own is not meaningful in my opinion; it will distract their attention from the real problems.

          2. David, Facebook is the most extreme example but at a billion users, hundreds of petabytes of data and millions of global interactions per second I have to wonder who the giant and who the dwarf is here! The new, young, energized SoCoMo developer ecosystem has more tools available to them than most legacy telecommunications vendors yet understand. I see Tsahi as saying that given all these tools, whether you also need SIP or other traditional signalling is a very open question with new capabilities and communication semantics potentially driving very different decisions!

          3. David – not to toot my horn too much, but more of my thoughts on this dwarfs and giants topic are now published over at WebRTC World (click on my name).

            Philipp – good point, but standards sausage making can often be truly stomach turning in parts. Let’s just hope that the SDP sausage doesn’t protocol-poison all the new enthusiastic SoCoMo kids…

  4. > “who needs interoperability anyway”

    This is probably the saddest statement I’ve read from a technically savvy person for at least a year.

    Aside from the moral aspects however, interoperability is mostly about one thing: division of labour. If I want to take your JavaScript app and use it with my video bridge, presence server, call center software or multi user chat component, then we need to speak a common language.

    For these and other reasons, interoperability isn’t really going anywhere and neither is standard signaling.

    I think the point you are trying to make is that protocols like SIP, XMPP and H.323 will not be well suited for many of the use cases that would emerge with WebRTC. This is probably true and simpler protocols/APIs will often fit better there. However this does not imply that there is no value in standardized signalling.

    The new simple protocols will inevitably converge around similar shapes that will either be RFCs or de facto/de jure standards as Dean mentioned. Either way they are just the same: common, cross app signalling.

    The same goes for SIP and XMPP. There will be applications whose use cases and requirements would fit well with these protocols and people will use them. Complexity isn’t that much of a factor either: because these are standard protocols, there will be libraries for them and their APIs will hide it. Again, it all comes down to division of labour.

    Also, I don’t think that Viber, Facebook, Skype and FaceTime are arguments for how standard signalling is going away. Most of those and many, many other proprietary protocols have been around for quite a while, happily coexisting with SIP and XMPP. A business choice from a handful of big companies who can actually afford to pass on the advantages that come with division of labour, does not mean that everyone else would. Their success actually means exactly the opposite: other players, especially small ones, need to better federate their technologies in order to compete.

    Finally, one should probably take into account the fact that WebRTC is currently being heavily influenced by the traditional SIP community. The recent adoption of the “Unified Plan” actually goes back on WebRTC’s initial choice to be agnostic to signalling and bakes a lot of it (all SIP friendly) into the browsers.

    This, I would agree, is an even sadder thing than the statement that I started with ;).

    1. We had a discussion system with standardized protocols: the Usenet. Yes, it was nice to have a personally customized Usenet client with ‘kill files’ and all that stuff. You could use the same client for all discussion groups, instead of having to re-adjust between bloggeek.me and other web sites. Web sites are not interoperable, they don’t have interconnection and roaming agreements, and the posting of comments has never been standardized. The web won.

      Interoperability / standardized protocols just move from the network to the Javascript API.

      1. Well, this is an interesting analogy. The thing is that in this specific post I wasn’t talking about federation and site interoperability at all … I am just saying … you know, in case we are going for a constructive debate and not just trying to come up with cool things to say 🙂

  5. Completely agree, Tsahi. The web/cloud world is providing all sorts of alternatives for coordinating and sharing information between potential communicating parties so that SIP or other heavier signaling approaches are the last thing on many developers’ minds. For legacy interoperability, yes, someone has to “signal” through to current standards but that’s probably via legacy vendors, SBCs, gateways, or cloud interop (such as Twilio, Blue Jeans or who knows who). For whole new web/cloud use cases and applications “choose whatever works for you” is a fine mantra!
