Which WebRTC JS library should I use?

11/02/2019

I don’t really know, but there’s a lot in this innocent “WebRTC JS library” question that isn’t clear without digging a lot further.

Every now and again (every week or two) I get a question asking me to help with the selection of this or that open source component, pick a CPaaS vendor for a project, find someone to outsource WebRTC work to or hire a stellar WebRTC developer.

Many of these emails are about shortcuts. Give us that silver bullet. Shortcuts seldom work with WebRTC.

Last week, I had a question come in. A startup is looking for a “WebRTC JS library” to use. Something that does 1:1 voice chat rooms, stores user profiles, etc. It also needed to be inexpensive – Twilio is too expensive for them. And a free alternative was their main preference.

The problem I had with it is that the simple question of which WebRTC JS library to use didn’t align that well with the set of requirements that came along with it.

This article is about what components are needed for WebRTC deployments. If you’re looking to dig deeper into the media paths in WebRTC, then join my free webinar: Mesh, MCU or SFU

Let’s break down WebRTC to its main components as seen from a network architecture perspective:

  1. Signaling
  2. NAT traversal
  3. Media
  4. Other

Here’s a slide I’ve been using to explain where a device gets connected to in a typical WebRTC session –

Signaling

Signaling is how the devices reach out to one another. They can’t do it directly, since they don’t have each other’s IP address, and even if they did, they would still need some kind of a “protocol” to talk to each other.

Signaling in WebRTC is… non-existent. You need to bring your own signaling. This approach confuses some developers, and it is probably why there is no single good solution out there that fits everyone.

Today, you can use SIP, XMPP, MQTT or just proprietary protocols as your signaling for WebRTC traffic. Each such protocol will have its own set of frameworks, services and SDKs that you can use. Some will be free (open source) while others will be licensable software or SaaS based.
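
To make “bring your own signaling” a bit more concrete, here is a minimal sketch of what a proprietary signaling layer boils down to: relaying offer, answer and ICE candidate messages between the peers in a room. Everything in it (the SignalMessage shape, the Room class, the peer names) is made up for illustration and is not part of WebRTC or of any specific library. A real deployment would carry these messages over a WebSocket, SIP, XMPP, MQTT or similar rather than an in-memory map.

```typescript
// Hypothetical message shapes for a home-grown signaling protocol.
// WebRTC itself does not define any of these; they are assumptions for this sketch.
type SignalMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

type Handler = (msg: SignalMessage) => void;

// An in-memory stand-in for a signaling server handling a single room.
class Room {
  private peers = new Map<string, Handler>();

  // A peer joins by registering a callback that receives messages addressed to it.
  join(peerId: string, onMessage: Handler): void {
    this.peers.set(peerId, onMessage);
  }

  leave(peerId: string): void {
    this.peers.delete(peerId);
  }

  // Relay a message to its addressee. Returns false if that peer is not in the room.
  send(msg: SignalMessage): boolean {
    const handler = this.peers.get(msg.to);
    if (!handler) return false;
    handler(msg);
    return true;
  }
}

// Usage: alice sends an offer, bob replies with an answer.
const room = new Room();
const aliceInbox: SignalMessage[] = [];
const bobInbox: SignalMessage[] = [];

room.join("alice", (m) => aliceInbox.push(m));
room.join("bob", (m) => {
  bobInbox.push(m);
  if (m.kind === "offer") {
    room.send({ kind: "answer", from: "bob", to: m.from, sdp: "<answer sdp>" });
  }
});

const delivered = room.send({ kind: "offer", from: "alice", to: "bob", sdp: "<offer sdp>" });
```

The SDP strings are placeholders. In a browser they would come from the peer connection’s createOffer() and createAnswer() calls, and the signaling layer would simply pass them along without looking inside.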

NAT traversal

NAT traversal is about being able to actually get media flowing.

WebRTC is P2P (peer to peer), meaning you can, in some cases, send media directly between devices. This is something that is otherwise impossible with web browsers. WebRTC also prefers UDP, since it offers better real-time, low-latency characteristics. It is also just about the only web browser traffic that makes use of UDP, which means it is sometimes blocked as well.

NAT traversal is how WebRTC gets past these pesky issues, and it requires additional servers to help it do so. Some of these servers (TURN) may end up relaying all traffic through them…

At the end of the day, you will need to deploy these servers or pay for someone to do it for you (no free meals here).
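
In client code, these additional servers show up as a list of STUN and TURN servers handed to the peer connection. Here is a rough sketch of that configuration. The hostnames and credentials are placeholders I made up, and the IceServer interface is declared locally so the snippet runs outside a browser; it mirrors the shape of the browser’s RTCIceServer dictionary. The candidateType helper is also only illustrative: it reads the “typ” field of an ICE candidate line to tell whether a path is direct (host), discovered via STUN (srflx) or relayed through TURN (relay).

```typescript
// Local stand-in mirroring the browser's RTCIceServer dictionary (urls, username, credential).
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Placeholder servers: the hostnames and credentials are invented for this sketch.
const iceServers: IceServer[] = [
  { urls: "stun:stun.example.com:3478" },
  {
    urls: [
      "turn:turn.example.com:3478?transport=udp",
      "turn:turn.example.com:3478?transport=tcp",
      "turns:turn.example.com:443?transport=tcp", // TURN over TLS, for networks that block UDP
    ],
    username: "demo-user",
    credential: "demo-secret",
  },
];

// In a browser this list would be passed along as: new RTCPeerConnection({ iceServers })

type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Illustrative helper: extract the "typ" field from an ICE candidate line.
function candidateType(candidateLine: string): CandidateType | undefined {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : undefined;
}

// Example candidate line using documentation-only IP addresses.
const relayed = candidateType(
  "candidate:3 1 udp 41885439 203.0.113.7 50000 typ relay raddr 198.51.100.2 rport 40000"
);
```

If the candidates that end up being used in your sessions are mostly of type relay, your TURN servers are carrying the media, and that is where the bandwidth bill comes from.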

Media

Recording. Group calling. The need to control media paths. Broadcasting. All these end up requiring media servers in the backend. Ones that can process media in one way or another.

The most common approach today is to use SFUs and solve most of the world/media problems with them. These also offer some signaling protocol of their own. My preference is usually to short-circuit it and route all this traffic through a different signaling/messaging path, especially for the more complex applications.

Again, they come in different shapes, sizes and types – open source ones and commercial ones. You usually won’t be able to pay for them separately as a hosted service and will need to go to a CPaaS vendor to get the whole set of solutions – if you’re looking for the hosted/managed path.

Other

Payments, user authentication and identity, the website itself and a large number of other things you might need.

These are really out of scope of WebRTC, but sometimes are provided by the various vendors and frameworks out there.

Back to that question

What were we dealing with to begin with here?

looking for a “WebRTC JS library” to use. Something that does 1:1 voice chat rooms, stores user profiles, etc. It also needed to be inexpensive – Twilio is too expensive for them. And a free alternative was their main preference.

Here’s how I’d break this one down to try and understand what was asked:

  • That “WebRTC JS library” gives a hint of someone searching for a signaling framework. Which is great
  • 1:1 voice chats strengthen that feeling we’re dealing with signaling only
  • The word rooms… that feels more like an SFU media server. In this case, I’ll assume there’s no need for a media server though – due to the price point asked for (free), the fact that there’s no ask for recording and that this is a 1:1 scenario
  • Stores user profiles. Hmm. This usually has nothing to do with WebRTC. So much so that most CPaaS vendors don’t offer such a capability either
  • Twilio is about the full shebang – getting a hosted, SaaS, CPaaS, managed (pick the term you like best) solution that gives you signaling, NAT traversal, media and some other knick knacks. Doesn’t quite fit in with the rest of the ask here
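
The breakdown above can be pictured as a small checklist that maps a set of asks to the four component groups from earlier. This is a toy sketch only: the field names and the rules are made up for illustration, and real projects have far more nuance than a handful of booleans.

```typescript
// Hypothetical description of what a project asks for. Field names are invented.
interface Requirements {
  maxParticipantsPerCall: number;
  recording: boolean;
  broadcasting: boolean;
  userProfiles: boolean;
}

interface Components {
  signaling: boolean;
  natTraversal: boolean;
  mediaServer: boolean;
  other: string[];
}

// Map the asks to the component groups: signaling, NAT traversal, media, other.
function breakDown(req: Requirements): Components {
  return {
    // Every WebRTC deployment needs its own signaling and STUN/TURN servers.
    signaling: true,
    natTraversal: true,
    // Recording, group calling and broadcasting are what pull in media servers.
    mediaServer: req.recording || req.broadcasting || req.maxParticipantsPerCall > 2,
    // Things like user profiles sit outside WebRTC and need their own backend.
    other: req.userProfiles ? ["user profiles"] : [],
  };
}

// The startup's ask: 1:1 voice, no recording mentioned, stored user profiles.
const startup = breakDown({
  maxParticipantsPerCall: 2,
  recording: false,
  broadcasting: false,
  userProfiles: true,
});
```

For this ask, the sketch lands on signaling and NAT traversal, no media server, and a separate backend for user profiles. Even at this level of simplification, no single JS library covers all of it.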

When I get such jumbled questions, it feels like there’s a bit of a misunderstanding of what WebRTC is and about how the ecosystem of vendors and services has evolved around it.

Want to learn more about WebRTC?

There are several things to do at this point if you need to grok WebRTC:

  1. Read this article on learning WebRTC for more suggestions
  2. Read my WebRTC for Business People report (it is free)
  3. Learn how I think about WebRTC requirements
  4. Take the first module of my WebRTC training (it’s free)
  5. Join me for the webinar tomorrow – I’ll talk about Mesh, MCU and SFU media architectures

Responses

Philipp Hancke says:
February 11, 2019

So what did you answer?

Igor says:
February 11, 2019

My answer for SIP is JsSIP, or my JsSIP wrapper webrtcdemo.audiocodes.com/sdk 😉

    Tsahi Levent-Levi says:
    February 11, 2019

    If you’re using SIP infrastructure, then sure (and I know you guys do at Audiocodes).

    If you’re looking to make this a starting point for tackling the problem of what technology stack you need, then you’re doing it wrong.

Gavin Henry says:
February 11, 2019

JsSIP for me, not SIPjs. Proper open source!

Ochui Princewill says:
February 11, 2019

I stumbled on http://www.kurento.org while searching for a WebRTC framework

    Tsahi Levent-Levi says:
    February 12, 2019

    Ochui – thanks for sharing.

    I think this is where a lot of the confusion lies – Kurento is a media server framework. It does have some rudimentary signaling of its own, but I wouldn’t pick it for signaling.

    What you need in your feature set greatly affects what frameworks and projects you should use.

Silvia Pfeiffer says:
February 12, 2019

I think you might have missed what they are looking for. They are looking for a solution that they can just drop into their application and that provides video or voice chat without them having to do any deep development work and without needing to understand WebRTC.

There are two solutions: get a company like WebRTC Ventures to develop and maintain that component for you – or use something like our Coviu API to embed video rooms into your app.

Coviu is an embedding based solution but unlike YouTube, all the hard server work associated with WebRTC is done for you. Yes, we’re a telehealth company, but our video interface can be used without medical tools, custom branded and embedded in other apps through an API.

    Tsahi Levent-Levi says:
    February 12, 2019

    Silvia,

    Thanks – some people do want such a thing, but in many cases, they end up unhappy as they have some specific requirements/needs that aren’t met. There’s a lot of variety out there in what people want and mean when they say something like “WebRTC JS library”.
