Common (beginner) mistakes in WebRTC

12/08/2019

WebRTC can be hacked-away with great results. Often though, this leads to sub-par experiences.

WebRTC as a VoIP technology is the best thing ever. It “democratizes” this whole domain, taking it from the hands of experts into the hands of the masses of developers out there. Slapping a bit of code and seeing real time video is magical. And we’re now starting to add it to more and more businesses using web technology.

While this all seems easy now (and it is a lot easier than it used to be before WebRTC), there are a few mistakes that many beginners make in WebRTC. And to be honest, these mistakes are not only made by beginners. That is why I am sharing a couple of common (beginner) mistakes in WebRTC that I’ve seen for a couple of years now.

1. Using an outdated signaling server (from github)

This happens all too often. You start by wanting to build something, you search github, you pick a project, and with WebRTC – it just doesn’t work. It might for the simple scenario, but it won’t handle edge cases, or scale nicely, or accommodate the more complex thing you’re thinking about.

The truth is, that today, there’s no single, goodly, off the shelf, out of the box, readymade, pure goodness, open source, signaling server for WebRTC that you can use. Sorry.

There might be a few contenders, but I haven’t seen any specific project that everyone’s using (unlike TURN for example, where coturn definitely rulz). The sadder truth? SFUs offer better signaling than signaling servers with WebRTC (and almost always I’d suggest against using their signaling directly in front of your WebRTC client).

2. Mis-configuring NAT traversal

This should have been trivial by now, but apparently it isn’t.

A few rules of thumb:

  1. Don’t. Rely. On. Google. Public. STUN
  2. Don’t use free github STUN and TURN server lists
  3. Don’t decide not to deploy TURN because your server has a public IP address
  4. And then a few more

This is such a basic and common mistake that I even created a free video course for it: Effectively Connecting WebRTC Sessions.
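To make the first three rules concrete, here is a minimal sketch of a peer connection configuration that points at your own STUN/TURN deployment. The helper name `buildIceConfig`, the host `turn.example.com`, the credentials and the port choices (3478 for STUN/TURN, 443 for TURN over TLS) are all assumptions for illustration; use whatever your own server (coturn, for example) is actually configured with. The two interfaces mirror the browser’s `RTCIceServer` and `RTCConfiguration` dictionaries so the sketch compiles without DOM typings.

```typescript
// Local copies of the relevant WebRTC dictionary shapes. In a browser these
// correspond to RTCIceServer and RTCConfiguration.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

interface IceConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical helper: builds a configuration that uses your own STUN/TURN
// server instead of a public or borrowed one.
function buildIceConfig(
  turnHost: string,
  username: string,
  credential: string,
  relayOnly: boolean = false
): IceConfig {
  return {
    iceServers: [
      // STUN from your own server (assumed port: 3478).
      { urls: [`stun:${turnHost}:3478`] },
      // TURN over UDP, TCP and TLS, so restrictive networks still have a path.
      {
        urls: [
          `turn:${turnHost}:3478?transport=udp`,
          `turn:${turnHost}:3478?transport=tcp`,
          `turns:${turnHost}:443?transport=tcp`,
        ],
        username,
        credential,
      },
    ],
    // "relay" forces all media through TURN, which is handy for checking
    // that the TURN server really works.
    iceTransportPolicy: relayOnly ? "relay" : "all",
  };
}

// Placeholder values; in a browser you would pass the result to
// new RTCPeerConnection(config).
const config = buildIceConfig("turn.example.com", "demo-user", "demo-secret");
```

Passing `true` as the last argument gives a relay-only configuration, a quick way to verify that a session still connects when TURN is the only option.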

3. Testing locally

This one’s also basic.

Locally, things tend to work well. Partly that is due to a different network configuration, but also due to the fairy dust that I am sure you sprinkle over your local router (I know I do every morning).

Once you go to the real world with real networks, things tend to break.

Test in the real world and not on your machine using 2 tabs or, being professional, from a Chrome tab to a Firefox tab.

The real world is messy and messy isn’t healthy for naive deployments.

Need help with automating that? Look at testRTC, but don’t neglect real world testing.
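One reason local tests are so forgiving is that two tabs on the same machine can often connect using host candidates alone, so STUN and TURN never get exercised. A minimal sketch, assuming you collect the candidate strings your client gathers (in a browser, `event.candidate.candidate` from the `icecandidate` event), is to count them by type and see whether any relay candidates show up at all. The function names and the sample candidate lines below are made up for illustration.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Extracts the "typ <type>" field from an ICE candidate line.
function candidateType(candidate: string): CandidateType | null {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidate);
  return match ? (match[1] as CandidateType) : null;
}

// Counts gathered candidates by type; lines without a known type are ignored.
function summarizeCandidates(
  candidates: string[]
): Record<CandidateType, number> {
  const counts: Record<CandidateType, number> = {
    host: 0,
    srflx: 0,
    prflx: 0,
    relay: 0,
  };
  for (const line of candidates) {
    const type = candidateType(line);
    if (type !== null) {
      counts[type] += 1;
    }
  }
  return counts;
}

// Made-up sample lines using documentation-only IP addresses.
const sample = [
  "candidate:1 1 udp 2122260223 192.0.2.10 54321 typ host generation 0",
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.0.2.10 rport 54321 generation 0",
  "candidate:3 1 udp 41885439 198.51.100.20 61000 typ relay raddr 203.0.113.7 rport 54321 generation 0",
];

const summary = summarizeCandidates(sample);
```

If a run in your test environment only ever yields `host` candidates, that session tells you little about how the service will behave behind real NATs and firewalls.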

4. Not using adapter.js

WebRTC is a great specification but it is rather new.

This means that:

  • Different browsers are going to behave a bit differently
  • Browser implementations are somewhat buggy
  • Different versions of the same browser act differently

And I haven’t even started about getting WebRTC browser implementations to be spec compliant with 1.0.

This all boils down to you having to work out a strategy for where in your code all those if/then directives that deal with these differences take place.

The alternatives?

  1. Whenever you find such an issue, add that if/then statement in the code directly (the most common approach, albeit not really smart in the long term)
  2. Create a shim/polyfill/whatever you want to call it, where you do all these if/then thingies (great, but not easy to maintain)
  3. Just use adapter.js

Guess which one I prefer?
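To make the maintenance cost of alternative 2 concrete, here is a minimal sketch of a hand-rolled shim for just one API, `getUserMedia`. It is not adapter.js’s actual code; the type and function names are hypothetical, and it covers only the modern promise-based call plus the old prefixed, callback-based `webkitGetUserMedia` and `mozGetUserMedia`. Every other browser difference would need the same treatment, which is what adapter.js (published on npm as `webrtc-adapter`) takes care of for you.

```typescript
type Constraints = { audio?: boolean; video?: boolean };

// Shape of the old prefixed, callback-based getUserMedia functions.
type LegacyGetUserMedia = (
  constraints: Constraints,
  onSuccess: (stream: unknown) => void,
  onError: (error: unknown) => void
) => void;

// Minimal stand-in for the parts of `navigator` this sketch looks at.
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: (c: Constraints) => Promise<unknown> };
  webkitGetUserMedia?: LegacyGetUserMedia;
  mozGetUserMedia?: LegacyGetUserMedia;
}

type GetUserMedia = (c: Constraints) => Promise<unknown>;

// Returns a promise-based getUserMedia for whatever the browser offers,
// or null when no variant is available.
function resolveGetUserMedia(nav: NavigatorLike): GetUserMedia | null {
  const devices = nav.mediaDevices;
  const modern = devices?.getUserMedia;
  if (modern) {
    // Spec-compliant path: already returns a promise.
    return (c) => modern.call(devices, c);
  }
  const legacy = nav.webkitGetUserMedia ?? nav.mozGetUserMedia;
  if (legacy) {
    // Wrap the callback-style API in a promise.
    return (c) =>
      new Promise((resolve, reject) => legacy.call(nav, c, resolve, reject));
  }
  return null;
}
```

In a browser you would call `resolveGetUserMedia(navigator)`. With adapter.js, the usual approach is to import the package once and then write against the standard `navigator.mediaDevices.getUserMedia` API.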

5. Forgetting to take care of security

Two reasons for you to forget about security:

  1. WebRTC is secure, so why should you do anything more about it?
  2. Because your service doesn’t deal with payments or sensitive data so why bother?

Neither reason will lead you to a good place. In 2019, security is coming to the front, especially with communications. You can ask Zoom about it, or go check what Google’s Project Zero did recently.

Want a good starting point? I’ve got a WebRTC security checklist for you.

6. Assuming you can outsource it all

You can’t. Not really.

Need a design for a whitepaper? An article written? A WordPress website created? Find someone on Upwork, Fiverr or the slew of other websites out there and be done with it.

With WebRTC? Don’t even think about it.

WebRTC is ever-changing, which means that whatever you deploy today, you will need to maintain later. If you are outsourcing the work – some of it or all of it – assume this is going to be a long term relationship, and that for the most part, you may be able to outsource the development work but not the responsibility.

Going this route? Here are 6 things to ask yourself before hiring an outsourcing WebRTC vendor.

7. Diving into the code before grokking WebRTC

  1. Go to github.
  2. Pick a project.
  3. “Install” it.
  4. Run it.
  5. Fix a few lines of code.
  6. Assume you’re done.

No. WebRTC is much more complicated than that scenario above. There are a few different servers you’ll need to deploy and use, there’s geography sensitivity to consider, and lots of other things.

You need to understand WebRTC if you want to really use it properly. Even if all you’re doing is using a 3rd party.

Don’t make these mistakes!

Be sure to review these to see if there’s anything you’re doin’ wrong:

  1. Using an outdated signaling server (from github)
  2. Mis-configuring NAT traversal
  3. Testing locally
  4. Not using adapter.js
  5. Forgetting to take care of security
  6. Assuming you can outsource it all
  7. Diving into the code before grokking WebRTC

Check out my free WebRTC Basics course, or the bigger Advanced WebRTC one.

Responses

Nir Simionovich says:
August 12, 2019

Very much like VoIP and Telephony in the old days, WebRTC isn’t a simple technology – nor something a “web-developer” can simply slap some JavaScript code on and make work. Yes, it’s fairly easy to get something to work locally, but building for scale is a different story altogether.

The main thing that people tend to forget here is that WebRTC is a whole lot of different things and technologies – and getting them to work the way you want isn’t a trivial task.

One thing I don’t agree with in your writing is the following: “It “democratizes” this whole domain, taking it from the hands of experts into the hands of the masses of developers out there. Slapping a bit of code and seeing real time video is magical.” – I find that statement a little misleading.

If you would like to define “democratize” as: making it accessible – I agree. It doesn’t mean that it is as simple as writing a few lines of JavaScript. Adding voice and video to applications with ease is magical, indeed. But there is a whole lot of hard work behind it. If you go to our website (https://cloudonix.io), you will see at the bottom left side a “handset” icon. Click it and, following a short entry form, you will be connected directly to one of us. That “widget” seems incredibly simple, but behind it there is much “hard work”. Developers can use platforms like Cloudonix, Twilio, Nexmo and others to create various WebRTC based solutions – but they will be different. While WebRTC democratized the access to voice and video capabilities, it created new opportunities for vendors and service providers.

In general, I agree – WebRTC is not a place for script kiddies or ‘fly by night coders’, it’s a serious piece of technology that requires much care.

    Tsahi Levent-Levi says:
    August 12, 2019

    Nir,

    Thanks for that.

    I truly think WebRTC democratizes communication and takes it out of the hands of VoIP engineers. Many of the most interesting use cases I’ve seen were built because no VoIP engineer was involved in them.

    Knowing the technology and how to use it is different than what background is needed.

    I hope that clears what I tried to convey with “democratize” (which probably means accessible, so I guess we’re in agreement here).

Minh Tri says:
September 7, 2019

I am new to WebRTC. I’m now trying to build a livestream website using WebRTC. There is a lot of learning material about WebRTC on the Internet, but I can’t find any that gives an overview of the architecture of WebRTC when implementing it in practice.

From what I have found, these servers may be included in a WebRTC app:

– Signaling server for mutual understanding between peers.
– STUN server for exchange IP address between peers when NAT appears.
– TURN server is like a relay server that makes transitions of streams to peer in central server.
– ICE server helps peer find the way (using signaling or STUN, or TURN) to get information of other peers.

– Media server for processing media stream (encode, decode, multiplexing, or image processing) in server and output the processed stream to peers.

So to have a good livestream (low latency, reasonable price for infrastructure) system using WebRTC. What things do I need in system:

– Just signaling, turn, stun and ice servers.

– Or Both of them and media server (like Kurento).

– Or just media server only.

What’s the best choice for 1:1 (like Facetime) and 1:many (like Youtube livestreaming)?

And do I understand it right?

Thank you!

    Tsahi Levent-Levi says:
    September 7, 2019

    Minh,

    A few comments:
    * There’s no ICE server. There’s an ICE protocol
    * STUN is usually “packaged” with a TURN server (although not always)

    For 1:1 you don’t need a media server. You’ll need STUN, TURN and a signaling server
    For 1:many you will also need media servers

Minh Tri says:
September 9, 2019

Thank you for your reply.
I have other 2 questions:
1/ When using STUN, TURN and signaling, what are the factors that directly affect the quality and the latency of the call? Are they the bandwidth of each, server load capacity, or something else? And what are suitable solutions for each problem?

2/ Is it necessary to use media server in 1:1 video call to ensure the quality of the call? (I’m afraid it’s something low quality without media server).
