Struggling with WebRTC POC or demo development? Follow these best practices to save time and increase the success of your project.
I get approached by a lot of startups and developers who start on the path to building WebRTC applications. Oftentimes, they reach out to me when they can’t get their POC (Proof of Concept) or demo to work properly.
For those who don’t want to go through paid consulting, here are some best practices that can save you time and can considerably increase the success rate of your project.
Table of contents
Media servers and WebRTC. What can possibly go wrong?
I don’t want to delve here too much on peer to peer type solutions. These require no media server and due to that are “easier” to develop into a nice demo. The services that use media servers are the ones that are often more beefy and are also the ones that fall into many challenging traps during a POC development.
Media requires the use of ephemeral ports that get allocated dynamically. It needs to negotiate connections. There are more moving parts that can break and fail on you.
All of the following sections here include best practices that you should read before going on to implement your WebRTC demo. Best to use them during your design and planning phases.
An introduction to WebRTC media servers
Use CPaaS
Let’s start with the most important question of all. If you’ve decided to install and host media servers in AWS or other locations – are you sure this is an important part of your demo?
I’ll try to explain this question. A demo or a POC comes to prove a point. It can be something like “we want to validate the technical viability of the project” or “we wanted to have something up and running quickly to start getting real customers’ feedback”.
If what you want is to build an MVP (Minimal Viable Product) with the intent of attracting a few friendly customers, go to a VC for funding or just test the waters before plunging in, then be sure to do that using CPaaS or a Programmable Video solution. These are usually based on usage pricing so they won’t be expensive when you’re just starting out. But they will reduce a lot of the headaches in development and maintenance of the infrastructure – so they’re more than worth it.
Sometimes, what you will be after is a POC that seeks to answer the question “what does it mean to build this on our own”. Not only due to costs but mainly due to the uniqueness of the requirements desired – these may include the need to run in a closed network, connect to certain restricted components, etc. Here, having the POC not use CPaaS and rely on open source self hosted components will make perfect sense.
First have the “official” media server demo work
Decided not to use CPaaS? Picked a few open source media servers and components that you’ll be using?
Make sure to install, run and validate the demo application of that open source media server.
You should do this because:
- You need to know the open source component actually works as advertised
- It will get you acquainted with that component – its build system, documentation, etc
- Taking things one step at a time, which is discussed later on
Using a 3rd party? Install and run its demo first.
Don’t. Use. Docker
Docker is great. Especially in production. Well… that’s what I’ve been told by DevOps people. It makes deploying easier. It is great for continuous integration. It is fairy dust on the code developers write.
But for WebRTC media servers? It is hell on earth to get configured properly for the first time. Too many ports need to be opened all over the place. Some TCP. Lots of them UDP. And if you miss the configuration – the media won’t get connected. Or it will. Sometimes. Which is worse.
My suggestion? Leave all the DevOps fairy dust for production. For your POC and demo? Go with operating systems on virtual machines or on bare metal. This will save you a lot of headaches by making sure things will fail less due to not having ports opened properly on your Docker configuration(s).
You don’t have time to waste when you’re developing that WebRTC POC.
Don’t do native. Go web
Remember that suggestion about doing the full session for your demo so you know the infrastructure is built properly? If you need native applications on mobile devices – don’t.
The easiest way to develop a demo for WebRTC would be by using a web browser for the client side. I’d go farther and say by using Chrome web browser. Ignore Firefox and Safari for the initial POC. Skip mobile – assume these are a lot of work but won’t validate anything architecturally. At least not for the majority of application types.
Still need to go native and mobile? Here are your WebRTC mobile SDK alternatives
Use a 3rd party TURN service
Always always always configure TURN in your iceServers for the peer connections.
Your initial “hello world” moment is likely to take place on the local LAN or even on the same machine. But once you start placing the devices on different networks, things will start failing without TURN servers. To make sure you don’t get there, just have TURN configured.
And have it configured properly.
And don’t install and host your own TURN servers.
Just use a managed TURN service.
The ones I’d pick for this task are either Twilio or Cloudflare for this stage. They are easy to start with.
You can always replace them with your own later without any vendor lock-in risk. But starting off with your own is too much work and hassle and will bring with it a slew of potential bugs and blockers that you just don’t need at this point in time.
More on NAT Traversal and TURN servers in WebRTC
Be very specific about your requirements (and “demo” them)
Don’t assume that connecting a single user to a meeting room in a demo application means you can connect 20 users into that meeting room.
Streaming a webcam to a viewer isn’t the same as streaming that same webcam to 100 viewers.
If you plan on doing a real proof of concept, be sure to define the exact media requirements you have and to implement them at the scale of a different session. Not doing so means you aren’t really validating anything in your architecture.
A 1:1 meeting uses a different architecture than a 4-way video meeting which in turn uses a different architecture than a 20-50 participants in a meeting, which is different once you think about 100 or 200 participants, which again looks different architecturally when you’re hitting 1,000-10,000 and then… you get the point on how to continue from here.
The same applies for things like using screen sharing, doing spatial audio, multiple video sharing, etc. Have all these as part of your POC. It can be clunky and kinda ugly, but it needs to be there. You must have an understanding of if and how it works – of what are the limits you are bound to hit with it.
For the larger and more complex applications, be sure you know all of the suggestions in this article before coming to read it. If you don’t, then you should beef up your understanding and experience with WebRTC infrastructure and architecture…
Got a POC? Build it to scale for that single session you’re aiming for. I won’t care if you can do 2 of these in parallel or a 1,000. That’s also important, but can wait for later stages.
More on scaling WebRTC meeting sizes
One step at a time
Setting up a WebRTC POC is a daunting task. There are multiple moving parts in there, each with its own quirks. If one thing goes wrong, nothing works.
This is true for all development projects, but it is a lot more relevant and apparent in WebRTC development projects. When you start these exploration steps with putting up a POC or a demo, there is a lot to get done right. Configurations, ports, servers, clients, communication channels.
Taking multiple installation or configuration steps at once will likely end up with a failure due to a bug in one of these steps. Tracing back to figure out what was the change causing this failure will take quite some time, leading to delays and frustrations. Better to take one step at a time. Validating each time that the step taken worked as expected.
I earned that the hard way at the age of 22, while being the lead integrator of an important project the company I worked for had with Cisco and HP. I blamed a change that HP did on an issue we had with our VoIP implementation that lost us a full week. It ended up me… doing two steps instead of one. But that’s a story for another time.
Know your tooling
If you don’t know what webrtc-internals is and haven’t used dump-importer then you’re doing it wrong.
Not using these tools mean that when things go wrong (and they will), you’re going to be totally blind on why. These aren’t perfect tools, but they give you a lot of power and visibility that you wouldn’t have otherwise.
Here’s how you download a webrtc internals file:
You’ll need to do that if you want to view the results on fippo’s webrtc-dump-importer.
And if you’re serious about it, then you can read a bit about what the WebRTC statistics there really mean.
Now if you’re going to do this properly and with a budget, I can suggest using testRTC for both testing and monitoring.
Know more about WebRTC
Everything above will get you started. You’ll be able to get to a workable POC or demo. Is that fit for production? What will be missing there? Is the architecture selected the one that will work for you? How do you scale this properly?
You can read about it online or even ask ChatGPT as you go along. The thing is that a shallow understanding of WebRTC isn’t advisable here. Which is a nice segway to say that you should look at our WebRTC courses if you want to dig deeper into WebRTC and become skilled with using it.