10 Tips for Choosing the Right WebRTC Open Source Media Server Framework

By Tsahi Levent-Levi

June 19, 2017

Too many WebRTC open source projects. Not enough good ones.

There are many shapes and sizes of WebRTC servers. In this article, I want to focus on WebRTC media servers, and in this category, the open source alternatives.

Ever gone to GitHub to search for something you needed for your WebRTC project? Great. Today, there are almost as many WebRTC GitHub projects as there are WebRTC LinkedIn profiles.

Some of these code repositories really are popular and useful while others? Less so.

Here’s the most glaring example for me –

When you just search for WebRTC on github, and let it select the “Best match” by default for you, you’ll get PubNub’s sample of using PubNub as your signaling for a simple 1:1 video call using WebRTC. And here’s the funny thing – it doesn’t even work any longer. Because it uses an old PubNub WebRTC SDK. And that’s for an area that requires less of an effort from you anyway (signaling).

You can read more about this in my open source WebRTC media server github comparison.

Let’s assume you actually did find a WebRTC open source media server that you like (on github of course). How do you know that it is any good? Here are 10 different signals (not WebRTC ones) that you can use to make that decision.

Need to pick a WebRTC media server framework? Why not use my Free Media Server Framework Selection Worksheet when checking your alternatives?

1. Do You Grok the Code?

If you are going to adopt an open source media server for your WebRTC project then expect to need to dive into the code every once in a while.

This is something you’ll have to do either to get the darn thing to work, fix a bug, tweak a setting or even write the functionality you need in a plugin/add-on/extension or whatever name that media server uses for making it work.

In many of the cases I see when vendors rely on an open source WebRTC media server framework, they end up having to dig in its code, sometimes to the point of taking full ownership of it and forking it from its baseline forever.

To make sure you’re making the right decision – check first that you understand what you’re getting yourself into and try to grok the code first.

My own personal preference would be code that has comments in it (I know I have high standards). I just can’t get myself behind the notion of code that explains itself (it never does). So be sure to check if the non-obvious parts (think bandwidth estimation) are commented properly while you’re at it.

2. Is the Code Fresh?

Apple just landed with WebRTC. And yes. We’re all happyyyyy.

But now we all need to shift our focus to H.264. Including that WebRTC media server you’ve been planning to use.

Oh – and Google? They just announced they will be migrating slowly from Plan B to Unified Plan. Don’t worry about the details – it just changes the way multiple streams are handled. Affecting most group calling implementations out there.

And there was that getStats() API change recently, since the draft specification of WebRTC finally decided on the correct definition of it, which was different from its Chrome implementation.

The end result?

Code written a year or two ago has serious issues in actually running.

Without even talking about upgrades, updates, security patches, etc.

Just baseline “make it work” stuff.

When you check the GitHub page of the WebRTC media server you plan on adopting – make sure to look at when it was last updated. Bonus points if you check what that update was about and how frequently updates take place.
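
As a rough illustration of that check, the sketch below takes a list of commit dates (however you obtained them, for example copied from the project’s commit history) and reports how long ago the last update was and how many commits landed in the past year. The function name and the 90-day and 12-commit thresholds are assumptions made up for this example, not an established rule.

```python
from datetime import date, timedelta


def freshness_report(commit_dates, today, stale_after_days=90, min_commits_per_year=12):
    """Summarize how recently and how often a repository was updated.

    commit_dates: list of datetime.date objects, one per commit (any order).
    The two thresholds are illustrative assumptions, not an industry standard.
    """
    if not commit_dates:
        return {"days_since_last": None, "commits_last_year": 0, "looks_fresh": False}

    last = max(commit_dates)
    days_since_last = (today - last).days
    one_year_ago = today - timedelta(days=365)
    commits_last_year = sum(1 for d in commit_dates if d > one_year_ago)

    return {
        "days_since_last": days_since_last,
        "commits_last_year": commits_last_year,
        "looks_fresh": days_since_last <= stale_after_days
        and commits_last_year >= min_commits_per_year,
    }
```

The numbers only tell you that something is moving. You still need to read the recent commits to see whether they track browser changes or just fix typos in the README.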

3. Anyone Using It?

Nothing like making the same mistakes others are making.

Err… wrong expression.

What you want is a popular open source WebRTC media server. If a project isn’t popular, there’s a reason for it, and that reason won’t be that you found a diamond in the rough.

Go for a popular framework. One that is battle tested. Preferably used by big names. In production. Inside commercial products.

Why? Because it gives you two things you need:

Validation
Ecosystem

It gives you validation that this thing is worth something – others have already used it.

And it gives you an ecosystem of knowledge and experience around it. This can be leveraged sometimes for finding some freelancers who have already used it or to get assistance from more people in the “community”.

I wouldn’t pick a platform only based on popularity, but I would use it as a strong signal.

4. Is This Thing Documented?

What you get in a media framework for WebRTC is a sort of engine. Something that has to be connected to your whole app. To do that, you need to integrate with it somehow – either through its APIs when you link against it – or a lot more commonly these days through REST APIs (or GraphQL or whatever). You might need both if you plan on writing your own addon/extension to it.

And to do that, you need to know the interface that was designed especially for you.

Which means something written. Documentation.

When it comes to open source frameworks, documentation is not guaranteed to be of specific quality – it will vary greatly from one project to another.

Just make sure the one you’re using is documented to a level that makes it… understandable.

If possible, check that the documentation includes:

Some introductory material to the makeup and architecture of the project
An API reference
A few demos and examples of how to use this thing
Some information about installation, configuration, maintenance, scaling, …

The more documentation there is, the better off you will be a year down the road.

5. Is It Debuggable?

WebRTC is real time and real time is hard to debug.

It gets harder still if what you need to look at isn’t the signaling part but rather the media part.

I know you just LOVE adding your own printf and cout statements in your C++ code and trying to reproduce that nagging bug. Or better yet – start collecting PCAP files and… err… read them.

It would be nice though if some of that logging, debugging, etc. were available without you always having to add these statements into the code. Maybe even have a mechanism in place with different log levels – and have sensible information written into the logs so the time you’ll need to invest in finding bugs will be reduced.

Also – make sure it is easy to plug a monitoring system into the “application” metrics of the media server you are going to use. If there is no easy way to test the health of this thing in production, it is either not used in production or requires some modifications to get there. You don’t want to be in that predicament.
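
To make the monitoring point concrete, here is a minimal sketch of the kind of health evaluation a monitoring system could run against a media server’s application metrics. The metric names (cpu_percent, active_sessions, packet_loss_percent) and the limits are hypothetical. A real media server exposes its own set of metrics, if it exposes any at all.

```python
# Hypothetical metric names and limits, for illustration only.
DEFAULT_LIMITS = {
    "cpu_percent": 80.0,
    "active_sessions": 100,
    "packet_loss_percent": 5.0,
}


def evaluate_health(metrics, limits=DEFAULT_LIMITS):
    """Return (healthy, problems) for one snapshot of server metrics.

    A metric that is missing from the snapshot is reported as a problem:
    a server that cannot report its state is hard to trust in production.
    """
    problems = []
    for name, limit in limits.items():
        if name not in metrics:
            problems.append(f"{name}: not reported")
        elif metrics[name] > limit:
            problems.append(f"{name}: {metrics[name]} exceeds limit {limit}")
    return (not problems, problems)
```

If you cannot even fill in a snapshot like this from what the server gives you out of the box, that is the modification work you will be signing up for.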

While at it – make sure the code itself is well documented. There’s nothing as frustrating (and as stupid) as the self explanatory code notion. People – code can’t explain itself – it does what it does. I know that the person who wrote that media server was god incarnate, but the rest of us aren’t. Your programmers are excellent, but trust me – not that good. Pick something maintainable. Something that is self explanatory because someone took the time to write some damn good comments in the tricky parts of the code. I know this one is also part of grokking the code, but it is that important – check it twice.

For me, the ability to debug, troubleshoot and find issues faster is usually a critical one for something that is going to get into my own business. Just a thought.

6. Does it Scale?

Media servers are resource hogs (check this video mini series for a quick explanation).

This means that in all likelihood, if your business becomes successful in any way, you will need more than a single media server to run at “scale”.

You can always crank it up on Amazon AWS from m4.large up to m4.16xlarge, but then what’s next?

At the end of the day, scaling of media servers comes down to adding more machines. Which is simple enough until you start dealing with group calls.

Here’s an example.

Assume a single machine can handle 100 participants, split into any group type (I am simplifying here)
And we have 10 participants on average in each call
Each group call can have anywhere from 2 participants up to… 50 participants

Now… how do we scale this thing?

How many machines do we need to put out there? When do we decide that we don’t add new calls into a running machine? Do we cascade these machines when they are fully booked? Throw out calls and try to shift them to other machines?

I don’t know, but you should. At least if you’re serious about your product.
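
A cheap way to start exploring these questions is to simulate a placement policy before you commit to one. The sketch below uses the numbers from the example (100 participants per machine, calls of up to 50) and a simple first-fit rule that can optionally reserve room for every call to grow to its maximum size. The policy and all the names are assumptions for illustration, and it ignores cascading and moving calls between machines.

```python
MACHINE_CAPACITY = 100  # participants per machine, from the example above
MAX_CALL_SIZE = 50      # largest group call we allow


def place_calls(call_sizes, reserve_for_growth=True):
    """Assign calls to machines with a first-fit rule.

    call_sizes: current number of participants in each call.
    If reserve_for_growth is True, every call is budgeted at MAX_CALL_SIZE,
    so a call can never outgrow its machine (at the cost of idle capacity).
    Returns a list of machines, each a list of the call sizes placed on it.
    """
    machines = []  # calls placed on each machine
    budgets = []   # capacity already budgeted on each machine
    for size in call_sizes:
        if not 2 <= size <= MAX_CALL_SIZE:
            raise ValueError(f"call size {size} is outside 2..{MAX_CALL_SIZE}")
        cost = MAX_CALL_SIZE if reserve_for_growth else size
        for i, used in enumerate(budgets):
            if used + cost <= MACHINE_CAPACITY:
                machines[i].append(size)
                budgets[i] += cost
                break
        else:
            # No existing machine has room, so start a new one.
            machines.append([size])
            budgets.append(cost)
    return machines
```

With ten calls of 10 participants each, packing by current size fits everything on a single machine, while reserving room for growth needs five. The real answer sits somewhere between those two extremes, and finding it is your job.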

You probably won’t find the answers to this in the open source WebRTC media server’s documentation, but you should at least make sure you have some reasonable documentation of how to run it at scale, and not as a one-off instance only.

7. What Languages Does it Use?

They wrote it in Kotlin.

Because that’s the greatest language ever. Here. Even Google just made it an official one for Android.

What can go wrong?

Two things I look for in the language being used:

That the end result will be highly performant. Remember it’s a resource hog we’re dealing with here, so better start with something that is even remotely optimized
That it is a language I know how to use and that my developers have covered

Some of these things are Node.js based while others are written in Java. Which one are your developers more comfortable with? Which one fits better with your technology plans for your company in the next 5 years?

If you need to make a decision between two media servers and the main difference between them for you is the language – then go with the one that works better for your organization.

8. Does It Fit Your Signaling Paradigm?

Three things you need in your WebRTC product:

Signaling server
STUN/TURN server
Media server

That 3rd one isn’t mandatory, but it is if you’re reading this.

That media server needs to interact with the signaling server and the STUN/TURN server.

Sure, you can use the signaling capabilities of the media server, but they aren’t really meant for that, and my own suggestion is not to put the media server publicly out there for everything – have it controlled internally in your service. Anything else just doesn’t make architectural sense to me.

So you’ll need to have it interacting with your signaling server. Check that they both share similar paradigms and notions, otherwise, this in itself is going to be quite a headache.

While at it – check that there’s easy integration between the media server you’re selecting and the STUN/TURN server that you’ve decided to use. This one is usually simple enough, but just make sure you’re not in for a surprise here.
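
As one example of what that integration can look like: several TURN servers (coturn among them) support time-limited credentials derived from a secret shared with your backend, so your signaling side can hand out short-lived TURN logins without the TURN server keeping a user database. The sketch below follows that scheme as I understand it (the username is an expiry timestamp plus a user id, the password is a Base64-encoded HMAC-SHA1 of the username). The function name, the host turn.example.com and the secret are placeholders, so verify the exact format against the documentation of the TURN server you picked.

```python
import base64
import hashlib
import hmac
import time


def turn_ice_server(shared_secret, user_id, ttl_seconds=3600, now=None,
                    host="turn.example.com"):  # placeholder host
    """Build an iceServers-style entry with a time-limited TURN credential.

    Assumed scheme: username = "<expiry unix time>:<user id>",
    credential = Base64(HMAC-SHA1(shared_secret, username)).
    """
    now = int(time.time()) if now is None else int(now)
    username = f"{now + ttl_seconds}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(), hashlib.sha1).digest()
    return {
        "urls": [f"turn:{host}:3478"],  # 3478 is the default STUN/TURN port
        "username": username,
        "credential": base64.b64encode(digest).decode(),
    }
```

Whoever creates sessions on the media server (usually your signaling server) needs to pass an entry like this to both the browser and the media server, which is exactly the kind of plumbing worth checking before you commit.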

9. Is the License Right For You?

BSD? MIT? APL? GPL? AGPL?

What license does this open source WebRTC media server framework come with exactly?

Interestingly, some projects switch their license somewhere along the way. Jitsi did it after its acquisition by Atlassian – moving from LGPL to a more permissive APL.

What your business model looks like and the way you plan to deploy the service are going to affect the types of open source licenses you will want, and will be able, to adopt inside your service.

There are different types of free when it comes to open source licenses.

Every piece of code you pick and use needs to adhere to these internal requirements and constraints that you have, and that also includes the media server. Since the media server isn’t something you’ll be swapping in and out like a used battery, my suggestion is to pick something that comes with a permissive open source license – or go for a commercial product instead (it will cost you, but will solve this nagging issue).
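
A simple way to enforce such a constraint is to keep an allow-list of license families and check every dependency, media server included, against it. The grouping below reflects the usual permissive versus copyleft distinction. The function name and the default policy of allowing only permissive licenses are assumptions for illustration.

```python
# Common license families; extend the table with whatever your projects use.
LICENSE_FAMILY = {
    "MIT": "permissive",
    "BSD": "permissive",
    "Apache-2.0": "permissive",
    "LGPL": "weak copyleft",
    "GPL": "strong copyleft",
    "AGPL": "strong copyleft",
}


def license_allowed(license_name, allowed_families=("permissive",)):
    """Return True if the license belongs to a family your policy allows.

    Unknown licenses are rejected so that someone has to look at them.
    """
    family = LICENSE_FAMILY.get(license_name)
    return family is not None and family in allowed_families
```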

I’ve written about open source licenses before – check it out as well.

10. Is Anyone Offering Paid Support For It?

Yes. I know.

You’re using open source to avoid paying anyone.

And yet. If you check many of the successful and well maintained open source projects – especially the small and mid-sized ones – you will see a business model behind them. A way for those who invest their time in it to make a living. In many cases, that business model is in support and custom development.

Having paid support as an option means two things:

Someone is willing to take ownership and improve this thing, and is doing it as a day job – and not as a hobby
If you need help ASAP – you can always pay to get it

If no one is offering any paid support, then who is maintaining this code and to what end? What would encourage them to continue with this investment? Will they just decide to stop investing in it next month? Last month? Next year?

Making the Decision

I am not sure about you, but I love making decisions. I really do.

Taking in the requirements and the constraints, understanding that there are always unknowns and partial information. And from there, distilling a decision. A decisive selection of what to go with.

You can find more technical information on media servers in this great compilation made by Gustavo Garcia.

After you take the functional requirements that you have, and find a few suitable open source WebRTC media frameworks that can fit these requirements, go over this list.

See how each of them addresses the points raised. It should help you get to the answer you are seeking about which framework to go with.
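
If you want to turn that pass over the list into numbers, a small weighted score per candidate is enough. The sketch below assumes you rate each of the ten signals from 0 to 5 yourself. The signal keys, the default weight of 1 and the candidate names in the usage note are made up for illustration.

```python
SIGNALS = [
    "code_readability", "freshness", "adoption", "documentation", "debuggability",
    "scalability", "language_fit", "signaling_fit", "license", "paid_support",
]


def weighted_score(ratings, weights=None):
    """Combine 0-5 ratings for the ten signals into a single 0-5 score.

    ratings: dict mapping every signal name to a rating between 0 and 5.
    weights: optional dict of relative weights; missing signals default to 1.
    """
    weights = weights or {}
    total = 0.0
    weight_sum = 0.0
    for signal in SIGNALS:
        rating = ratings[signal]  # a KeyError here means a signal was not rated
        if not 0 <= rating <= 5:
            raise ValueError(f"{signal}: rating {rating} is outside 0..5")
        w = weights.get(signal, 1.0)
        total += w * rating
        weight_sum += w
    return total / weight_sum


def rank(candidates, weights=None):
    """Return candidate names sorted from best to worst score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )
```

Calling rank({"server_a": ratings_a, "server_b": ratings_b}, weights={"license": 3}) would, for instance, make the license count three times as much as any other signal. The score won’t make the decision for you, but it forces you to write down what matters to you and why.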

Oh, and remember that media servers are just one piece of the puzzle of the WebRTC servers you will need in your application.

Towards that goal, I also created a Media Server Framework Selection Sheet. Use it when the need comes to select an open source WebRTC media server framework for your project.


Ronan Leonard says:

June 29, 2017 at 6:14 am

Thanks Tsahi,

I’ve been struggling with the implementation of webRTC for our mastermind platform http://www.eccountability.io and this really helps.

mekya says:

July 1, 2017 at 2:40 pm

Another open source media server -> Ant Media Server
http://antmedia.io

1. Tsahi Levent-Levi says:
  
  July 1, 2017 at 10:27 pm
  
  Thanks. Bumped into your project about a month ago.
  
RalucaGrig says:

July 12, 2017 at 3:09 pm

Great article! Thank you for sharing.
I have encountered lots of open libraries (wowza, red5), but one stuck with me because of its interactive interface and easy-to-use design. I am referring to Streamaxia’s OpenSDK (both for iOS and Android: https://www.streamaxia.com/opensdk-ios-rtmp-library/), an impressive live streaming library! You should definitely check it out.
Happy streaming!

1. Tsahi Levent-Levi says:
  
  July 24, 2017 at 12:23 pm
  
  Thanks for sharing.
  
  I’ll see if there’s room to add Streamaxia in my next version of this landscape.
  
Heyan says:

December 8, 2017 at 5:03 pm

I think Kurento could be the best choice for a WebRTC media server

1. Tsahi Levent-Levi says:
  
  December 10, 2017 at 8:13 am
  
  There has been very little activity around Kurento in the past year. My suggestion these days is usually to stay clear of Kurento. It made almost no changes to its code in the last year, and WebRTC implementations in browsers have changed considerably during that same time.
  
  1. tiger says:
    
    December 28, 2017 at 4:07 am
    
    Do you have any suggestion?
    
    1. Tsahi Levent-Levi says:
      
      December 28, 2017 at 6:28 am
      
      These days, my suggestion is to choose between either Jitsi or Janus.
      
  2. Gopal says:
    
    July 3, 2018 at 12:44 pm
    
    You should look out for OpenVidu server, which I think the same Kurento team has built on top of the Kurento stack. It incorporates support for the latest versions of most browsers, including support on iOS and Android.
    
    1. Tsahi Levent-Levi says:
      
      July 3, 2018 at 5:30 pm
      
      Thanks for sharing Gopal. It is still rather new, so until I hear more about it from the discussions I have with companies, I am staying somewhat hesitant.
      
      1. Moyeen says:
        
        July 28, 2018 at 6:13 pm
        
        So what do you think? We are actually using OpenVidu. Is Jitsi or Janus good? And what about the Android implementation?
      2. Tsahi Levent-Levi says:
        
        July 29, 2018 at 12:37 pm
        
        Moyeen,
        
        OpenVidu is a layer implemented on top of Kurento. It is still early days to know where it is going and how popular it will be.
Daiyrbek says:

February 12, 2020 at 4:58 pm

Are there any paid services so I can trust all that stuff to them and pay monthly or something?

1. Tsahi Levent-Levi says:
  
  February 13, 2020 at 4:33 pm
  
  That would be CPaaS vendors. Look at these alternatives: https://bloggeek.me/webrtc-paas-report/
  
Paul Cheng says:

March 16, 2020 at 2:21 pm

Where can I rent a STUN/TURN server for my WebRTC project? Regards.

1. Tsahi Levent-Levi says:
  
  March 16, 2020 at 4:31 pm
  
  XirSys and Twilio offer these:
  
  * https://www.twilio.com/stun-turn
  * https://xirsys.com/
  
Rosen says:

April 7, 2020 at 10:08 pm

Many thanks for the article, do you have the same suggestion these days (Jitsi or Janus)?

1. Tsahi Levent-Levi says:
  
  April 7, 2020 at 11:48 pm
  
  It ends up depending a lot on the use case. Both are fine alternatives. I’d throw mediasoup into that same mix while at it.
  
Rosen says:

April 8, 2020 at 12:25 am

Many thanks for your quick reply. We are planning to quickly develop a TeleHealth application (Covid 19 context). Do you have any suggestion for this use case?

1. Tsahi Levent-Levi says:
  
  April 8, 2020 at 2:25 pm
  
  Obviously, that is not nearly enough to give a specific answer. All alternatives are fine at that level of granularity.
  
  Pasting your requirements document here won’t help, so no need trying 😉
  
Mohammad Arsalan says:

April 13, 2020 at 5:59 pm

I have an existing application using WebRTC for video calls. Now I want to add video recording. Can you suggest the best media server for that?

1. Tsahi Levent-Levi says:
  
  April 13, 2020 at 6:03 pm
  
  Mohammad,
  
  Any of the popular ones will work. You should also check the client-side recording alternative. Read this for more information: https://bloggeek.me/recording-webrtc-sessions/
  
Rushan Vijayanga says:

April 18, 2020 at 10:26 am

Hi, we are looking for an open source project based on WebRTC that offers video conferencing (group calls) and screen sharing, with Android, iOS and web support. Please suggest a suitable project. Thank you

1. Tsahi Levent-Levi says:
  
  April 18, 2020 at 2:09 pm
  
  Rushan,
  
  There’s no one size fits all and the requirements you’ve listed aren’t specific enough or well defined. This doesn’t allow anyone to say what the most suitable solution for you is.
  
  Writing back and forth here on the comments adding a few features won’t really help either. Your best bet is to learn and understand WebRTC to a level that allows you to make an informed decision (there’s enough material both free and paid here and elsewhere to get you going with it) or to hire someone to help you out with that decision.
  
raheem says:

May 4, 2020 at 10:33 am

Hi Tsahi,

We have implemented a video conference application using the RTCMultiConnection library, which uses core WebRTC components. It is working between browsers, iOS and Android clients. Now we are planning to add a media server for scaling purposes. Could you please suggest a media server that can use our existing signaling process? Thanks for your great support, I really appreciate it.

1. Tsahi Levent-Levi says:
  
  May 7, 2020 at 11:02 am
  
  To the best of my knowledge, mediasoup is the easiest to use with an existing signaling protocol/application in place.
  
  Janus shouldn’t be too hard to integrate with either.
  
Aidar says:

May 12, 2020 at 9:02 am

Hi Tsahi,
Thanks for your interesting posts,
what can you say about medooze media server, do you have experience with it?
https://github.com/medooze/media-server

1. Tsahi Levent-Levi says:
  
  May 12, 2020 at 9:12 am
  
  Aidar,
  
  I have no experience with Medooze. From what I understand by checking around, there’s a steep learning curve for using Medooze. CoSMo Software use it in some of the projects they go into, probably because the original lead developer of Medooze works with/at CoSMo. Finding developers who are experienced with it and can maintain it later on is something I haven’t seen done.
  
  This doesn’t mean that it can’t be done or even that it is hard to do – just that I haven’t had a conversation with anyone who did.
  
Cindy says:

May 25, 2020 at 7:16 pm

Hi Tsahi, thanks for the insightful article! We’re currently looking at BigBlueButton because its moderation ability stands out. What is your view on this open-source project?

1. Tsahi Levent-Levi says:
  
  May 25, 2020 at 7:22 pm
  
  Cindy,
  
  I’ve got no solid opinion about it (good or bad). Haven’t seen many who use it to develop use cases. Maybe a handful that just deploy it on a small installation.
  