Different Requirements of Scaling real time video

By Tsahi Levent-Levi

February 22, 2016

There’s scaling and then there’s scaling.

Last week’s post about the future of WebRTC live broadcast drew some interesting reactions, both in comments on the post and on Facebook. Red5 even wrote a follow-up post about it.

One thing that was missing from these comments is an understanding of what scale means, or rather of the different types of scaling that are required when it comes to real time video.

Here are a few different aspects of scaling real time video.

#1 – Streams per machine

This is something that was raised in one of the comments on Facebook:

Most of the SFUs out there can actually handle 100’s and even 1000’s of connections (our data is not public but look at JVB:https://jitsi.org/Projects/JitsiVideobridgePerformance) and with most of them it should be possible without much effort to configure multiple SFUs in cascade to scale almost without any limit in my opinion.

That answers the question: how many parallel sessions can you conduct on a single machine?

What is this one good for?

When you know how many sessions / streams you plan on having, you can then calculate how many machines you’ll need to run that scenario. From there, it is easier to extrapolate costs.
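As a rough illustration of that calculation, here is a minimal sketch. The function name, the headroom parameter and the numbers in the usage note are all made up for the example; they are not measurements of any real SFU.

```python
def machines_needed(total_streams: int, streams_per_machine: int,
                    headroom_pct: int = 0) -> int:
    """Machines required to carry total_streams.

    headroom_pct is the (hypothetical) share of each machine kept free,
    as a whole percentage, so that a machine never runs at 100%.
    """
    if total_streams < 0 or streams_per_machine < 1:
        raise ValueError("stream counts must be positive")
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be between 0 and 99")

    # Streams we actually allow on one machine after reserving headroom.
    usable = streams_per_machine * (100 - headroom_pct) // 100
    if usable < 1:
        raise ValueError("headroom leaves no usable capacity")

    # Integer ceiling division: a partly filled machine still costs a machine.
    return -(-total_streams // usable)
```

With an assumed 1,000 streams per machine, `machines_needed(50_000, 1_000)` gives 50, and reserving 20% headroom with `machines_needed(50_000, 1_000, 20)` gives 63. Multiply by whatever a machine costs you per month to get a first cost estimate.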

But that’s not our only vector of scale.

#2 – Streams per session

How many streams can we “bundle” per session?

What the comment above failed to mention is that these tests of 100’s and 1000’s of connections were run with no more than 33 streams in each session. So if what I want is to broadcast a singer live to 1000’s of viewers in real time, this SFU solution won’t suit my needs.

It is nice to be able to do multiparty video or to broadcast live with low latency, but always ask yourself – what’s the upper limit here for this single session? How many participants can I cram into that session without making things impossible on my infrastructure?

There are, in general, two critical challenges here:

When the number of users per session grows, the amount of communication between peers should be limited. At the extreme, a broadcaster should not be harassed by viewers directly (which is where the SFU starts breaking at scale, and why I assume Jitsi preferred not to test above 33 participants).
When the number of users per session grows beyond a single machine, how does that compute? You’ll need to distribute the session somehow, either by cascading or by some other means of architectural magic.
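To get a feel for what cascading implies, here is a simplified sketch that sizes a tree of SFUs for a single broadcast. It assumes every SFU can forward the stream to at most `fanout` receivers (viewers or downstream SFUs) and that only the edge tier serves viewers. Both are assumptions for the example, and `cascade_plan` is a hypothetical helper, not part of any real media server.

```python
def cascade_plan(viewers: int, fanout: int) -> list[int]:
    """SFU count per tier, from the root tier down to the edge tier."""
    if viewers < 1:
        raise ValueError("need at least one viewer")
    if fanout < 2:
        raise ValueError("fanout below 2 can never grow the tree")

    tiers = []
    # Edge tier: enough SFUs so that each serves at most `fanout` viewers.
    nodes = -(-viewers // fanout)
    tiers.append(nodes)

    # Each tier above must feed every SFU of the tier below it,
    # until a single root SFU receives the broadcaster's stream.
    while nodes > 1:
        nodes = -(-nodes // fanout)
        tiers.append(nodes)

    tiers.reverse()
    return tiers
```

Borrowing 33 purely as a placeholder fanout, `cascade_plan(1_000, 33)` returns `[1, 31]` (32 machines in two tiers) and `cascade_plan(100_000, 33)` returns `[1, 3, 92, 3031]`. Every extra tier also adds a hop of latency, which this sketch does not model.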

It is also worth pointing out that the larger the group, the more fragmentation issues you’ll have across parallel sessions. If the size of a session is dynamic, then on what kind of a machine should you start it? One which is free or one which is already somewhat busy? Can you dynamically route a session to other machines when the need arises? How do you load balance this?
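The "free or somewhat busy" question can be made concrete with two textbook placement policies. The sketch below is only an illustration: the machine names, the capacity figure and the idea that you know a session's expected size up front are all assumptions.

```python
from typing import Optional


def pick_machine(loads: dict[str, int], capacity: int,
                 expected_streams: int, policy: str = "spread") -> Optional[str]:
    """Choose a machine for a new session, or None if nothing fits.

    loads maps a machine name to the streams it currently carries.
    "spread" picks the least loaded machine that fits (room to grow);
    "pack" picks the most loaded machine that fits (less fragmentation).
    """
    fits = {name: load for name, load in loads.items()
            if load + expected_streams <= capacity}
    if not fits:
        return None  # caller would have to start a machine or reject
    if policy == "spread":
        return min(fits, key=fits.get)
    if policy == "pack":
        return max(fits, key=fits.get)
    raise ValueError(f"unknown policy: {policy}")
```

For `loads = {"a": 900, "b": 400, "c": 0}` with a capacity of 1,000 and a session expected to reach 200 streams, "spread" picks `c` and "pack" picks `b`. Neither policy helps if the session outgrows the estimate, which is where dynamic rerouting or cascading comes in.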

#3 – Failure diffusion

This one is related because the higher the scale and capacity, the more of an issue this will be.

Let’s assume we can get a machine to run 10,000 streams in parallel. I am optimistic today. Let’s also assume that this all happens in a single process running in our machine.

What happens if there’s a bug somewhere (and believe me – there already is), which happens to cause the system to crash? Whenever we hit the bug, 10,000 streams get disconnected.

Now let’s further assume that each session holds 10 streams on average. And the bug was invoked due to one of these streams doing something slightly unorthodox. Now we have one session causing the disconnection of 999 more sessions on that machine.

Which leads us to the question –

Can I run multiple processes on the same machine, each catering to a smaller number of sessions? Maybe even only a single session? How does that impact memory and performance? Is it even desirable?
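The arithmetic behind that question is easy to sketch. The function below assumes sessions are spread evenly across processes and that a session never spans two processes; both are simplifications, and `blast_radius` is a name invented for this example.

```python
def blast_radius(total_streams: int, streams_per_session: int,
                 processes: int) -> int:
    """Other sessions dropped when one session crashes its process."""
    if streams_per_session < 1 or processes < 1:
        raise ValueError("streams_per_session and processes must be >= 1")

    sessions = total_streams // streams_per_session
    # Worst case: the crashing process holds the largest share of sessions.
    sessions_per_process = -(-sessions // processes)
    # Do not count the session that triggered the bug itself.
    return max(sessions_per_process - 1, 0)
```

With the numbers from the text, `blast_radius(10_000, 10, 1)` is 999. Splitting the same machine into 10 processes brings it down to 99, and one process per session (1,000 processes) brings it to 0. What the sketch does not capture is the memory and CPU overhead of running that many processes, which is the other half of the trade-off.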

For some, this might be necessary in their architecture – and it is very far from how telecom services are architected…

When Talking About Scaling…

Make sure you refer to the specific aspects you wish to scale.

Roey says:

October 27, 2017 at 11:33 am

lots of questions, less of answers 🙂
i hope in the near future the main vendors will give us some support for that stuff…

1. Tsahi Levent-Levi says:
  
  October 28, 2017 at 5:38 pm
  
  Always the case with new technologies.
  
  From the Kranky Geek event we just had yesterday, I can say that there’s a lot of interest in getting the scale up, and there are many who work now on cascading capabilities.
  