SVC stands for Scalable Video Coding.
SVC is a technique that allows encoding a video stream once in multiple layers. The layers in SVC are akin to the layers in an onion – they can be “pealed off” while maintaining the video, reducing its quality with the reduction of each layer.
This mechanism is useful when:
- Wanting to employ FEC“) to video, as we can protect the stream by adding redundant data to some of the layers only
- Wanting to send the video data to multiple participants, where we can send different layers to different participants based on their capabilities. This assists when using routing mechanisms for conferencing
SVC is added as part of the VP9 codec implementation in WebRTC.
Make sure you also read about simulcast.
What Happens in Simulcast?
Before diving into SVC, let’s first understand what happens in a simulcast setup. In a multi-party call, the SFU (Selective Forwarding Unit) decides which individual stream to forward to each participant. The sender sends multiple media streams, and the SFU then chooses which of these streams to forward to the participants.
SVC: Simulcast on Steroids
SVC takes the concept of simulcast to the next level. Unlike simulcast, where the sender sends multiple streams, SVC involves sending a single media stream with multiple layers. The SFU can then decide which layers to forward, almost like peeling an onion, when sending that data to other participants.
Bitrate Considerations
One of the advantages of using SVC is the efficient use of bitrate. While it does slightly increase the bitrate usage of a single stream, it consumes less bitrate overall compared to simulcast. This is because SVC involves encoding once and reusing previous layers for encoding higher layers of data.
Modalities Within SVC
So, what can you layer within SVC? There are three main modalities:
Temporal Scalability
Temporal scalability increases the frame rate with each layer. Each layer depends on the previous one, so an SFU can drop the upper layers to reduce the stream from 30fps to 15fps without re-encoding. This allows smoother video playback under varying network conditions.
Signal-to-Noise Ratio (SNR)
Here, more bits are invested in enhancing the quality of each frame. This results in a clearer and more detailed video stream.
Spatial Scalability
In spatial scalability, each layer can increase the resolution of the same frame from the previous layers. This is particularly useful for adapting to different screen sizes and resolutions.
SVC vs simulcast: which should you use?
| SVC | Simulcast | |
|---|---|---|
| How it works | One encoded stream, multiple layers | Multiple separate encoded streams |
| SFU can drop layers without decoder | Yes (VP9/AV1 only) | Yes |
| Bitrate overhead | Lower (encode once) | Higher (encode N times) |
| Codec support | VP9, AV1 (not VP8 or H.264) | All codecs (codec independent) |
| Sender CPU | Lower | Higher |
| SFU complexity | Higher (layer handling) | Lower |
| Best for | Large calls, mixed-capability audiences | Broad codec compatibility, H.264 environments |
Rule of thumb: Use SVC when all your endpoints support VP9 or AV1 and you need to scale to large group calls efficiently. Use simulcast when codec compatibility matters more than bandwidth efficiency. If you are just stating out – use simulcast to get something working. Once simulcast is fully optimized, it is time to try out adopting SVC.
Enabling SVC in WebRTC (VP9)
VP9 SVC is negotiated in the SDP via the scalabilityMode parameter in the RTCRtpSender encoding. To request temporal scalability with 3 layers:
const sender = pc.getSenders().find(s => s.track.kind === 'video');
const params = sender.getParameters();
params.encodings[0].scalabilityMode = 'L1T3'; // 1 spatial, 3 temporal layers
await sender.setParameters(params);
L1T3 = 1 spatial layer, 3 temporal layers (the most common VP9 SVC configuration in WebRTC). L3T3 adds 3 spatial layers but requires explicit SFU support.
Check RTCRtpSender.getCapabilities('video') to confirm VP9 SVC is supported before negotiating it.
For the full scalabilityMode string set (the LxTy notation), the codec support matrix, and decision guidance on which mode to pick for which use case, see SVC scalability mode set.


