Everything you need to know about WebRTC security 🔒

By Tsahi Levent-Levi

April 6, 2020  

Video calling and WebRTC are becoming popular and taking center stage in our lives. Let’s see how WebRTC takes care of security (and privacy).

TL;DR

WebRTC is the most secure voice and video calling technology available today on the market. This is not going to change for years to come. To enjoy that level of security in your application, you will need to put in some work of your own. Rest assured that the underlying technology of WebRTC is your best bet.

Need to know what your role is in securing a WebRTC application? Download this WebRTC security checklist.

In this article, I’d like to go over the reasoning behind why WebRTC is so secure, as well as tackle some pending security/privacy issues in WebRTC.

WebRTC security measures

WebRTC has security front and center, designed into it from the get go

Before we start going into details of the softer security benefits you gain out of WebRTC, it is important to understand how WebRTC is secure to begin with.

The diagram below shows the WebRTC protocol stack as taken from the great High Performance Browser Networking. I’ve taken the liberty of marking the relevant parts of the stack that are targeted at security:

Security in WebRTC’s protocols stack

Signaling

Let’s start with the left hand side. That’s the part that isn’t really WebRTC, but rather WebRTC’s signaling as conducted in the browser. WebRTC uses TLS sessions or QUIC for its signaling transport – both are encrypted by nature. Other, non-encrypted avenues for signaling don’t really exist in WebRTC. Theoretically they might work, but some browsers will either block them altogether or require the user to grant access to the camera and microphone on each interaction. Deploying anything serious to production in the browser without encrypted signaling is not a real alternative.

WebRTC “forces” you to encrypt your signaling. What is left out of scope of WebRTC are things like authentication, authorization and identity management. You are free to do as you please in that domain – just make sure to do *something* here – and not leave this wide open for pranksters or worse.

For native applications on mobile, desktop or embedded – you can do whatever you like. That said, the mindset must be the same – mandatory encrypted signaling.
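As a minimal sketch of that mindset, an application could refuse to open a signaling connection over an unencrypted scheme before doing anything else. The function name `isSecureSignalingUrl` and the example endpoints are assumptions made for illustration; they are not part of WebRTC, which leaves signaling entirely up to you.

```typescript
// Schemes that give the signaling connection transport encryption (TLS).
const SECURE_SIGNALING_SCHEMES = ["wss:", "https:"];

/**
 * Returns true when the signaling endpoint uses an encrypted transport.
 * Unparsable input and unencrypted schemes (ws:, http:) are rejected.
 */
function isSecureSignalingUrl(endpoint: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(endpoint);
  } catch {
    return false; // not a valid absolute URL
  }
  // URL.protocol is lower-cased and includes the trailing colon.
  return SECURE_SIGNALING_SCHEMES.indexOf(parsed.protocol) !== -1;
}

// Hypothetical endpoints, for illustration only.
console.log(isSecureSignalingUrl("wss://signal.example.com/room"));
console.log(isSecureSignalingUrl("ws://signal.example.com/room"));
```

A client, browser or native, would run a check like this before creating its WebSocket or HTTP connection and bail out when it returns false.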

Media

At the end of the day, WebRTC is a media stack, which is what’s on the right hand side.

WebRTC encrypts data sent through it. No way around it.

In the more technical sense, WebRTC sends real-time audio and video over SRTP (=Secure RTP). For the DTLS handshake that sets up the keys, the TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 cipher suite with the P-256 curve is the mandatory-to-implement scheme. As time goes by, you can expect the cipher suite to be updated in the specification, with the older, more attack-prone suites getting deprecated and removed.

The specification text is very clear about what is mandatory to implement in terms of WebRTC communication security as well as what is forbidden from being implemented altogether.

Data channels are sent over DTLS and are always encrypted – just like voice and video.

The initial key exchange for the SRTP session makes use of DTLS-SRTP, a mechanism that negotiates a shared secret key without letting eavesdroppers deduce that key easily. Just for good measure, the specification also forbids the use of SDES, which sends the keys in the clear inside the signaling channel on the assumption that this channel is secure. WebRTC takes a zero trust approach to the signaling channel that developers use in their applications.

The text of the specification is very clear about how serious security considerations are in WebRTC:

Implementations MUST support SRTP. Implementations MUST support DTLS and DTLS-SRTP for SRTP keying. Implementations MUST support SCTP over DTLS.

All media channels MUST be secured via SRTP and SRTCP.  Media traffic MUST NOT be sent over plain (unencrypted) RTP or RTCP; that is, implementations MUST NOT negotiate cipher suites with NULL encryption modes. DTLS-SRTP MUST be offered for every media channel.  WebRTC implementations MUST NOT offer SDP Security Descriptions or select it if offered. A SRTP MKI MUST NOT be used.

All data channels MUST be secured via DTLS.
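To make these requirements concrete, here is a rough sketch of a check you could run on a session description in tests or logs. The browser already enforces all of this on its own; the names `checkSdpSecurity` and `SdpSecurityReport` and the sample SDP fragments are hypothetical and only illustrate what the rules look like on the wire.

```typescript
interface SdpSecurityReport {
  offersSdes: boolean;        // an a=crypto line (SDP Security Descriptions) is present
  hasFingerprint: boolean;    // an a=fingerprint line is present, as DTLS-SRTP requires
  plainRtpSections: string[]; // media types offered over unencrypted RTP/AVP(F)
}

function checkSdpSecurity(sdp: string): SdpSecurityReport {
  const report: SdpSecurityReport = {
    offersSdes: false,
    hasFingerprint: false,
    plainRtpSections: [],
  };
  const lines = sdp.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line.indexOf("a=crypto:") === 0) report.offersSdes = true;
    if (line.indexOf("a=fingerprint:") === 0) report.hasFingerprint = true;
    // Media line layout: m=<media> <port> <proto> <formats...>
    const media = /^m=(\S+) \S+ (\S+)/.exec(line);
    if (media && (media[2] === "RTP/AVP" || media[2] === "RTP/AVPF")) {
      report.plainRtpSections.push(media[1]);
    }
  }
  return report;
}

// Shortened, made-up fragments; real session descriptions carry many more lines.
const webrtcLike = [
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=fingerprint:sha-256 AA:BB:CC",
  "m=application 9 UDP/DTLS/SCTP webrtc-datachannel",
].join("\r\n");

const legacyLike = [
  "m=audio 49170 RTP/AVP 0",
  "m=video 51372 RTP/SAVP 96",
  "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:PLACEHOLDER",
].join("\r\n");

console.log(checkSdpSecurity(webrtcLike));
console.log(checkSdpSecurity(legacyLike));
```

The first fragment passes all three checks, while the second one is flagged both for offering SDES and for carrying audio over plain RTP.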

DTLS-SRTP exchanges certificate fingerprints of self-signed and automatically generated certificates in the SDP which are then matched against the DTLS handshake when the connection gets established. The keys for media encryption are derived during the handshake and are only known to the two parties involved. This only leaves an opening for an active man-in-the-middle attack by the signaling server.
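The comparison can be pictured with a small sketch. The helper names below are made up for this example, and the fingerprint values are shortened placeholders rather than real hashes. In a real call the browser does this matching for you; code like this is only of interest if you want to display or verify fingerprints yourself, for instance to notice a signaling server that swapped them.

```typescript
interface DtlsFingerprint {
  algorithm: string; // hash function, for example "sha-256"
  value: string;     // colon-separated hex, normalized to upper case
}

/** Collects every a=fingerprint attribute found in a session description. */
function extractFingerprints(sdp: string): DtlsFingerprint[] {
  const found: DtlsFingerprint[] = [];
  const lines = sdp.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const match = /^a=fingerprint:(\S+) ([0-9A-Fa-f:]+)\s*$/.exec(lines[i]);
    if (match) {
      found.push({
        algorithm: match[1].toLowerCase(),
        value: match[2].toUpperCase(),
      });
    }
  }
  return found;
}

/**
 * Returns true when the hash of the certificate seen in the DTLS handshake
 * equals one of the fingerprints that arrived over signaling.
 */
function matchesSignaledFingerprint(
  sdp: string,
  algorithm: string,
  certificateHash: string
): boolean {
  const wantedAlgorithm = algorithm.toLowerCase();
  const wantedValue = certificateHash.toUpperCase();
  const signaled = extractFingerprints(sdp);
  for (let i = 0; i < signaled.length; i++) {
    if (signaled[i].algorithm === wantedAlgorithm && signaled[i].value === wantedValue) {
      return true;
    }
  }
  return false;
}
```

If the certificate presented in the handshake hashes to anything other than what was signaled, the connection should not be trusted.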

If you are using a media server, then in all likelihood, that media server terminates the encryption, making it a trusted entity on the media path.

Camera and Microphone access

To access the microphone and camera, an application ends up asking permission from the user.

That permission isn’t automatic and cannot be made automatic (unless you compile the code on your own and build your own native application).

WebRTC permissions popup in Chrome

Some browsers will remember the user’s decision from one session to another while others will be stricter, asking the user to grant access in each and every use.

The popup window that asks for permission is also not editable or configurable by the application. The intent here is to prevent nefarious developers and websites from tricking users by editing these dialog boxes.
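Since the decision is in the user’s hands, an application has to cope with a refusal. The sketch below maps the `name` of the error that `navigator.mediaDevices.getUserMedia()` rejects with to a coarse outcome the UI can react to. The outcome labels and the function name are invented for this example, and only a few of the possible error names are covered.

```typescript
type MediaAccessOutcome = "denied" | "no-device" | "device-unavailable" | "unknown";

/** Maps a getUserMedia() rejection name to something the UI can act on. */
function classifyMediaError(errorName: string): MediaAccessOutcome {
  switch (errorName) {
    case "NotAllowedError":
      return "denied"; // the user or a browser policy refused access
    case "NotFoundError":
      return "no-device"; // no camera or microphone matches the request
    case "NotReadableError":
      return "device-unavailable"; // hardware or OS problem, often a device already in use
    default:
      return "unknown";
  }
}

console.log(classifyMediaError("NotAllowedError"));
```

In a browser you would call it from the rejection handler of `getUserMedia({ audio: true, video: true })`, passing the error’s `name`, and show a matching message instead of prompting again in a loop.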

Screen sharing

All browsers will ask the user for permission before sharing the screen.

All browsers will have their own specialized dialog box for doing that.

All browsers will ask again and won’t remember the user’s choice about sharing their screen from one execution to the next.

All browsers will have their own floating box for stopping screen sharing which cannot be overridden by the application developers.

Local IP and privacy

For years WebRTC was “accused” of having a serious privacy leak (different than security, but sometimes wrapped up with security): WebRTC exposes the local IP address of the browser over JavaScript to get it sent over the signaling channel. This is seen as a bad thing for the privacy oriented.

mDNS is now being added to WebRTC to solve that issue.

To understand more about this, read this article about mDNS and .local ICE candidates.
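To illustrate the difference, the sketch below looks at the address field of an ICE candidate line and tells whether it exposes a private IPv4 address or an mDNS `.local` name. The function names and sample candidate strings are made up for this example; it only handles IPv4 and is not a full candidate parser.

```typescript
type CandidateAddressKind = "mdns" | "private-ipv4" | "other";

/** True for addresses in 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16. */
function isPrivateIpv4(address: string): boolean {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(address)) return false;
  return /^(10\.|192\.168\.|172\.(1[6-9]|2\d|3[01])\.)/.test(address);
}

/**
 * Classifies the connection address of an ICE candidate line.
 * Returns null when the line does not look like a candidate at all.
 */
function classifyCandidateAddress(candidate: string): CandidateAddressKind | null {
  // Layout: candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
  const fields = candidate.trim().replace(/^a=/, "").split(/\s+/);
  const wellFormed =
    fields.length >= 8 &&
    fields[0].indexOf("candidate:") === 0 &&
    fields[6] === "typ";
  if (!wellFormed) return null;

  const address = fields[4];
  if (/\.local$/i.test(address)) return "mdns"; // local IP hidden behind an mDNS name
  return isPrivateIpv4(address) ? "private-ipv4" : "other";
}

// Made-up candidate lines for illustration.
console.log(classifyCandidateAddress(
  "candidate:1 1 udp 2113937151 1f4712db-ea17-4bcf-a596-105139dfd8bf.local 54596 typ host"
));
console.log(classifyCandidateAddress(
  "candidate:842163049 1 udp 1677729535 192.168.1.5 46154 typ host"
));
```

With mDNS in place, host candidates look like the first sample, so the page’s JavaScript and the signaling server see a random name instead of the local address.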

Open sourcing and standardization & WebRTC security

An important aspect about WebRTC has been mentioned in the previous section and that’s the fact that WebRTC is both open source and standardized.

If you’ve been following me for a couple of years, then you might know what I usually think about standardization:

Standardization processes can be… boring

I’ve had my share of standardization work in the past, before the days of WebRTC, so I know how this works. It is a long and arduous process, especially if you compare it to the alternative of going it alone with your own solution.

It is also why we’re now in 2020 and we still don’t have a WebRTC standard specification and just have a draft that is soon to be closed (for the last 5 years or so).

The thing is that when it comes to security (and privacy), I prefer the design by committee and openness of standardization and open source over the proprietary path.

Why? Because the people who are making these decisions know a thing or two (or three) about security. And they care about it greatly. They are less bogged down by business requirements than a single company is.

WebRTC is also meant to work inside web browsers. A browser is one of the most challenging environments when it comes to security, hacking and malware.

Not everything is always rosy with WebRTC, but there’s always forward progress. This leads me to the next aspect of WebRTC security.

WebRTC security by standing on the shoulders of browser giants

With WebRTC, you rely on browsers (a good thing)

Browsers are secure by design. They are today’s window to the internet and the world. As such, they are working hard continuously to improve their security – with and without relation to WebRTC.

Here are a few things you might want to consider when thinking about WebRTC browser security:

  • Browsers update frequently and automatically. This makes WebRTC the most secure VoIP solution just by staying up to date. It also has the habit of forcing vendors to keep up to date with browsers, taking care of smoother and more frequent updates of their own
  • Relying on browsers for WebRTC means you put your faith in browser security measures and best practices, and these are far stricter than VoIP alternatives. This makes WebRTC safe
  • An example of what browser vendors are doing can be found in the WebRTC fuzzing effort taken last year, where messages were fuzzed to see if any security breaches can be found (they were found and fixed)

Think of it this way. Browsers target a billion users. At that scale, they are handling and managing a lot more security threats than you will in your service. WebRTC makes use of that scale when it comes to the security mechanisms implemented in it and the scrutiny they enjoy.

Proprietary security is a mess (in video conferencing)

If the doubts that you have about WebRTC security stem from the desire to pick a proprietary solution instead, then know that within the video conferencing space, security is a mess.

Why?

  • Fewer eyeballs are looking at the security of a proprietary solution. Fewer people devise the security algorithms put in place. Fewer people implement them. Fewer people code review them. Fewer whitehat security experts try to find holes in them
  • Proprietary solutions come from proprietary vendors, most of which have a business plan and roadmap. More often than not, investing in security is of second priority to them, especially after raising money, attracting users and getting customers to pay
  • Proprietary solutions not using WebRTC need to figure out how to run in browser environments. Doing that means using hacky techniques, which in turn lead to security breaches. Just ask Zoom about their vulnerabilities related to NOT using WebRTC
  • Some 20+ years ago, when I took a semester in the university to learn software security, the best advice the lecturer gave was never to devise or implement your own security algorithm. Not the low level ones and not the higher level abstractions of security. Why? Because it is really easy to kill a great security mechanism by having a small bug left in there – either a coding bug or a logical bug. Proprietary security = disaster waiting to happen

If you look at the older standardized VoIP solutions, such as SIP and H.323 – they offer encryption and security – but these are optional. On most networks they aren’t even enabled since they reduce capacity of solutions or increase CPU load. In the past, there were a lot of questions on discuss-webrtc about running WebRTC media in the clear – somehow developers thought that’s a good idea.

We need to think about modern networks as zero trust networks. Have each component handle its own security and not rely on external forces to offer security. WebRTC takes that approach exactly.

Your role in WebRTC security

WebRTC is secure. Your application isn’t.

Put differently:

Your application is as secure as you’ll make it.

Make sure it is secure:

  1. Rely on the security WebRTC brings with it, and add on top of it in the context and requirements of your application (a lot of it will reside in the authentication, authorization and identity elements)
  2. Take steps to handle WebRTC security in your application. There are areas where WebRTC will need your help to keep secure
  3. Don’t make beginner mistakes in WebRTC security. They will come to bite you later down the road
  4. Make sure to download and review my WebRTC security checklist (below)

Need to know what your role is in securing a WebRTC application? Download this WebRTC security checklist.

FAQ about WebRTC security

✅ Is WebRTC secure?

Yes it is.

WebRTC is probably the most secure VoIP protocol out there. This stems from the fact that WebRTC was designed with security in mind, as it needs to operate from inside a web browser in an environment that can never be considered as secure to begin with.

It doesn’t mean that it is faultless – just that when security issues are found, they are usually addressed quickly and they get disseminated to users faster due to the automatic update mechanisms of modern web browsers.

✅ WebRTC IP leak. What is it about?

WebRTC requires local IP addresses to work properly, so it collects these addresses and shares them during the negotiation process.
As this was found and considered to be a privacy issue, a solution is being rolled out in the form of mDNS.

✅ Can I send media unencrypted in WebRTC?

No you can’t. By design, WebRTC encrypts all media sent between users. There is no way (barring changing the implementation itself) to send media in the clear in WebRTC.

✅ If I am using WebRTC, does it mean my service is secure?

No. While WebRTC is secure by design, it is just a component embedded in the service you are using. As such, the security it offers depends on the security of the service you are using.

✅ Does WebRTC offer end-to-end encryption?

WebRTC offers end-to-end encryption between terminating entities. If your service runs peer-to-peer (with or without TURN relays) then it is encrypted end-to-end. If you are using media servers along the route (SFU or an MCU) then in all likelihood that server has access to the unencrypted media.

✅ Is the security of WebRTC better than that of Zoom?

Yes, but this isn’t an apples to apples comparison.
WebRTC is a specification. It offers a high level of security. An application written on top of it can make it insecure.
Zoom is a proprietary solution where security isn’t the main objective. We simply don’t know what threats are lurking in the proprietary implementation of the proprietary protocols used by Zoom.


Leave a Reply

  1. This is a perfect explanation of the security features for WebRTC. It is greatly appreciated as many people think that it is not safe without true end-to-end for multiparty but until that happens this is as safe as it gets! Thanks for the great explanation.

  2. (Not sure why, but my comments keep getting blocked as spam.)

    1. How would the quality be for audio only call? Mesh of 5 people.

    2. Would SFU make it noticeably better? If so, is it primarily due to decreasing 3 Opus encoding or decreasing 3 uplinks for the client?

    Let’s assume decent CPU and stable network.

    1. The quality would depend on the CPU and network, so assuming decent ones, it should be fine.

      An SFU would be noticeably better if… the network or CPU requirements cannot be easily met. Think congestion or just packet loss for one participant as an example.
