WebRTC comes with mandatory encryption, which enables privacy, but which type of privacy are you really looking for?
In the past, all the great stuff started in the enterprise and then trickled down to consumers. Now it is the other way around – first features come to consumers and from there find their way to enterprises.
Privacy is no different, but in enterprises it needs to be defined quite differently, making it a totally different kind of feature.
This is where privacy vs privacy comes into play.
Privacy: The consumer version
As a user, what do you mean when you say privacy?
That the data you generate is yours. Be it sensor related data (think GPS or heart rate). The conversations you have with people are not accessible to anyone else. The same for the photos you take.
Practically, you want no one other than you and those you explicitly share data with to have any access to that data. And that includes the services you use to generate and share that data.
Sending messages over WhatsApp or any other social media service? You probably want these messages to be encrypted in transit, so no one can sniff the network and read your messages. You also don’t want WhatsApp’s employees reading what you wrote.
Essentially, what you are looking for is E2EE – End-to-End Encryption. This means that any intermediary along the route of your communications, including the communication provider that facilitates the session, won’t have the ability to read the content, simply because it is encrypted using an encryption key that is known only to those on the session.
The enterprise version of privacy
Life for a consumer is simple. At least when compared to an enterprise.
In the enterprise you want this privacy thingy, but somehow you also want governance and the creation of some corporate knowledge base.
When a meeting takes place, should only the people in the meeting have access? Think about it. Should the other people involved in that aspect of the business have access too?
Let’s say we’re on a sales call with a customer. And then the sales rep on that call leaves and gets replaced with another one. Should the new sales rep have access to that call that took place and the decisions made in it?
Today, our CRM systems can connect directly to the corporate email and siphon any emails sent or received with certain customers into their account for recording and safekeeping. So we stay in sync with all conversations with that customer.
We may need to store certain conversations due to regulatory reasons. Or we might just want to transcribe them for later search – that internal company knowledge base repository.
There are also times when we’d like to use these conversations we’re having to improve performance. Similar to what Gong does for sales teams.
We don’t want others to have access to these meetings. In some cases, we don’t want the provider of the service to have even the theoretical ability to access these conversations – think of a Microsoft Teams session, Google Meet or a Zoom call that gets listened to by the employees of these companies.
Privacy in an enterprise looks different than for consumers. It is more granular and more structured, with different rules and permissions at different levels and layers.
WebRTC and privacy
Privacy is king in WebRTC, with a few caveats:
- Only if you let it
- Assuming you don’t screw it up
- When it is of interest to you
Why these caveats?
- Because WebRTC is just a building block – the actual solution is of your making. Which means you can screw it up by architecting or implementing it wrong
- It also means that you want to have privacy as part of your service
And why is privacy king in WebRTC? Because security is ingrained in WebRTC, which means you can use it to provide privacy conscious services.
Let’s go over what privacy in WebRTC actually means:
WebRTC mandatory encryption (and security)
In WebRTC, all media is encrypted (SRTP, with keys negotiated over DTLS). You can’t decide to send media “in the clear”. The signaling itself is also encouraged to be encrypted, and for all intents and purposes – it is encrypted as well.
This means that if you send audio or video via WebRTC from one user to another or from one user to a media server – then that media is encrypted and can be played only by the recipient.
Someone looking at the bitstream “over the line” won’t be able to play it back or tamper with the content.
Note that a media server terminates the encrypted session and is privy to what is being sent – it has access to the encryption keys. TURN servers don’t have such access; they only relay the encrypted packets.
This mechanism of encryption isn’t optional – it is just there.
E2EE in WebRTC
If we increase the scope to group conversations, which usually go through a media server that holds the encryption keys, then we need E2EE – End-to-End Encryption.
This can be achieved on top of WebRTC using a mechanism known as insertable streams, which ends up as double encryption: one layer between the sender and the media server, and one between the sender and the receivers on the other end. That second layer of encryption is part of the application. WebRTC doesn’t mandate it or even encourage it – it just enables you to implement it.
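To make the double encryption idea concrete, here is a minimal sketch that runs in Node.js rather than in a browser. It is an illustration under stated assumptions, not how any specific service implements E2EE: the helper names `sealFrame` and `unsealFrame` are hypothetical, AES-256-GCM stands in for both layers (in real WebRTC the hop-by-hop layer is SRTP applied by the browser), and key exchange between participants is left out entirely.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_LEN = 12;  // AES-GCM nonce length in bytes
const TAG_LEN = 16; // AES-GCM authentication tag length in bytes

/** Hypothetical helper: encrypts data and returns iv || tag || ciphertext. */
function sealFrame(key: Buffer, plaintext: Buffer): Buffer {
  const iv = randomBytes(IV_LEN);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

/** Hypothetical helper: reverses sealFrame(); throws on a wrong key or tampered data. */
function unsealFrame(key: Buffer, sealed: Buffer): Buffer {
  const iv = sealed.subarray(0, IV_LEN);
  const tag = sealed.subarray(IV_LEN, IV_LEN + TAG_LEN);
  const body = sealed.subarray(IV_LEN + TAG_LEN);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]);
}

// Assumption: the participants already share e2eeKey out of band.
const e2eeKey = randomBytes(32); // known only to the people on the call
const hopKey = randomBytes(32);  // stand-in for the keys shared with the media server

const frame = Buffer.from("encoded video frame");

// Sender: application-level E2EE layer first, then the hop-by-hop layer.
const onTheWire = sealFrame(hopKey, sealFrame(e2eeKey, frame));

// Media server: can strip the hop layer, but is left holding E2EE ciphertext.
const atServer = unsealFrame(hopKey, onTheWire);

// Receiver: strips the E2EE layer (the server-to-receiver hop is omitted for brevity).
const received = unsealFrame(e2eeKey, atServer);
```

In a browser, the inner layer would typically be applied inside an encoded-frame transform, and real implementations tend to leave a few header bytes of each frame unencrypted so the media server can still route the media.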
Deniability vs governance of communications in WebRTC
Here’s where things can get tricky with WebRTC – it can be used to cater for both ends of the equation.
You can use WebRTC to obtain deniability.
WebRTC has a data channel that runs peer to peer. Using signaling servers to open such connections and form a loose mesh network of peers means you can send private, encrypted messages from one user to another on that network, with no easy way to trace the communications, let alone their metadata. That’s the extreme end of what can be achieved with WebRTC – a Tor/BitTorrent-like network.
With the same methodology, I can get two users or even small groups to communicate directly, so that their media travels between them and them alone. Or I can employ E2EE on media servers and get privacy of the content of the communications from the infrastructure used to facilitate it.
You can use WebRTC to handle governance.
On the other side of the equation, you can use WebRTC and force all communications to go through media servers. Media servers which can then enforce policy, record media and provide governance. For some industries and verticals – that’s a mandatory requirement.
And you get these capabilities while keeping the communication encrypted over the internet.
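The choice between the two ends of that equation can be thought of as a per-session policy. The sketch below is purely illustrative: `PrivacyMode`, `SessionPlan` and `planSession` are hypothetical names that don’t come from WebRTC or any vendor SDK, and the group size threshold of 4 is an arbitrary assumption.

```typescript
// Hypothetical policy model; nothing here is a WebRTC or vendor API.
type PrivacyMode = "deniable" | "private" | "governed";

interface SessionPlan {
  topology: "p2p" | "media-server";
  e2ee: boolean;      // second, application-level encryption layer
  recording: boolean; // needs a server that can read the media
}

// Assumption: beyond this many participants a peer-to-peer mesh gets impractical.
const MAX_MESH_SIZE = 4;

function planSession(mode: PrivacyMode, participants: number): SessionPlan {
  switch (mode) {
    case "deniable":
      // Media and data stay between the peers; there is nothing to record.
      return { topology: "p2p", e2ee: false, recording: false };
    case "private":
      // Small groups talk directly; larger ones use a media server that
      // forwards E2EE media it cannot read.
      return participants <= MAX_MESH_SIZE
        ? { topology: "p2p", e2ee: false, recording: false }
        : { topology: "media-server", e2ee: true, recording: false };
    case "governed":
      // Everything goes through a media server that can enforce policy and record.
      return { topology: "media-server", e2ee: false, recording: true };
  }
}

// Usage: a regulated sales call versus a large private meeting.
const salesCall = planSession("governed", 3);
const boardMeeting = planSession("private", 10);
console.log(salesCall, boardMeeting);
```

In the peer to peer cases no extra E2EE layer is needed, since WebRTC’s mandatory encryption already runs directly between the participants.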
With privacy that’s the biggest question. Who cares?
No one and everyone at the same time.
If you ask a person if he wants privacy the immediate answer is – yes!
And yet… Twitter still doesn’t offer E2EE on DMs. And people use it.
WhatsApp added E2EE in 2016, when it already had a billion monthly active users. It added E2EE backups in 2021. It seems people wanted it, but not badly enough to switch to a more secure and private messaging system.
Here’s a screenshot from my own WhatsApp in one of the groups I have:
That weird message is an indication that a friend of mine has changed his security code. This usually means he re-installed WhatsApp or switched phones, I presume. I ignore these messages altogether, and I assume most people do the same.
In the same way, companies strive for privacy and want the services they use to be private. But most of them want it only up to a point.
Does that mean privacy isn’t needed? No.
Does it mean we shouldn’t strive for privacy? No.
It just means that people value other things just as much or even more.
CPaaS, Video API and… privacy
When it comes to video APIs and CPaaS platforms, it feels that privacy is somewhat lagging behind.
Messaging platforms today mostly offer E2EE. UCaaS vendors have been introducing E2EE to their chat services and video calls. Some are offering integration with third party KMS (Key Management Systems) so they don’t have access to the decryption keys to begin with.
CCaaS relies heavily on the telephony network, where, well, what privacy exactly? And they also like to record calls for “quality and training purposes” – which translates to using machine learning and providing governance.
Video CPaaS is somewhere in-between these days – it offers encryption on sessions because it uses WebRTC, which is encrypted by default. But anything going through the media server can usually be accessed by the Video APIs vendor itself. Very few have gone ahead and added E2EE capabilities as part of their solution.
The reasons for that? It is hard to offer E2EE, and even harder to offer it in a generic manner that fits multiple use cases. On top of that, customers don’t necessarily care about it or aren’t willing to pay for it, while they are willing to pay for features such as recording.
Here’s the thing:
Everybody talks about privacy but nobody does anything about it
In the consumer space, we are moving to an E2EE world.
The enterprise space is moving towards that same goal at a glacial pace.
Parallel to that though, machine learning and cloud media processing are shifting the balance back towards less privacy – at least less privacy from the vendor hosting the service.
Which is more important to the buyers of services? Privacy or governance? Deniability or machine learning?