How does WebRTC connect people?

25/03/2019

WebRTC doesn’t really connect people, but the way you think about signaling is important to your WebRTC application.

Here’s a comment left on one of my recent articles:

WebRTC is… still just a little confusing…Tsahi, i’m reading the book recommended by Loreto & Romano but the examples are outdated. With regards to the SDP signal – if peer A is on a webRTC application, but peer B is surfing youtube – How does peer B get notified of an offer? It would have to go to peer B’s email address right? — because there is no way of knowing peer B’s IP address. Please help.

A few quick things before I dig deeper into this WebRTC connectivity thing:

  • Yep. WebRTC is a little confusing. Maybe even a lot. It doesn’t behave like any other browser technology we have
  • The sad thing about books on WebRTC is that they haven’t aged well. WebRTC still changes too fast
  • There’s some confusion here in wording – peers, offer, etc.

How well do you know WebRTC? Check it out in my online WebRTC quiz.

Connecting, Signaling and WebRTC

I’ll use a somewhat clumsy comparison here to try to explain this.

Let’s say you are the proud owner of a Pilates studio. You’re the instructor there (#truestory – at least for my wife).

My wife gives Pilates lessons at different hours of the day. These are private lessons so it is rather flexible on both sides. But let me ask you this – how do people know when to come for a lesson?

This being Israel, they usually communicate with my wife via Whatsapp to decide together on the date and time. Usually, people stick to the day of week and time and start communicating only if they can’t make it, want to reschedule or just make sure the lesson is still taking place.

Back to WebRTC.

WebRTC is that Pilates studio. It does one thing – enables live media to flow from one browser to another. Sometimes also non-browsers, but let’s stick to the basics here.

How do the people who need to share or receive that live media connect to each other? That’s not what WebRTC does – it happens somewhere else. And that somewhere is the signaling mechanism that you pick for your own application. I am calling it a mechanism and not a protocol, since it is going to be a tad more confusing in a second.

Or not.

Now let’s go back to WebRTC, signaling and connecting people and look at it from a point of view of different scenarios.

Scheduled Meeting

We’ll start with a scheduled meeting. At any given point in time, I have a few of those coming up. Meetings with clients, partners and potential clients. Here’s one such calendar invitation:

This one happens to take place using Google Meet. Who’s calling whom? No one really. I’ll just click that link in the invite when the time comes and magically find myself in the same conference with the other participants.

In most scheduled conferences, you just join a WebRTC link

Where do you get that link to use?

  • Inside the calendar invite
  • In an email that was sent
  • Through an SMS reminder

Some of these services allow inviting people from inside the meeting. That invitation reaches them as a link sent via email or SMS, or as a plain phone call to their number (without WebRTC).

Ad-hoc “upgrade” of text chat to video conference

There are ad-hoc calls. These usually start from a chat message.

Oftentimes, I’d rather text chat than do a voice or a video call. It has to do with the speed and asynchronous nature of text. Which means that I’ll be chatting with someone over whatever instant messaging service we select, and at some point, I might want to switch mediums – move from text to something a bit more synchronous like video:

Like this example with Philipp – most of our conversations start in Hangouts (that’s where he is most reachable to me) and when needed, we’ll just jump on a call, without planning it first.

Who is calling whom here? Does it matter?

What happens here is that both of us are already “inside” the communications app, so we both have a direct link to the service. Passing that information from one side to the other is a no-brainer at this point.

So how will that get signaled? However you see fit. Probably on top of a WebSocket or over HTTPS.
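To make “however you see fit” a bit more concrete, here is a minimal sketch of what the service side of such signaling might look like. Everything in it is an assumption of mine for illustration: the `SignalMessage` fields, the `SignalingHub` class and the idea of reducing a connection to a `Send` function are hypothetical and not something WebRTC defines.

```typescript
// Hypothetical message envelope; WebRTC does not define any of these fields.
type SignalMessage = {
  kind: "call-request" | "call-accept" | "call-decline";
  from: string;
  to: string;
};

// A connection is reduced to "something I can send text to".
// In a real app this could wrap a WebSocket or an HTTPS response.
type Send = (data: string) => void;

class SignalingHub {
  private connections = new Map<string, Send>();

  // Called when a user opens the app and connects to the service.
  connect(userId: string, send: Send): void {
    this.connections.set(userId, send);
  }

  // Called when the user's connection goes away.
  disconnect(userId: string): void {
    this.connections.delete(userId);
  }

  // Forwards a message to its recipient.
  // Returns false when the recipient has no live connection.
  route(message: SignalMessage): boolean {
    const send = this.connections.get(message.to);
    if (!send) return false;
    send(JSON.stringify(message));
    return true;
  }
}
```

Since both sides are already connected to the service, “calling” is nothing more than `hub.route(...)` with a message addressed to the other user. Nobody needs to know anybody’s IP address.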

I am calling you on the “phone”

What if there’s nothing pre-planned, so it isn’t a scheduled meeting, and we haven’t really been on a text chat to warm things up towards a call? How do you reach me now?

How do you “dial”?

Puneet is one of our support/testing engineers at testRTC. While he will usually text me over Slack to start a call, he might just try calling directly from time to time.

What happens then?

I am not in front of my laptop with the Slack app opened. My phone is on standby mode. How does it start ringing on me? What does WebRTC do to get my attention?

Nothing.

The phone starts ringing because it received a mobile push notification. I’ve got the Slack app installed, so it can receive push notifications. Slack invoked a push notification to wake up the app and make it “ring” for me.

The same can be done with web notifications. And there are probably other means to do similar things in IoT devices. The thing is – this is out of scope for WebRTC, but something that is doable with the signaling technologies available to you.
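Here is a rough sketch of the decision a service could make when someone “dials” me. It is my own illustration and not how Slack or any other product actually works: the `Callee` shape, the `ring` function and the `sendPush` callback are all hypothetical names, and the real push delivery (through the mobile or web push service) is left out on purpose.

```typescript
// Hypothetical types; none of this is part of WebRTC.
type IncomingCall = { callId: string; callerName: string };
type Delivery = "live-connection" | "push" | "unreachable";

interface Callee {
  userId: string;
  // Present when the app is open and connected to the service.
  liveSend?: (data: string) => void;
  // Present when the installed app (or browser) registered for push.
  pushToken?: string;
}

// Decides how to make the callee's device "ring".
// sendPush stands in for whatever push service the app uses.
function ring(
  callee: Callee,
  call: IncomingCall,
  sendPush: (token: string, body: string) => void
): Delivery {
  const body = JSON.stringify({ type: "incoming-call", ...call });
  if (callee.liveSend) {
    callee.liveSend(body);
    return "live-connection";
  }
  if (callee.pushToken) {
    sendPush(callee.pushToken, body);
    return "push";
  }
  return "unreachable";
}
```

Note that WebRTC doesn’t appear anywhere in this code. It only comes into play after the app wakes up and I answer.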

Contact center agent answering calls

When a contact center adopts WebRTC to be able to migrate its agents from using desktop phones or installed softphones towards WebRTC, calls will end up being received in the browser.

This happens by integrating callbars inside CRMs or just by having the CRM implement the contact center part of the equation as well.

What happens then? How do calls get dialed? (the above is a screenshot taken from Talkdesk’s support site)

They go through the PSTN towards a PBX. More often than not, that PBX will be based on Asterisk or FreeSWITCH, though other alternatives exist. PBXs are usually built around the SIP protocol, which leaves two alternatives for the signaling protocol that WebRTC will use in the browser:

  1. SIP over WebSocket. Practically the same thing that happens in SIP will happen in the browser
  2. Some proprietary protocol will be used, translated from SIP

In both cases, the contact center agent is registered in advance. The agent is also marked as “available” in most contact center software logic – this means that incoming calls waiting in the call center queue can be routed to that agent. So the agent is sitting and waiting for incoming calls. In some ways, this is similar to the upgrade from text chat scenario.
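The “registered in advance, marked as available, then routed to” logic can be sketched in a few lines. This is a toy model of my own and not how Asterisk, FreeSWITCH or any contact center product implements it: `ContactCenterQueue` and its methods are hypothetical, and real routing involves skills, priorities and timeouts that are ignored here.

```typescript
type AgentState = "offline" | "available" | "busy";
type Agent = { id: string; state: AgentState };
type Assignment = { callId: string; agentId: string };

class ContactCenterQueue {
  private agents: Agent[] = [];
  private waitingCalls: string[] = [];

  // Registration happens ahead of time, e.g. when the agent logs in.
  register(agentId: string): void {
    this.agents.push({ id: agentId, state: "offline" });
  }

  // Updates an agent's state; unknown agent ids are ignored.
  setState(agentId: string, state: AgentState): void {
    const agent = this.agents.find((a) => a.id === agentId);
    if (agent) agent.state = state;
  }

  // An incoming call (say, from the PSTN side) waits here.
  enqueue(callId: string): void {
    this.waitingCalls.push(callId);
  }

  // Hands the oldest waiting call to the first available agent.
  // Returns undefined when there is no waiting call or no free agent.
  routeNext(): Assignment | undefined {
    const agent = this.agents.find((a) => a.state === "available");
    if (!agent || this.waitingCalls.length === 0) return undefined;
    const callId = this.waitingCalls.shift() as string;
    agent.state = "busy";
    return { callId, agentId: agent.id };
  }
}
```

The assignment that comes out of `routeNext()` is what would then be signaled to the agent’s browser, over SIP or a proprietary protocol, before WebRTC gets involved.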

Connecting? WebRTC?

When it comes to actual users, WebRTC doesn’t get them “connected”. At least not from a signaling point of view.

What WebRTC does is negotiate the paths that the media will use throughout the session. That’s the “offer-answer” (or JSEP) messages that pass from one WebRTC entity to another. And even that isn’t sent by WebRTC itself – WebRTC creates the blob of data it wants to send and lets your application send it in any way you see fit.
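That split of responsibilities can be sketched like this. `createOffer()` and `setLocalDescription()` are real `RTCPeerConnection` methods in the browser; the rest is my own assumption for illustration, including the `OffererLike` interface (a cut-down stand-in so the sketch doesn’t depend on a browser), the `makeAndSendOffer` function and the `{ kind, description }` envelope.

```typescript
// Cut-down stand-ins for the browser types, so this runs anywhere.
interface SessionDescriptionLike {
  type: string;
  sdp?: string;
}

interface OffererLike {
  createOffer(): Promise<SessionDescriptionLike>;
  setLocalDescription(desc: SessionDescriptionLike): Promise<void>;
}

// WebRTC produces the offer; the application decides how to ship it.
// send can be a WebSocket, an HTTPS request or anything else.
async function makeAndSendOffer(
  pc: OffererLike,
  send: (data: string) => void
): Promise<void> {
  const offer = await pc.createOffer();   // WebRTC creates the blob
  await pc.setLocalDescription(offer);    // and is told to use it locally
  // Hypothetical envelope; the wire format is entirely up to the app.
  send(JSON.stringify({ kind: "offer", description: offer }));
}
```

In a browser, an `RTCPeerConnection` would play the role of `pc`, and `send` would be whatever signaling channel your application already has. The other side does the mirror image: it receives the blob, hands it to its own peer connection, creates an answer and sends that back the same way.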

Still confused? There’s a course for that – my online WebRTC training. The first module (out of eight modules) is free, so go learn about WebRTC.