WebRTC's Extremes. Aggregation or Embeddability? Federated or Siloed?

By Tsahi Levent-Levi

August 10, 2015

WebRTC is but a technology. Its adoption happens at the edges.

It is interesting to see what people do with WebRTC – which use cases they tackle and what kinds of solutions they come up with.

Here are a few opposite trends that are shaping up to be mainstream approaches to wielding WebRTC.

1. Aggregation

In many cases, WebRTC is used to aggregate. The most common example is expert marketplaces.

Popexpert and 24sessions are good examples of such aggregators. You open your own page on these services, state what services you offer and your asking price. People can search for you and schedule a video session with you. Also interesting in this space is LiveNinja, which recently shut down its aggregation service and shifted towards an embeddable alternative.

2. Embeddability

The opposite of aggregating everyone into a single domain is to enable embedding the service onto the expert’s own website.

The company will offer a piece of JavaScript code or a widget that can be placed on any website, providing the necessary functionality.
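
To make that concrete, here is a minimal sketch in TypeScript of how a vendor might generate such a snippet for an expert to paste into a page. It is illustrative only: the widgets.example.com origin, the embed.js path and the data-expert-id / data-label attribute names are all made up for the example and do not describe any specific vendor's product.

```typescript
// Hypothetical vendor-side helper that builds the one-line embed snippet.
// All names here (EmbedOptions, embed.js, data-* attributes) are assumptions.

interface EmbedOptions {
  vendorOrigin: string; // e.g. "https://widgets.example.com" (made-up origin)
  expertId: string;     // the expert's identifier on the vendor's service
  buttonLabel?: string; // optional text for the call button
}

// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the <script> tag the expert would paste into their own website.
function buildEmbedSnippet(opts: EmbedOptions): string {
  const label = opts.buttonLabel ?? "Talk to me";
  const src = `${opts.vendorOrigin}/embed.js`;
  return (
    `<script async src="${escapeAttr(src)}" ` +
    `data-expert-id="${escapeAttr(opts.expertId)}" ` +
    `data-label="${escapeAttr(label)}"></script>`
  );
}

const snippet = buildEmbedSnippet({
  vendorOrigin: "https://widgets.example.com",
  expertId: "jane-doe",
});
console.log(snippet);
```

The expert pastes the resulting one-liner into a page; everything else (signaling, media, UI) would be loaded by the vendor's script at runtime, which is what keeps the embed so small.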

Aggregation or Embeddability?

Which one would be preferred, and to whom?

The Vendor, in our case, has more power as an aggregator. He is in charge of all the interaction, offering the gateway into his domain. Succeeding here places him in a position of power, usually way above the people and companies he serves.

The Expert may enjoy an aggregator when he is unknown. Having an easy way to manage his online presence and be reachable is an advantage. For someone who is already known, or who has spent the time to build a brand for himself online, being aggregated on someone else’s site may dilute his value or position him too close to his competitors – not something you’d want.

The Customer, on one hand, can easily find his way through an aggregator. On the other hand, it places the expert or service he is reaching out to at a distance, one which may or may not be desired, depending on the specific industry and the level of trust in it.

Ben Thompson has written a good piece on aggregation theory, which I warmly recommend.

3. Silo

Most WebRTC services live in their own silo world. You envision a service, you build the use case with WebRTC, and that’s it. If someone needs to connect through your service, he must use your service; he can’t get connected from anywhere else. Unless you add gateways into the system, but that is done for specific needs and monetization.

I’ve talked about WebRTC islands two years ago. Here’s a presentation about it:

WebRTC Islands from Tsahi Levent-levi

WebRTC makes it too easy to build your own island, so many end up doing so. Others are hung up on the idea of federation:

4. Federation

Why not allow me to use whatever service I want to call you, and you use whatever service you prefer to receive that call?

Think calling from Skype to WeChat. Or ooVoo to Hangouts. What a wonderful world that would be.

Apparently, it doesn’t happen because the business need of these vendors isn’t there – they would rather be their own silos.

Who is federating then?

Some connect to the PSTN in order to “federate”, or to enjoy the network effect of the legacy phone system.
Those who already have a network (federated or not) end up using WebRTC as an access point. That’s what Polycom did recently with their RealPresence Web Suite.
Solutions such as Matrix look to offer a framework for federated signaling that suits WebRTC as well.
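
The difference between a silo and a federation can be sketched as a routing decision. The TypeScript below is an assumption-laden illustration, not how any of the services mentioned actually work: it uses a simplified @user:domain identifier loosely modelled on Matrix-style user IDs (ports in the domain are ignored), and the Route type, the routeCall function and the example domains are hypothetical.

```typescript
// Illustrative only: a toy router that decides whether a call stays inside
// one service, crosses a federation link, or cannot be placed at all.

interface Address {
  localpart: string;
  domain: string;
}

type Route =
  | { kind: "local" }
  | { kind: "federated"; remoteDomain: string }
  | { kind: "unreachable"; reason: string };

// Parse a simplified "@user:domain" identifier; returns null if malformed.
function parseAddress(id: string): Address | null {
  const match = /^@([^:@\s]+):([^:@\s]+)$/.exec(id);
  if (match === null) return null;
  return { localpart: match[1], domain: match[2].toLowerCase() };
}

// federatedWith lists the remote domains this service agrees to talk to.
// A silo is simply a service whose set is empty.
function routeCall(from: string, to: string, federatedWith: Set<string>): Route {
  const caller = parseAddress(from);
  const callee = parseAddress(to);
  if (caller === null || callee === null) {
    return { kind: "unreachable", reason: "malformed address" };
  }
  if (caller.domain === callee.domain) return { kind: "local" };
  if (federatedWith.has(callee.domain)) {
    return { kind: "federated", remoteDomain: callee.domain };
  }
  return { kind: "unreachable", reason: `no federation with ${callee.domain}` };
}

const silo = new Set<string>();
const federation = new Set<string>(["appear.example"]);
console.log(routeCall("@alice:talky.example", "@bob:appear.example", silo).kind);       // "unreachable"
console.log(routeCall("@alice:talky.example", "@bob:appear.example", federation).kind); // "federated"
```

The code is the easy part. Whether that set is ever populated is a business decision, which is why most services end up as silos.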

Why is this important?

At the end of the day, WebRTC is a building block. A piece of technology. Different people and companies end up doing different things with it.

Need to understand WebRTC and how to design and architect real world solutions with it? A first step is to understand the servers used to connect WebRTC.

Aswath Rao says:

August 10, 2015 at 4:19 pm

It is indeed surprising that embeddability is not adopted by all WebRTC applications. After all, it comes for free, given that a URL is the natural way to access any WebRTC service and URLs can be embedded in any number of places. There is no need to develop any special JavaScript code or a widget. I can see why “market creators” would not adopt embeddability, since they want to confine conversation within their platform. But not others. After all, why would they write off a major benefit of WebRTC?

Regarding your Silo/Islands and Federation, I am afraid I have to repeat comments that I have made before. Typically, in the context of communications, the terms “Silo” and “Island” are used in a negative light. I am not sure you mean to insinuate that connotation here. Personally I prefer the term “Moat”. Whereas Silo and Island suggest that communication is allowed only between people who are already in that Silo or Island, Moat suggests that outsiders may be permitted on a case-by-case basis. Silo and Island suggest rigidity, while Moat is flexible and dynamic.

The concept of federation has taken hold of the WebRTC world even though it is not needed at all. I am not sure why that is so, except for my suspicion that advocates of federation feel an aura of sophistication and enlightenment. Until now, federation between different communication services was critical because the clients were service-specific. But WebRTC has done away with all the reasons that require federation:
1. the clients are general-purpose browsers,
2. the signaling protocol is dynamically downloaded,
3. identity can be separated out by using authentication schemes like OpenID Connect or other single sign-on schemes.
So an app that is using WebRTC does not have to federate with a third-party app/service to let that service’s users contact its own users.

You, being an evangelist of WebRTC, are undercutting the main advantage of WebRTC by perpetuating the dichotomy. Every WebRTC app should embrace embeddability, and it is simple to realize the benefits of federation without relying on third parties.

1. Tsahi Levent-Levi says:
  
  August 10, 2015 at 5:47 pm
  
  Aswath, thanks for the comment and for tagging me as an evangelist of WebRTC 🙂
  
  The whole notion of this post is to reflect how people are using WebRTC and not to decide which is the preferred method. I have my own opinion on the subject, but I don’t think I made it clear in this post what that opinion is.
  
  I’d also say again that WebRTC is a technology. There is no right or wrong way of using it. How people end up using it is what is interesting.
  
2. Matthew Hodgson says:
  
  August 10, 2015 at 6:55 pm
  
  Aswath: the main reason to federate rather than embed is if you want to limit the number of client SDKs you trust and have to implement against.
  
  For simple use cases like plain group chat/voice/video, why (as a developer) should I have to dynamically download a different signalling stack for every different solution I want to talk to? Why should I trust the code of these different stacks? Why should I have to develop and integrate against all the different APIs they expose?
  
  An alternative is to use something like Orca.js or WONDER as a clientside SDK which both embeds and aggregates vendor-specific SDKs into your client. In the end it’s just a matter of subjective engineering taste if you’d rather aggregate in the client by embedding loads of different signalling stacks, or talk a standard signalling protocol and rely on bridges for interoperability with proprietary systems for basic communication.
  
  Obviously for more domain specific communication (e.g. collaborating on medical imagery or whatever) then embeddable SDKs are a huge advantage for WebRTC. But it doesn’t mean that they’re the one true solution for basic comms…
  
  1. Aswath Rao says:
    
    August 10, 2015 at 8:20 pm
    
    Matthew: I would like to confirm that your focus is exclusively on a native client app, since you mention only integrating client SDKs. Do you grant me that a browser-based use case does not have these issues? If so, can’t a native client act like a browser when it is initiating a session with an external application? I mean the client accesses the external server, dynamically downloads JS, etc., just like a browser would upon visiting a WebRTC call URL?
    
    1. Matthew Hodgson says:
      
      August 10, 2015 at 9:06 pm
      
      Aswath: no, I was talking primarily about web apps here. The SDKs I was mentioning were the JavaScript libraries you’d use to embed different vendors’ WebRTC features into your webapp.
      
      Assuming the problem is that you have user 1 in webapp X (e.g. Talky.io) who wants to talk to a user 2 in webapp Y (e.g. appear.in), your choices are:
      
      1) Give up and just force one of the users to create an account on the other service, and be forced into using it, even if it wasn’t their preferred option. The call details get stuck in the single service that’s used. Meanwhile, one of the apps loses a user for this call.
      
      2) Support embedding a call with a user on app Y into app X. For instance, you might be able to load a copy of app Y in an iframe in app X. This relies on Y providing an embeddable version though, and on X knowing how to embed it. It’s going to have a horrible user experience, as X has no control over the look and feel or behaviour of the embedded Y.
      
      3) Support using Y’s web SDK to call Y from X. This again relies on Y providing a web SDK, and X having integrated it. It could be a good user experience though.
      
      4) X uses a clientside aggregation SDK like Orca or WONDER to aggregate different SDKs on the client behind a single API, so when you want to call users on Y or Z you have the same API available. This relies on Y or Z having a client SDK and the aggregation SDK having integrated against it.
      
      5) X uses some kind of aggregation service to call users on Y or Z by using the aggregation service’s proprietary signalling API. This relies on Y or Z having an API that can be accessed through an aggregation service, and that service having integrated against them.
      
      6) X uses some kind of standard federated signalling protocol (e.g. SIP, XMPP, Matrix) to talk to users on Y or Z, either directly or through a bridge of some kind. This relies on Y or Z either speaking that protocol or being accessible via a bridge.
      
      As Tsahi says, there are many ways of doing this. Personally I think that options 6 and 4 are the cleanest. However, it’s absolutely true that not all use cases require federation – this is only solving the situation where you have a user on system X who wants to have a basic contextless conversation (chat, voice, video) with a user on system Y.
      
      1. Aswath Rao says:
        
        August 11, 2015 at 10:57 am
        
        Matthew:
        
        Thanks for the patient explanation. Just for your information, I think option 1 is feasible without the deficiencies you have identified. How far I have gone in eliminating them is a judgement call. I hope soon others will be able to evaluate it.
      2. Matthew Hodgson says:
        
        August 11, 2015 at 1:30 pm
        
        Aswath: (replying here as Tsahi’s blog won’t let me answer your post directly): I agree that option #1 is a unique benefit of WebRTC – that you can just click a URL to launch a new site to call someone. However, if users on site X have to launch site Y to call a user on site Y, they lose ownership of the call and lose control over the whole user experience, which is then inconsistent for the end user.
        
        If I’m Talky.io, I don’t want my users to be teleported into an entirely different brand and UX when they call someone on Appear.in or vice versa. I’m genuinely interested in ways of eliminating this problem, though – so look forward to seeing your solution! 🙂
Tsahi Levent-Levi says:

August 11, 2015 at 1:33 pm

Guys,

For the most part, generic calling from site X to site Y is dead.

If I am on Facebook, I interact with people on Facebook. I don’t want or care to interact with people on Twitter from Facebook – and if I do – Facebook most certainly doesn’t want me to.
Facebook is also big enough for me not to care about most people not on it in my circle.

As for site X calling site Y – If I am a dating site X, why the hell would I want someone from realtor site Y to be able to dial in to my users? Where’s the benefit?

Moving forward, comms is going to be broken down to meet the needs of specific use cases and services, and the general use case of just calling *someone* will fade away.

1. Matthew Hodgson says:
  
  August 11, 2015 at 1:45 pm
  
  Tsahi: I agree that there are scenarios where calling from site X to site Y makes zero sense. There is no reason for users on a dating site to be able to call users on a real estate site, as you say. However, the idea that domain/context-specific use cases are somehow going to unilaterally sweep away generic freestyle communication is ill-founded.
  
  Right now, *EMAIL* has become the standard federated signalling layer for setting up freestyle collaboration/communication between arbitrary groups of people. If I want to talk to someone at another company – e.g. an interview, or social discussion, or abstract brainstorming or whatever, right now my best bet is to *email* them and try to force them to use some platform I’ve picked (Hangouts, Skype, Slack, whatever). As a result, the person I’m calling loses control of their experience, and the conversation gets trapped and fragmented in whatever silo we end up in. This is a huge step back from the flexibility and freedom of email, let alone the PSTN.
  
  So: we need both. WebRTC is great as it makes it easy to jump into a rich domain-specific contextual discussion when needed. But it totally drops the ball on the simple case of contextless communication… which leaves us stuck either in silos, the PSTN or email. I’m afraid that not all communication is contextful 🙂
  
2. Aswath Rao says:
  
  August 12, 2015 at 2:42 am
  
  Generic calling from site X to site Y is not dead. Far from it. And some use cases can really benefit by using WebRTC. Let me elaborate using a specific use case.
  
  Consider the case of two companies that have deployed autonomous UC platforms and whose employees want to communicate with each other. The most common technique is to fall back to basic PSTN connectivity, but then they lose all the enhanced features of their UC platforms. If their vendors have enabled federation between their platforms, then at least theoretically they can federate among themselves. Granted that the vendors have solved the technical issues related to federation, there remain the administrative issues. The two companies have to agree to federate at a policy level, then the admins have to configure their systems accordingly and test it out. This is a big logistical nightmare. Indeed, Skype4B (previously known as MS Lync) makes federation procedurally simple. Still, not many are openly federating; the participation rate is very low. An often-repeated reason is promiscuous federation. What enterprises want is to define the scope of federation narrowly.
  
  This is where WebRTC comes to the rescue: it makes the decision unilateral and lets the enterprise define the scope of federation as narrowly as needed. Enterprise Y can provide temporary guest privileges to an employee of Enterprise X who uses a WebRTC-enabled browser to initiate a communication session with someone in Y. Y can use HTTP-based authentication schemes like OpenID. Additionally, Y can use features like Attribute Exchange to decide on the level of privilege based on the caller’s functional responsibility and similar factors. This is a simple way of realizing the service objectives of federation without the heavy administrative overhead. We have developed a system that utilizes this form of “federation”.
  
  In his comments, Matthew has identified a couple of issues in this scheme. A substantive issue he raises is that the caller or the caller’s system may not have a record of this session. We have addressed this by requiring Y to use OpenID Connect to post a short notification in the caller’s system, which the caller can use to access the full details at Y. We have not addressed his other major concern – “unfamiliar UI”. I can only make a defensive claim that the GUI is self-explanatory, and the hope is that callers will not be stymied. But the benefits realized in simple “federation” more than compensate for the potential risk of an unfamiliar UI.
  
  1. Matthew Hodgson says:
    
    August 13, 2015 at 7:43 pm
    
    Aswath: I finally understand where you are coming from on this 🙂 Thanks for spelling it out. And glad that we are agreed that communication between different application environments is still a very relevant problem! The idea of using WebRTC to allow users from app X to temporarily guest into app Y is an interesting one and it sounds like you’ve mitigated some of the problems with #1. To me, this is complementary to actual federation. If you /do/ have open federation (be it Skype4B flavoured SIP or XMPP or Matrix or whatever) then federation will hopefully be the best experience for lowest-common denominator communication. But if you don’t have open federation, then this kind of guest access approach sounds like a great pragmatic alternative. Good luck!
    
Rajnish says:

August 18, 2015 at 11:37 am

Have you heard about Fring Alliance? It is a federation of OTT platforms for service providers.

1. Tsahi Levent-Levi says:
  
  August 20, 2015 at 8:52 pm
  
  Yep. I know about Fring Alliance.
  