Enterprise video conferencing using WebRTC.
TenHands is here to change the enterprise video conferencing market by using WebRTC.
TenHands makes an interesting use case for WebRTC. They have decided to tackle the traditional enterprise video conferencing market by using WebRTC natively. To get there, they have taken the extreme approach of tweaking WebRTC to fit the strict requirements of the enterprise market.
Mark Weidick, CEO and founder of TenHands, took the time to answer a couple of questions I had about their architecture, service and view of the market. I hope you will find his answers as interesting as I did:
What is TenHands all about?
TenHands is a platform that others can use to deliver real-time communications in any browser-based application, from Enterprise Social and Collaboration Applications like Jive, Salesforce and Box to applications that are more oriented toward consumers such as Facebook or Yahoo. TenHands integrates seamlessly within the workflow of the host application so that end users gain access to video as an easy-to-use mode of communication.
I noticed you have taken the approach of a plugin for your implementation. Why?
Until WebRTC is fully realized and available in all browsers, our client software is a browser plug-in (no Flash) that’s based on the WebRTC open source project. We extended WebRTC to deliver HD video in Chrome, IE, Safari and Firefox on both Mac and Windows; iPads, Android devices and smart phones are next. As Google and other browser makers add real time video and audio capability to the browser, we will end up with a “no software required” delivery model.
What about your backend – what have you done in that regard?
We use Amazon’s EC2 for video/audio switching on commodity servers in a fully-virtualized environment. This enables on-demand scalability and delivery cost advantages that MCU-based providers cannot achieve and won’t be able to pursue without massive architectural changes. We even have the ability to run in partner or customer owned data centers where desired.
Today we run TenHands on a fungible QoS enabled network that we buy. It has 0% packet loss and 14 PoPs with STUN/TURN infrastructure for firewall and NAT traversal.
And the communication between the web browser client and your servers?
We chose a simple, lightweight signaling protocol that runs over WebSockets to communicate between the client and the server infrastructure. This gave us the flexibility to support new features fairly easily without being burdened by the requirements of implementing something heavy like SIP.
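To make the idea of a lightweight signaling protocol concrete, here is a minimal sketch of what a JSON message envelope carried over a WebSocket could look like. The interview does not describe TenHands' actual wire format, so every message type and field name below is an assumption made for illustration only.

```typescript
// Hypothetical message envelope for a lightweight signaling protocol.
// None of these names come from TenHands; they are illustrative assumptions.
type SignalType = "join" | "offer" | "answer" | "candidate" | "leave";

interface SignalMessage {
  type: SignalType;
  room: string;     // conference identifier
  from: string;     // sender's participant id
  payload?: string; // e.g. a session description, opaque to this layer
}

const SIGNAL_TYPES: readonly string[] = ["join", "offer", "answer", "candidate", "leave"];

// Serialize a message into the text frame that would be sent over the socket.
function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

// Parse and validate an incoming text frame.
// Returns null for anything that is not a well-formed message.
function decodeSignal(frame: string): SignalMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(frame);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const fields = data as Record<string, unknown>;
  const type = fields["type"];
  const room = fields["room"];
  const from = fields["from"];
  const payload = fields["payload"];

  if (typeof type !== "string" || SIGNAL_TYPES.indexOf(type) === -1) return null;
  if (typeof room !== "string" || typeof from !== "string") return null;
  if (payload !== undefined && typeof payload !== "string") return null;

  const msg: SignalMessage = { type: type as SignalType, room, from };
  if (typeof payload === "string") msg.payload = payload;
  return msg;
}
```

In a browser, the string returned by `encodeSignal` would be passed to the standard `WebSocket.send()`, and `decodeSignal` would be applied to the `data` of each incoming message event. Adding a feature then means adding a message type, which is the kind of flexibility the answer above contrasts with SIP.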
You have decided to tackle the relatively crowded enterprise video conferencing market. How do you plan on differentiating your offer and compete successfully?
TenHands’ “special sauce” is in its architecture: we deliver personal (desktop, mobile) video as a virtualized cloud computing solution using WebRTC. As a result, we gained disruptive cost advantages that translate to great affordability.
Our architecture has the inherent advantage of easy integration with any cloud based applications. For example, TenHands has already seamlessly integrated its real time video and voice with Salesforce, Jive Software, LogMeIn, Box, Dropbox and Facebook. As a result, we are able to power video-centric real time communication in third party applications where client based and hardware offers cannot.
Who do you see as your main competitors in this space? Companies such as Vidtel and BlueJeans? Cisco WebEx? Polycom?
As a video technology TenHands is typically compared to OTT alternatives such as Skype, HDFaces, and Google Hangouts. We expect that these providers will eventually move to WebRTC.
Vidtel and BlueJeans have been primarily focused on interoperability. While BlueJeans has added WebRTC access, their architecture and commercial ($200 per port) capability contrast sharply with TenHands’ partner-centric solution, whose delivery cost structure enables a freemium go-to-market.
I noticed you took the approach of offering a plugin of your own when browsers don’t support WebRTC. How did that work out for you?
We currently have a software-free (plugin-free) solution for point-to-point calls in the Chrome browser (beta version) and the Canary channel, and will be releasing it soon to our beta users. It is based entirely on WebRTC.
Meanwhile, we will continue to provide a plugin as well to complement the current shortcomings of WebRTC (such as its lack of availability in a majority of browsers) and to include the ability to provide multi-point calling, rich HD video, dynamic media adaptation across a diverse set of network conditions, call statistics, etc.
What challenges did you encounter when using WebRTC and developing a native browser video client?
Providing a commercial offer that works reasonably well and offers business class support required the team to add quite a few things that were missing (and are still missing). We also needed to augment the implementation to account for some additional capabilities required in order to support what would be considered basic for any business class video:
- Lack of robust support for firewall/NAT traversal
- Lack of APIs to support exposing real time audio/video statistics to the application
- Keeping up with constantly evolving standards (none->ROAP->JSEP…)
- Missing support for native screen sharing
- Need for better traffic shaping algorithms under conditions of latency, loss and jitter to support real world deployments
What would you change in WebRTC given the opportunity?
There are things we learned in the months we have worked with WebRTC. Based on that we would recommend the following:
- A finer-grained API (compared to the current one) that allows implementations of the various software stacks in WebRTC (such as ICE or RTP/RTCP) to interoperate with those from others. Building the TenHands service entailed making our (WebRTC-based) client plugin work with a cloud-based media server that uses different ICE and RTP/RTCP stack implementations. Working through this process made us realize the need for such an API. It would allow, as an example, an implementation with a rich tool-set to talk to a “basic” implementation by configuring the former appropriately.
- As plugin-free incarnations of WebRTC become ubiquitous, a rich API that exposes media transmission and reception stats to the application could come in handy, as it would allow an application such as ours to do a few things that are valuable from our perspective:
  - Let the application determine the appropriate action/response (changing bit rate, resolution or frame rate, turning off video, etc.) based on application needs
  - Allow applications to store call statistics in our backend infrastructure for debugging/troubleshooting/inference purposes
- More control over various media traversal options. Making our application work in various corporate networks often required us to augment schemes that existed in the “off the shelf” implementation of WebRTC. We also wanted control over the “paths” used for establishing a media session: STUN vs. RELAY, chosen based on user level policies or dynamically based on certain quality metrics (loss, latency, jitter) observed during a call. We further wanted to control the client behavior when new network interfaces were discovered (e.g., the user started off a call with an active WiFi connection and plugged in a wired LAN later).
- On the implementation side, it would be nice to get support for more video codecs (specifically H.264 + SVC) and support for TURN based media traversal
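Two of the wishes above, letting the application react to call statistics and letting it pick between a direct and a relayed media path, can be sketched as plain decision functions. This is only an illustration under stated assumptions: the `CallStats` shape, the function names and all thresholds are hypothetical and do not come from TenHands or from any WebRTC API. The `"all"` and `"relay"` values are chosen to mirror the `iceTransportPolicy` setting of the standard `RTCConfiguration`.

```typescript
// Hypothetical per-call statistics an application might receive from a stats API.
interface CallStats {
  lossPercent: number; // packet loss over the last interval, 0..100
  rttMs: number;       // round-trip time in milliseconds
  jitterMs: number;    // interarrival jitter in milliseconds
}

type Adaptation = "none" | "lower-bitrate" | "lower-resolution" | "audio-only";
type PathPolicy = "all" | "relay";

// Pick a response to current network conditions.
// The thresholds are illustrative assumptions, not measured values.
function chooseAdaptation(stats: CallStats): Adaptation {
  if (stats.lossPercent >= 15 || stats.rttMs >= 800) return "audio-only";
  if (stats.lossPercent >= 5 || stats.jitterMs >= 60) return "lower-resolution";
  if (stats.lossPercent >= 2 || stats.rttMs >= 300) return "lower-bitrate";
  return "none";
}

// Pick the media path: a user-level policy can force the relay,
// otherwise fall back to it only when the observed loss is high.
function choosePath(userForcesRelay: boolean, stats: CallStats): PathPolicy {
  if (userForcesRelay) return "relay";
  // Assumption: heavy loss on the current path justifies retrying via a relay.
  return stats.lossPercent >= 10 ? "relay" : "all";
}
```

The point of the sketch is the division of labor: the media stack only reports numbers, and the application owns the policy, so the same stats record could also be shipped to a backend for troubleshooting.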
What’s next for TenHands?
We just added our own chat and our own screen share capability. iOS is in the works, and then Android. Additional features on the roadmap include recording, indexing and YouTube-like management of the content.
We will soon offer an SDK/API to the general public so that everyone can easily add real time video, voice and collaboration tools to their social and collaboration cloud based applications.