Should you use Kurento or Jitsi for your multiparty WebRTC video conference product?

05/09/2016

Kurento or Jitsi; Kurento vs Jitsi – is this the ultimate head-to-head comparison for open source media servers in WebRTC?

Kurento vs Jitsi - which one best fits your needs?

Yes and no. And if you want an easy answer of “Kurento is the way to go” or “Jitsi will solve all of your headaches” then you’ve come to the wrong place. As with everything else here, the answer depends a lot on what it is you are trying to achieve.

Need to pick a WebRTC media server framework? Why not use my Free Media Server Framework Selection Worksheet when checking your alternatives?

Since this is something that gets raised quite often these days by the people I chat with, I decided to share my views here. To do that, the best way I know is to start by explaining how I have compartmentalized these two projects in my mind:

Jitsi Videobridge

The Jitsi Videobridge is an SFU. It is an open source one, which is currently owned and maintained by Atlassian.

The acquisition of the Jitsi Videobridge serves Atlassian in two ways:

  1. Integrating Jitsi Videobridge into HipChat while owning the technology (it took the better part of the last 18 months)
  2. Showing some open source love – they did change the license of Jitsi from LGPL to the Apache license

Here’s the intro of Jitsi from its GitHub page:

Jitsi Videobridge is an XMPP server component that allows for multiuser video communication. Unlike the expensive dedicated hardware videobridges, Jitsi Videobridge does not mix the video channels into a composite video stream, but only relays the received video channels to all call participants. Therefore, while it does need to run on a server with good network bandwidth, CPU horsepower is not that critical for performance.

I emphasized the important parts for you. Here’s what they mean:

  • XMPP server component – a decision was made as to the signaling of Jitsi. It was made years ago, when the idea was to “compete” head-to-head with Google Hangouts. So the choice was made to use XMPP signaling. This means that if you need/want/desire anything else, you are in for a world of pain – doable, but not fun
  • does not mix the video channels – it doesn’t look into the media at all, nor can it process raw video in any way
  • only relays the received video – it is an SFU

Put simply – Jitsi is an SFU with XMPP signaling.
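
To make the relay-versus-mix distinction concrete, here is a rough back-of-the-envelope sketch in Python. The function names and the simplifications are assumptions made for illustration only (every participant sends exactly one video stream, there is no simulcast or stream selection, and the mixing server encodes a single composite shared by everyone); none of this is code from Jitsi.

```python
def sfu_load(participants: int) -> dict:
    """Relay-only server (SFU): forwards streams, never decodes or encodes."""
    n = participants
    return {
        "streams_in": n,             # one uplink stream per participant
        "streams_out": n * (n - 1),  # each stream is relayed to everyone else
        "decodes": 0,                # media is not touched
        "encodes": 0,
    }


def mixer_load(participants: int) -> dict:
    """Mixing server (MCU), assuming one shared composite layout."""
    n = participants
    return {
        "streams_in": n,   # one uplink stream per participant
        "streams_out": n,  # one composite stream back to each participant
        "decodes": n,      # every incoming stream must be decoded to be mixed
        "encodes": 1,      # assumption: the composite is encoded once and shared
    }


if __name__ == "__main__":
    for n in (3, 10):
        print(n, "SFU:", sfu_load(n), "mixer:", mixer_load(n))
```

With 10 participants under these assumptions, the relay pushes out 90 streams but does no codec work, while the mixer sends only 10 streams but has to decode 10 and encode at least one. That is the trade the Jitsi description points at: network bandwidth matters more than CPU.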

If this is what you’re looking for then this baby is for you. If you don’t want/need an SFU or have another signaling protocol, better start elsewhere.

You can find outsourcing vendors who are happy to use Jitsi and have it customized or integrated to your use case.

Kurento

Kurento is a kind of media server framework. This too is an open source one, but one that is maintained by Kurento Technologies.

With Kurento you can essentially build whatever you want when it comes to backend media processing: SFU, MCU, recording, transcoding, gateway, etc.

This is an advantage and a disadvantage.

An advantage because it means you can practically use it for any type of use case you have.

A disadvantage because there’s more work to be done with it than something that is single purpose and focused.

Kurento has its own set of vendors who are happy to support, customize and integrate it for you, including the actual authors and maintainers of the Kurento code base.

Which one’s for you? Kurento or Jitsi?

Both frameworks are very popular, with each having at the very least tens of independent installations and integrations done on top of them and running in production services.

Kurento or Jitsi? Not always an easy choice, but here’s where I draw the line:

If what you need is a pure SFU with XMPP on top, then go with Jitsi. Or find some other “out of the box” SFU that you like.

If what you need is more complex, or necessitates more integration points, then you are probably better off using Kurento.

What about Janus?

Janus is… somewhat tougher to explain.

Their website states that it is a “general purpose WebRTC Gateway”. So in my mind it will mostly fit into the role of a WebRTC-SIP gateway.

That said, I’ve seen more than a single vendor using it in totally other ways – anything from an SFU to an IOT gateway.

I need to see more evidence of use cases where production services end up using it for multiparty as opposed to a gateway component to suggest it as a solid alternative.

Oh – and there are other frameworks out there as well – open source or commercial.

Where can I learn more?

Multiparty and server components are a small part of what is needed when going about building a WebRTC infrastructure for a communication service.

In the past few months, I’ve noticed a growing number of requests that stem from challenges and misunderstandings about what WebRTC really is and how it works. People tend to focus on the obvious side of the browser APIs that WebRTC has, and forget to think about the backend infrastructure for it – something that is just as important, if not more.

It is why I’ve decided to launch an online WebRTC Architecture course that tackles these types of questions.

Course starts October 24, priced at $247 USD per student. If you enroll before October 10, there’s a $50 discount – so why wait? Until I get enrollment automation up, contact me directly.

Responses

Gustavo Garcia says:
September 5, 2016

“I need to see more evidence of use cases where production services end up using it for multiparty”

https://webrtchacks.com/dear-slack/

    Tsahi Levent-Levi says:
    September 5, 2016

    Gustavo – thanks

    Slack for now is voice-only. While they might add video now that HipChat relaunched it on their service (via Jitsi), who knows if it will still be Janus or not.

    On top of that, Slack is a great reference, but might not be the right one for others. I don’t know how much support they got, how many customizations they made and how much crap they ate along the way – it might be the best thing that happened to Slack – or it might not.

    The track record I see and the recommendations I give are based on multiple variables – it relates to the DNA of the vendor adopting the framework, the feature set he needs, the type of support he is looking for, the scale he needs, the direct feedback I get from others on their use of said frameworks and on discussions with the vendors themselves.

    For now, I am still waiting for more evidence about Janus besides Slack – and not because I have anything bad to say about it.

      Lorenzo Miniero says:
      September 5, 2016

      Just for a little “Cicero pro domo sua”, as I’m the Janus main author, here’s a link to a presentation I made a few months ago at Kamailio World:

      http://www.kamailio.org/events/2016-KamailioWorld/Day1/10-Lorenzo.Miniero-Janus-WebRTC-SIP-Gateway.pdf (there’s also a video on YouTube of me presenting this, if you’ve time to watch it)

      Despite the title, it was a more generic overview on Janus in general, and towards the end you can see a (non-exhaustive) list of different products using Janus nowadays, most of them exploiting the SFU plugin. As to Slack, they did everything by themselves without involving us, apart from a couple open discussions on our Google group.

      Hope this helps.

        Tsahi Levent-Levi says:
        September 6, 2016

        Thanks for sharing Lorenzo.

        I really love what you’re doing with Janus. I hear great feedback about it from those using it.

Philipp Hancke says:
September 6, 2016

XMPP (or rather colibri) is just the control layer. You can write a server for the xmpp component connection to the jitsi video bridge and translate to whatever you want from there… just requires skill.

    Tsahi Levent-Levi says:
    September 6, 2016

    Well… not sure I like controlling things with XMPP, but that’s just me.

      Philipp Hancke says:
      September 6, 2016

      the architecture with frequent renegotiations requires a push-channel. Not a good fit for REST.

    Luis Lopez says:
    September 6, 2016

    Hi,
    I’m the Kurento lead, so my opinion is probably biased, but my honest feeling is that Tsahi’s diagnosis in this post is quite accurate. Jitsi was designed with a specific videoconferencing model in mind, and XMPP makes a lot of sense for it. When creating applications that comply with that model, Jitsi does a great job and using it may save lots of development hours. However, it may be too narrow when special requirements need to be satisfied. First, because XMPP-inspired control mechanisms are not appropriate for all the types of media control logic one may need. Just as an example, consider how you would use XMPP for things such as interoperating with IP cameras or smart video devices, controlling computer vision filters, dynamically combining media mixing models with SFU models, or orchestrating a complex dynamic media processing topology. Using XMPP-like control mechanisms for those might be a counterintuitive and complex task for developers. Second, because extending Jitsi Videobridge with further capabilities requires a lot of hacking and deep knowledge of its code internals. On the other hand, Kurento was designed, from the very beginning, as a modular media development framework providing full composability and extensibility. Because of this, Kurento developers use consistent APIs, available through programming-language-dependent SDKs, that are designed around software engineering principles (e.g. type protection, efficient management of synchronous/asynchronous calls, efficient use of threads, concurrency control, distributed garbage collection, testability, etc.). These APIs are fully agnostic to any kind of signaling or assumption, with the “call model” being just one of the possibilities.
    This provides a lot of flexibility, but it also has the drawback Tsahi mentions: when you just need a standard videoconferencing call flow, you might need to develop your own signaling stack, and then you might feel that you are reinventing the wheel. To minimize this effect, the Kurento team also created several high-level APIs on top of the raw Kurento APIs that provide specific signaling, such as the Kurento Room API and the Kurento Tree API, but that is another story.

      Tsahi Levent-Levi says:
      September 7, 2016

      Luis, thanks for the explanation.

      Let’s see now 🙂

      – Lorenzo from Janus is here
      – Luis from Kurento is here
      – I wonder when Emil from Jitsi will join us

    Emil Ivov says:
    September 7, 2016

    Jitsi Videobridge also supports a REST control layer, so XMPP is by no means a requirement. Some pretty eminent adopters out there (e.g., HighFive, join.me and others) are using it through REST control.

René says:
September 7, 2016

Hi,
I am currently developing a many-to-many video/audio conference solution for a German company, which should integrate a public phone conference system via SIP in the future. For that reason I played around with Kurento and Janus. From my experience so far, I can tell that there are advantages and drawbacks to both media servers.
Both of them take interesting approaches to architecture. I like the Kurento way of connecting media endpoints into pipelines. But the plugin-based approach of Janus also offers a lot of possibilities, as long as the developer is able to create those native plugins.
In my view, the typical use case for multiparty audio/video conferences is an SFU approach for video and an MCU approach (mixing) for audio. Mixing video would eat a lot of server resources, and it’s not typical for all participants in a big conference to show video concurrently. But it’s different for audio, especially if a gateway to the public phone world is necessary. Here you can’t get around mixing the audio channels. And this is the point where I see advantages on the Janus side. I compared the server resource consumption of the Janus Audio Bridge to that of the Kurento Composite element. Janus uses libopus in the plugin to decode the Opus audio, mixes the streams and encodes the mixed stream. This implementation limits usage to the Opus codec, but it saves a lot of resources compared to the Kurento Composite. As far as I understand, Kurento uses GStreamer libraries and pipelines for all the media processing. This may be much more flexible, but it leads to much more CPU consumption. My rough measurements showed approximately four times higher CPU load on the Kurento side for an audio room with 10 participants.

RealTimeWeekly | RealTimeWeekly #146 says:
September 12, 2016

[…] Should you use Kurento or Jitsi for your multiparty WebRTC video conference product? […]

    Rick says:
    December 27, 2017

    I love Jitsi’s feature set, but unless I’m missing something, the documentation is nearly nonexistent! We need to get to market quickly. Any suggestions? Anyone I can hire as advisor/consultant? (It seems like BlueJimp, the original Jitsi consulting team, no longer does consulting since the acquisition.)

Dejan Popov says:
September 14, 2016

What about Licode? How do you compare it to Jitsi or Kurento?

    Tsahi Levent-Levi says:
    September 14, 2016

    Dejan,

    Truth be told – I haven’t heard any positive feedback on Licode thus far. It doesn’t indicate that it is bad, but I have no other input to base my view of it on.

      Alan demerda says:
      January 4, 2018

      Is this still the same for you in 2018? No news about Licode yet?

      Thanks

        Tsahi Levent-Levi says:
        January 4, 2018

        Yes. The dominant players are still Jitsi and Janus in the open source part. Kurento is going down in popularity and use.

        Other alternatives haven’t made enough progress yet – at least not based on the conversations I have with developers.

          Alan demerda says:
          January 8, 2018

          Thanks for the update. Really appreciated.

          Oluwafemi Matthew says:
          April 21, 2018

          Hi, Tsahi. Nice review of these technologies, but what’s your take on mediasoup? We’re a full JavaScript (Node) team, and while Kurento provides Node.js support, your latest comments indicate that its usage has begun to dwindle. How does mediasoup compare with Kurento?

          Tsahi Levent-Levi says:
          April 21, 2018

          Kurento has everything in a box, which means it is very generic in nature. It also wasn’t really updated in the past year that much, although things should be improving now.

          As for mediasoup, it is still new with a small team behind it and a small ecosystem. I know of projects who’ve opted for using it, though not about their current status.

          Oluwafemi Matthew says:
          April 23, 2018

          Thanks for the prompt reply, Tsahi. I looked at Kurento during the weekend, and it appears as though work has started on it again after almost a year of inactivity. The JavaScript tutorial repos still appear to be dated at 6.6.2, as against the current 6.7.*. Moreover, it appears to be overkill for our particular use case. I think I’ll peruse mediasoup, as it appears more lightweight and better suited to our use case. Thanks again.

          Tsahi Levent-Levi says:
          April 23, 2018

          Good luck and be sure to update me on your findings – I am really interested in hearing feedback from developers on the various media server alternatives out there.

Aruna says:
October 13, 2016

Hi,

Please share any comparisons with Intel’s WebRTC SDK solution.

Thanks

    Sumit says:
    October 25, 2017

    I would appreciate if someone could comment on Intel’s WebRTC Solution, as asked by Aruna earlier.

      Tsahi Levent-Levi says:
      October 25, 2017

      There’s very little information to go around on it. Not sure why. It seems to be doing well in Asia, but there isn’t much adoption of it in Europe and the US as far as I can tell.

      Here’s the most recent thing I am aware of here: http://webrtcbydralex.com/index.php/2016/09/16/intel-webrtc-collaboration-suite/

        Sumit says:
        October 26, 2017

        Tsahi, I agree with you. I haven’t heard from anyone using it for an enterprise-level solution. The main problem for me is that “multi-party seems to be limited to 16 participants” in Intel’s SDK. I want to achieve video calls with as many as 100 pax in a single call. I also went through your blog post about load testing Kurento, and it was a bit disappointing. Do you still think I can make a 100 pax call work with Kurento in MCU mode, with a very high spec server?

          Tsahi Levent-Levi says:
          October 26, 2017

          MCUs have challenges with load which amount to the cost of the service.
          You can use the Intel one to do that by cascading machines one on top of the other (assuming Intel supports that mode).

          If you don’t need to see all 100 participants at all times, then my suggestion would be to go for an SFU based architecture.

Aruna says:
October 27, 2016

Hi,

Which are the typical signaling gateways used with Kurento?

Thanks,
Aruna

    Tsahi Levent-Levi says:
    October 27, 2016

    Aruna,

    Frankly, I am not aware of anything dominant or typical when it comes to signaling alongside Kurento. My guess is that it takes one of two forms:

    1. The developers use the Node.js server coming from Kurento and modify it, making it their de facto signaling server
    2. The developers use whatever it is they decided to drive their app interactions with

    If other readers here can share their views and experiences that will be appreciated.

      Aruna says:
      November 4, 2016

      Thank you Tsahi for your insights on Kurento. Any experiences with Nubomedia, which is an extension to Kurento? Are there any commercial deployments using it?

        Luis says:
        November 4, 2016

        Hi,
        I’m Luis, NUBOMEDIA project coordinator. NUBOMEDIA is a research infrastructure that was created to experiment with novel paradigms for combining WebRTC with advanced media processing capabilities in a scalable way. As a research project, it is not devoted to production and it lacks features that are probably required for such a purpose (e.g. billing, fault-resilience mechanisms, etc.). Hence, NUBOMEDIA is a good starting point for any organization wishing to create a next-generation WebRTC PaaS, but not for being used directly in production. In other words, if you are willing to create your very own WebRTC PaaS, evolving NUBOMEDIA may save you thousands of development hours, but further effort needs to be invested before NUBOMEDIA is production ready.

          Aruna says:
          November 4, 2016

          Thank you Luis for your quick response .

      Alan demerda says:
      December 31, 2017

      If I do the signaling myself on a separate server, which APIs does Kurento provide to interact with? Is there a detailed reference for that?

    Tsahi Levent-Levi says:
    March 18, 2017

    Shuan,

    I guess it depends on how Kurento will progress from here now that Twilio is at the helm. Better? Worse? Who knows?

    Once Twilio officially releases their own Kurento supported CPaaS capabilities, we will see what gets pushed back into the open source Kurento code and be able to know better.

      Phil says:
      June 20, 2017

      Any news here? The project looks dead indeed…. 🙁

        Tsahi Levent-Levi says:
        June 20, 2017

        Not dead yet. You’ll need to wait a few more months until there’s news there. That’s my guess

          Rick says:
          December 26, 2017

          Well, it’s been a few months. It appears one can still select kurento from the AWS marketplace, though I have not tried it and officially, twilio has stated (back in ’16!) that they are not taking new electricRTC clients. Maybe that only means they are not offering support services for new electricRTC clients, but you are still free to use it. I’m trying to pick the right media server and hosting environment for my needs, and I wish there was some resolution on this. Twilio pricing seems to be $0.001 per participant minute. That’s better than Vidyo.io ($0.01 per participant minute) but I think I can do better by self-hosting.

          Tsahi Levent-Levi says:
          December 26, 2017

          Rick,

          I think you’re misinterpreting the Twilio pricing – it should be higher, especially if you are aiming at a media server.

          Look at Janus and Jitsi for viable alternatives. I wouldn’t use Kurento today, as it hasn’t been updated for quite some time.

          Good luck!

        Micael Gallego says:
        March 30, 2018

        Hi all,

        I’m Micael Gallego, the new Kurento/OpenVidu projects lead. Kurento is now in a good shape again, with a new team not related to Twilio taking the control again.

        You can read more information here: http://www.kurento.org/blog/kurento-67-moving-forward

Sudhi says:
October 24, 2018

Hi
Can you please let us know which vendors, other than the Kurento team itself, provide customization/integration and general support for the Kurento stack?

Really appreciate your response here.

Br, Sudhi
