The 3 Characteristics of Latency Problems

August 20, 2013

Latency is an issue that I think exists in most problem domains today. To solve it, you first need to define it for your own problem domain.


I come from a VoIP pedigree, and within that domain mainly from the signaling and video processing side. To me, latency is measured in hundreds of milliseconds. Signaling and video processing handle latency differently, because their requirements are different. I think that latency can be defined by three parameters:

  1. What’s the order of magnitude?
  2. How hard is your limitation?
  3. What’s causing it?

What’s the order of magnitude?

Are we talking nanoseconds? Milliseconds? Hundreds of milliseconds? Seconds? Minutes?

In VoIP, we’re talking about the lower hundreds of milliseconds. Guided missiles are probably below a millisecond (never been there). I learned lately that billing systems are in the lower hundreds of milliseconds as well. Ad serving? Tens of milliseconds. Business intelligence? Seconds. Big Data? Anything from seconds to hours.

First make sure you know where your problem domain lies, as there tend to be different solutions for different latency groups.
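As a rough illustration of placing a measurement into a latency group, here is a minimal Python sketch. The function name `latency_group`, the group labels and the exact boundaries between groups are my own assumptions for illustration; the text above only names the groups loosely.

```python
def latency_group(seconds: float) -> str:
    """Return a coarse order-of-magnitude label for a latency given in seconds."""
    if seconds <= 0:
        raise ValueError("latency must be positive")
    # (exclusive upper bound in seconds, label), checked in ascending order.
    # The boundaries are illustrative assumptions, not taken from any standard.
    groups = [
        (1e-6, "nanoseconds"),
        (1e-3, "microseconds"),
        (1e-2, "milliseconds"),
        (1e-1, "tens of milliseconds"),
        (1.0, "hundreds of milliseconds"),
        (60.0, "seconds"),
        (3600.0, "minutes"),
    ]
    for upper, label in groups:
        if seconds < upper:
            return label
    return "hours or more"


voip_group = latency_group(0.150)  # a 150 ms VoIP delay
ad_group = latency_group(0.030)    # a 30 ms ad-serving decision
bi_group = latency_group(5)        # a 5 s business intelligence query
```

Two systems that land in different groups will most likely need different kinds of solutions, which is the point of checking the group first.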

How hard is your limitation?

With web page serving for example, there’s no hard limit. Anything less than a second on average should be fine. 10 seconds will piss off your users, but the devoted ones will stay (I know – it took me over a year to decide to improve load times for this blog).

With media processing on a real time video call, 400 milliseconds of latency is good. Pass the one-second mark and you can just disconnect the call and go home.

With ads you might miss serving an ad if you don’t decide how to respond to an ad exchange brokerage request within the allotted time of tens of milliseconds. But as long as you don’t miss the important bids you want (or at least fit the schedule of enough bids) – you are fine.

Billing systems – anyone want to talk money?

Rocket guidance system… well – I think the limit is quite hard.

What I am trying to say here is that while we strive for low latency in our solution, we should at least understand what happens if we miss a deadline, and whether the requirement of a specific latency value is a game of averages or set in stone.
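The difference between a game of averages and a limit set in stone can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `check_latency`, the 400 millisecond budget and the sample values are all made up for the example.

```python
from statistics import mean


def check_latency(samples_ms, budget_ms, hard):
    """Evaluate latency samples against a budget.

    hard=True:  every sample must meet the budget (a single miss fails).
    hard=False: only the average of the samples must meet the budget.
    """
    if not samples_ms:
        raise ValueError("need at least one sample")
    misses = sum(1 for s in samples_ms if s > budget_ms)
    average = mean(samples_ms)
    ok = (misses == 0) if hard else (average <= budget_ms)
    return {"ok": ok, "average_ms": average, "misses": misses}


samples = [120, 180, 250, 900]  # made-up measurements, in milliseconds
soft = check_latency(samples, budget_ms=400, hard=False)
strict = check_latency(samples, budget_ms=400, hard=True)
```

The same four measurements pass the soft check, because the average stays under the budget, and fail the hard check, because one sample missed it. Which of the two checks describes your system is exactly the question to answer up front.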

What’s causing it?

This is probably not a requirement, but rather an understanding of what can be done.

With real time video calling, latency is usually caused by the following components:

  1. Camera acquisition time
  2. Video encoder processing time
  3. Media engine processing time
  4. Network driver (on sending and receiving end)
  5. Network time
  6. Media engine processing time on the receiving end
  7. Video decoding processing time
  8. Display buffer(s)

I might have missed a few components that add latency in there somewhere, but it gives you options as to where you have control and where you don’t:

  • WebRTC – probably no control whatsoever
  • Telepresence – as this is a designed system, you even have control over the display buffers (at a price)
  • Enterprise – get a better SLA for the network or a dedicated line
  • And the list goes on
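One way to reason about the component list above is to treat it as a delay budget and mark which parts you can actually tune. The sketch below does that in Python. Every number in it is a made-up placeholder rather than a measurement, and the `controllable` flags are assumptions about one hypothetical deployment; the `Stage` class and `summarize` function are names I invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_ms: int     # hypothetical figure, not a measurement
    controllable: bool  # assumption: can this deployment tune the stage?


def summarize(stages):
    """Return (total latency, latency in controllable stages), both in ms."""
    total = sum(s.latency_ms for s in stages)
    controllable = sum(s.latency_ms for s in stages if s.controllable)
    return total, controllable


pipeline = [
    Stage("camera acquisition", 30, False),
    Stage("video encoder", 40, True),
    Stage("media engine (sending)", 10, True),
    Stage("network driver (sending)", 5, False),
    Stage("network", 80, False),
    Stage("network driver (receiving)", 5, False),
    Stage("media engine (receiving)", 50, True),
    Stage("video decoder", 20, True),
    Stage("display buffers", 30, False),
]

total_ms, controllable_ms = summarize(pipeline)
print(f"total {total_ms} ms, of which {controllable_ms} ms is within our control")
```

Flipping the `controllable` flags per deployment (for example, marking the display buffers as controllable for telepresence, or the network for an enterprise with a better SLA) shows how much of the budget each scenario lets you work on.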

If you are working in a domain where latency matters, I think it is important to set the stage: find out the order of magnitude in question, the type of limitation, what causes the latency, and which of these causes are within your control.



  1. Nice breakdown, Tsahi. I think we have some WebRTC controls:

    + Client-side encoding/decoding and overall processing (jitter buffers, packet loss concealment mechanisms, adaptive resolution and frame rates).

    + TURN/relay server use (and avoidance) and layer 5 session control architectures when in use (select relays that provide lowest end-to-end latency)

    + L5 session control to minimize latency applies to media server/MCU/bridge location selection as well (when in use; understood that today most WebRTC sessions are point to point).

    The last two controls are critical to enterprise as well, even if they do use telepresence, since, as you articulate, the telepresence endpoint engineering itself impacts some of the components, but a good bit of the overall delay budget is in other components such as network, SBC selection and media server selection.
