MOS (Mean Opinion Score) is a measurement of the perceived quality of audio streams in VoIP.
For a given audio input, it yields a value between 1 and 5, where 1 means bad audio quality and 5 means excellent audio quality.
MOS is subjective by nature: it reflects the quality perceived by the people who take part in the measurement and score what they hear. That, at least, is the philosophy behind it.
Calculating MOS
MOS can be calculated from one of three different sources:
- The actual audio being played
- Low level RTP and RTCP metrics of the audio being received (or sent)
- Lacking both of the above, whatever information is available to us
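As an illustration of the second source, here is a minimal sketch of estimating MOS from network metrics. It assumes the R-factor to MOS mapping from the ITU-T G.107 E-model; the way round-trip time, jitter and packet loss are turned into an R-factor is a simplified heuristic with illustrative constants, not something taken from a standard. The names `rFactorToMos` and `estimateMos` are hypothetical.

```typescript
// Convert an E-model R-factor to a MOS value using the G.107 mapping.
// The polynomial dips slightly below 1 for very small R, so the result is clamped.
function rFactorToMos(r: number): number {
  if (r <= 0) return 1;
  if (r >= 100) return 4.5;
  const mos = 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r);
  return Math.max(1, mos);
}

// Hypothetical, simplified impairment model. The constants below are
// illustrative assumptions and should be tuned for a real deployment.
function estimateMos(rttMs: number, jitterMs: number, lossPercent: number): number {
  // Treat jitter as extra delay, since a jitter buffer has to absorb it.
  const effectiveLatency = rttMs / 2 + jitterMs * 2 + 10;

  // Start from a default R-factor of 93.2 and subtract a delay penalty
  // that grows faster once latency becomes noticeable.
  let r =
    effectiveLatency < 160
      ? 93.2 - effectiveLatency / 40
      : 93.2 - (effectiveLatency - 120) / 10;

  // Subtract a penalty for each percent of packet loss.
  r -= lossPercent * 2.5;

  return rFactorToMos(r);
}
```

With this sketch, a call such as `estimateMos(20, 2, 0)` lands near the top of the scale, while higher delay, jitter or loss pulls the score down.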
MOS in WebRTC
In WebRTC applications, for example, we are usually exposed to only a fraction of the information in RTP and RTCP, through the getStats API call. WebRTC applications can, and do, still use this information to calculate MOS scores.
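A minimal sketch of pulling such metrics out of a getStats result might look like the following. The helper `extractAudioMetrics` and the `StatsReportLike` and `AudioMetrics` types are hypothetical names introduced here; the stat types and field names (`inbound-rtp`, `candidate-pair`, `jitter`, `packetsLost`, `packetsReceived`, `currentRoundTripTime`) are assumed to be present as in the W3C WebRTC statistics identifiers, and browsers may omit some of them.

```typescript
// Minimal structural type so the sketch also runs outside a browser.
// A real RTCStatsReport is map-like and satisfies it.
interface StatsReportLike {
  forEach(callback: (stat: any) => void): void;
}

interface AudioMetrics {
  rttMs: number;
  jitterMs: number;
  lossPercent: number;
}

function extractAudioMetrics(report: StatsReportLike): AudioMetrics {
  const metrics: AudioMetrics = { rttMs: 0, jitterMs: 0, lossPercent: 0 };

  report.forEach((stat) => {
    if (stat.type === "inbound-rtp" && stat.kind === "audio") {
      // packetsLost is cumulative and may be negative when duplicates arrive.
      const lost = Math.max(0, stat.packetsLost ?? 0);
      const received = stat.packetsReceived ?? 0;
      const total = lost + received;
      // jitter is reported in seconds.
      metrics.jitterMs = (stat.jitter ?? 0) * 1000;
      metrics.lossPercent = total > 0 ? (lost / total) * 100 : 0;
    } else if (
      stat.type === "candidate-pair" &&
      stat.nominated &&
      stat.currentRoundTripTime !== undefined
    ) {
      // currentRoundTripTime is reported in seconds.
      metrics.rttMs = stat.currentRoundTripTime * 1000;
    }
  });

  return metrics;
}
```

In a browser this could be called as `extractAudioMetrics(await pc.getStats())`, where `pc` is an `RTCPeerConnection`. Because the counters are cumulative since the start of the call, an application that wants a current score would typically compute deltas between two consecutive samples before feeding them into its MOS model.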
Video and MOS
There have been several attempts to create a similar score for video. None of them has taken root or succeeded to the extent MOS has for voice, likely because of the complexity associated with video.