WebRTC Monitoring: Do You Monitor Your Servers or Your Service?

Sorry if this sounds a bit like a rant or complaint, but I always see incomplete lists of what types of monitoring exist. Monitoring applies to pretty much every application you put on the Internet. It’s really useful to know all the options before you decide what is most important for your application. And you should deploy as many of them as your time and budget allow.

You might not enable all alerting at first, but it’s smart to collect as much of the information as possible. Then, when someone complains, you hopefully already have the data you need to see what is going wrong.

Hopefully these give you ideas about what else to do (or what to do better):

– Real User Monitoring and other information from the browser -> send stats to the server about what is happening at the client. Maybe even save them to localStorage in case they can’t be sent. I’ve seen this done for WebRTC too.

– logging from the application -> alerting on certain messages. Logging is very useful when you are looking for the cause of a problem. Obviously your application needs to log warnings and maybe some debug information (a debug mode is also really useful).

– passive monitoring: usage stats and performance stats (memory usage, CPU usage, etc.). Alert on an extremely low number of users or on a larger than usual percentage of slow responses (possibly the 95th percentile). This also covers ‘instrumenting the code’. Think of things like statsd as well. Most of what New Relic does fits in this bucket.

– active monitoring: Pingdom and Nagios are in the same category here. And this is where testRTC fits in, I guess. You want it for both the servers (even if it’s just ping, TCP-connect, etc.) and the service (don’t forget to monitor DNS).
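To make the passive bucket a bit more concrete, here is a rough Python sketch of the two alert rules mentioned in the list: slow responses at the 95th percentile and an unusually low number of users. The function names and the thresholds (`p95_limit_ms`, `min_users`) are made up for illustration and are not taken from any particular monitoring product.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def check_alerts(response_times_ms, active_users,
                 p95_limit_ms=500.0, min_users=5):
    """Return a list of alert messages for one measurement window.

    The default thresholds are hypothetical placeholders; tune them
    against what is normal for your own service.
    """
    alerts = []
    if response_times_ms:
        p95 = percentile(response_times_ms, 95)
        if p95 > p95_limit_ms:
            alerts.append(
                f"slow responses: p95={p95:.0f}ms exceeds {p95_limit_ms:.0f}ms"
            )
    if active_users < min_users:
        alerts.append(
            f"low usage: {active_users} active users "
            f"(expected at least {min_users})"
        )
    return alerts
```

You would call something like `check_alerts(window_of_timings, current_user_count)` once per collection interval and hand any returned messages to whatever alerting channel you already use.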

You can also build in an automatic response. If you have a process running on a server and it fails in some way, you can use some method to automatically start it again.

Failure in this case can be detected passively (the process has stopped running) or actively (you send requests and don’t get the right response back). For the passive kind there are the automatic restart facilities of node.js forever, daemontools, runit, systemd and Windows Services. A tool like Monit fits in both the active and the passive category.
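As a small illustration of combining the two kinds of failure detection, here is a minimal Python sketch of one supervision step. It is not how Monit or systemd work internally; `tcp_check`, `dns_check` and `supervise_once` are hypothetical helper names, and the probe is just a TCP connect or a DNS lookup standing in for "send a request and check the response".

```python
import socket
import subprocess


def tcp_check(host, port, timeout=2.0):
    """Active check: can a TCP connection to the service be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def dns_check(hostname):
    """Active check: does the hostname still resolve?"""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def supervise_once(proc, command, probe):
    """Run one supervision step and return (process, restarted).

    Passive check: has the process exited (or was it never started)?
    Active check: does probe() report a healthy service?
    If either check fails, the process is (re)started.
    """
    exited = proc is None or proc.poll() is not None
    if not exited and probe():
        return proc, False
    if not exited:
        # Still running but unhealthy: stop it before starting a new one.
        proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return subprocess.Popen(command), True
```

In practice you would call `supervise_once` in a loop with a sleep in between, passing something like `lambda: tcp_check("127.0.0.1", 8080)` as the probe (host and port are placeholders). A dedicated tool is still the better choice for production; the sketch only shows how the passive and active checks feed the same restart decision.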

Hope this was helpful.

Lennie says:

July 30, 2015 at 12:41 pm

