Low overhead and reliable CPU/memory monitoring using Monitis Custom Monitors

M3 – The Custom Monitors Swiss Tool

M3 has been around for a while and has received numerous improvements along the way. We are glad to say that, together with Monitis, it is now a mature infrastructure for server monitoring. M3 lets you, the end user, easily configure checks and upload the data to Monitis.
This time we’ll survey some of the recent additions to M3, together with a nice little example that measures CPU and memory.

M3 compute / post process plugins

Raw monitoring data often needs some post-processing to “beautify” it.
M3 supports this type of plugin in the ‘Compute’ directory.
The README on GitHub describes these plugins, but I thought it worthwhile to walk through one of the examples properly.
Let’s have a look at the Math.pm example.
Math.pm allows us to multiply, divide, subtract from or add to the final result. Assume we have the following piece of configuration:

 <exectemplate>echo 10</exectemplate>
 <metric name="Math example">
     <type>integer</type>
     <uom>number</uom>
     <line>1</line>
     <math>+5</math>
     <math>*9</math>
     <math>-10</math>
     <math>/5</math>
 </metric>

Execution as it would take place on M3 is:

  • Parse line number 1 => 10
  • Add 5 => 10 + 5 = 15
  • Multiply by 9 => 15 * 9 = 135
  • Subtract 10 => 135 - 10 = 125
  • Divide by 5 => 125 / 5 = 25

As you can see, compute plugins can be chained and are quite simple to use. If you need to convert between MBytes and KBytes, or perform any other “computational” post-processing, the compute plugins are there for you.
Alongside the simple Math.pm plugin there is also a plugin to calculate averages and a plugin to calculate the difference per second. Please refer to the README for examples and usage.

M3 Linux Statistics plugin

The Linux Statistics plugin is simply another execution plugin for M3. It uses the Perl Linux Statistics module to help you extract system performance counters such as disk, CPU and memory usage.
Usage is simple and well documented in the README.
For instance, here is a way to extract the disk usage on /dev/sda1:

 <monitor name="Disk monitor for /dev/sda1">
     <linuxsysstats>diskusage->{'/dev/sda1'}{total}</linuxsysstats>
     <linuxsysstats>diskusage->{'/dev/sda1'}{usage}</linuxsysstats>
     <metric name="Total">
         <type>integer</type>
         <uom>KB</uom>
         <line>1</line>
     </metric>
     <metric name="Used">
         <type>integer</type>
         <uom>KB</uom>
         <line>2</line>
     </metric>
 </monitor>

Isn’t it simple?
Please refer to the official Perl Linux Statistics module for more examples and ideas.

Automatic addition of monitors

Back in the day (actually, not that long ago) you had to run:

 # ./AddMonitors.pl config.xml 

This added the Custom Monitors to Monitis before any data could be uploaded.
Guess what: it’s now done automatically for you!
Simply start uploading data using either of the executables:

  • TimerRun.pl
  • Run.pl

They’ll take care of creating the monitors properly. Pretty neat, ain’t it?

Data caching

Network failures are not rare, even nowadays. Until not long ago, if M3 encountered an error while uploading data to Monitis, it would just discard the result and carry on. That is unacceptable for us at Monitis.
In the new design of M3 we have enabled caching of result data. This means that even if your server loses network connectivity, the results are stored in memory by M3 and uploaded to Monitis once the connection is back. Isn’t that a great feature for reliable server performance monitoring?
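To illustrate the idea of caching results in memory and uploading them later, here is a minimal Python sketch. It is not M3's implementation: `CachingUploader`, `fake_send` and the sample results are all made up for illustration, and the sketch assumes a failed upload shows up as a `ConnectionError`.

```python
from collections import deque


class CachingUploader:
    """Buffers results in memory and retries them once uploads succeed again."""

    def __init__(self, send):
        self.send = send        # callable that raises ConnectionError on failure
        self.pending = deque()  # results not yet uploaded, oldest first

    def submit(self, result):
        """Queue a result, then try to upload everything that is pending."""
        self.pending.append(result)
        self.flush()

    def flush(self):
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                return  # network still down; keep the results for later
            self.pending.popleft()  # drop a result only after it was uploaded


# Simulated network: a flag decides whether the fake upload succeeds.
network_up = False
uploaded = []


def fake_send(result):
    if not network_up:
        raise ConnectionError("network unreachable")
    uploaded.append(result)


uploader = CachingUploader(fake_send)
uploader.submit({"cpu": 12})  # upload fails, result stays cached
uploader.submit({"cpu": 15})  # upload fails, result stays cached
network_up = True
uploader.submit({"cpu": 9})   # upload succeeds and the backlog is flushed in order
print(uploaded)
```

Because the cache lives in memory only, anything still pending is lost if the process itself stops; the sketch shares that limitation.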

It’s business time

The challenge this time was to “sub-minute” CPU and memory performance over time with Monitis and M3. By sub-minute I mean sampling more often than once a minute, averaging the samples over time and graphing only that average.
Please open this configuration file while you continue reading the article.
In short, the configuration file defines 2 monitors, one for CPU and one for memory, both executing a Linux Statistics command.
The agent interval is set to 10 seconds, meaning we’ll sample CPU and memory every 10 seconds.
However, we’re going to use <avg>, the Average.pm compute module.
The Average.pm module receives a number (30 in this example) and calculates the average of the next 30 checks.
So with a check every 10 seconds and an average every 30 iterations, we get a 300-second (5-minute) average for CPU and memory.
This is where the sub-minuting kicks in: we sample every 10 seconds, but average the samples and produce one counter every 5 minutes.
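The sample-often, report-rarely arithmetic can be sketched in a few lines of Python. This is an illustration under assumptions, not the Average.pm code: `WindowAverager` is a hypothetical class, the readings are invented, and it assumes nothing is reported until a full window of samples has been collected.

```python
class WindowAverager:
    """Collects samples and emits one average per `window` samples (hypothetical)."""

    def __init__(self, window):
        self.window = window
        self.samples = []

    def add(self, value):
        """Store a sample; return the average once the window is full, else None."""
        self.samples.append(value)
        if len(self.samples) < self.window:
            return None
        average = sum(self.samples) / len(self.samples)
        self.samples = []  # start a fresh window
        return average


INTERVAL_SECONDS = 10  # agent interval from the article
WINDOW = 30            # value given to <avg> in the article

averager = WindowAverager(WINDOW)
emitted = []
# Feed 60 invented CPU readings (0..59), one per simulated 10-second check.
for reading in range(60):
    result = averager.add(reading)
    if result is not None:
        emitted.append(result)

print(INTERVAL_SECONDS * WINDOW)  # 300 seconds covered by each reported value
print(emitted)                    # [14.5, 44.5]
```

Sixty checks at 10 seconds each span 10 minutes, so exactly two averaged values are produced, one per 5-minute window.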
If you ask me, it works great. In the following graph you can see that I prefer to use my computer in the evenings (I prefer to ski during the day 🙂), and that on the evening of March 17 I didn’t use it:
CPU Graph

Overhead – or no overhead!

M3 is a comprehensive complement to Monitis. Together with Monitis, M3 can monitor anything, as we’ve shown in the last few articles.
M3 has a slight memory overhead because it uses various Perl modules; the typical resident size is around 30 MBytes. But what is 30 MBytes for something that can monitor anything?
As for CPU overhead, all I can say is that M3 ran smoothly for days on my very modest netbook, running the CPU and memory monitors described above. And me? I even forgot it was running, since the overhead was so small.
Don’t believe me? Try it yourself!

With M3 and Monitis anything can be monitored. Follow us on GitHub and Twitter.