Memcached Server Monitoring With Monitis

The Monitis Java SDK, which wraps the Monitis Open API, provides nearly everything needed to build a custom internal monitoring agent. Building a monitor with the Open API and the Java SDK is straightforward.

Below we describe one way to build a custom monitor that tracks Memcached server health in real time, based on the improved Monitis Java SDK.

Memcached Server Statistics

Memcached has a built-in statistics system that collects information about the data stored in the cache, cache hit ratios, memory usage, and the distribution of items across the slab allocator used to store them. Statistics are available both as a set of core statistics and as more specific statistics for particular areas of the Memcached server (see https://github.com/memcached/memcached/blob/master/doc/protocol.txt for details).

This information is especially useful when you need to evaluate the health of a server on the fly. These statistics make it possible to build a real-time Monitis Memcached server monitor.
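As a rough illustration of how these statistics can be read, the sketch below sends the `stats` command over a plain TCP connection and parses the reply, which the protocol document describes as a series of `STAT <name> <value>` lines terminated by `END`. The class name `MemcachedStats`, its method names, and the 5-second timeout are assumptions made for this example; this is not necessarily how the Monitis monitor itself reads the data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.Reader;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for this article; not part of the Monitis Java SDK.
final class MemcachedStats {

    /** Parses "STAT <name> <value>" lines until the terminating "END" line. */
    static Map<String, String> parse(Reader source) throws IOException {
        Map<String, String> stats = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.equals("END")) {
                break; // end of the statistics listing
            }
            // Limit the split to 3 parts so values containing spaces stay intact.
            String[] parts = line.split(" ", 3);
            if (parts.length == 3 && parts[0].equals("STAT")) {
                stats.put(parts[1], parts[2]);
            }
        }
        return stats;
    }

    /** Sends the "stats" command to the server and parses the reply. */
    static Map<String, String> fetch(String host, int port) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5000); // assumed timeout; tune for your network
            OutputStream out = socket.getOutputStream();
            out.write("stats\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            return parse(new InputStreamReader(
                    socket.getInputStream(), StandardCharsets.US_ASCII));
        }
    }
}
```

Calling `MemcachedStats.fetch("localhost", 11211)` against a server on the default Memcached port would return a map with entries such as `get_hits` and `curr_connections`. Values are kept as strings because some statistics (for example `version`) are not numeric.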

Measurement Items

The Memcached server maintains general statistics variables (32 for version 1.4.x) that provide raw information about the state of the server. Most of these variables count events that have occurred, such as commands, queries, and connections. They give a rough assessment of the state of your server, but filtering, pre-processing, and simple calculations are usually required to obtain metrics informative enough to evaluate Memcached server health.

Remember that most Memcached statistics variables are counters accumulated since the server was started or last restarted, so on their own they show only an average estimate of server status, that is, the long-term trend. Often, short-term evaluations of the server's dynamics are required as well. Dynamic evaluation reveals bursty behavior and makes it possible to detect dangerous situations in real time.

If the ever-increasing counters are sampled periodically (for example, once a minute), the differential behavior of any process can be calculated with a simple formula: P_d = (P_t - P_{t-1}) / dt, where P_t and P_{t-1} are the counter values at the current and previous samples, dt is the time between the two samples, and P_d is the average rate of change over that interval.

Thus, the average, long-term estimate is usually used to see the general trends of processes and to detect mistakes in server configuration and overall behavior. This kind of metric can be shown as a green/yellow/red light signal. The dynamic behavior, by contrast, is used to analyze changes over time and to detect alarming situations in real time, and it is usually shown as a graph.
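The differential formula above can be sketched in a few lines of Java. The class name `CounterRate` is hypothetical, and the handling of the first sample and of counter resets (returning NaN) is a design choice made for this example rather than something the Monitis monitor is known to do. Memcached counters are 64-bit unsigned values; a signed `long` is used here for simplicity.

```java
// Hypothetical helper for this article: turns a cumulative counter
// into an average per-second rate between two consecutive samples.
final class CounterRate {
    private long previousValue = -1;   // -1 means "no sample seen yet"
    private long previousTimeMillis;

    /**
     * Records a new sample and returns (P_t - P_{t-1}) / dt in events per second.
     * Returns NaN for the first sample, for a non-increasing timestamp, or when
     * the counter went backwards (for example after a server restart).
     */
    double update(long value, long timeMillis) {
        double rate = Double.NaN;
        if (previousValue >= 0
                && timeMillis > previousTimeMillis
                && value >= previousValue) {
            double dtSeconds = (timeMillis - previousTimeMillis) / 1000.0;
            rate = (value - previousValue) / dtSeconds;
        }
        previousValue = value;
        previousTimeMillis = timeMillis;
        return rate;
    }
}
```

For example, if a counter reads 1000 at one sample and 6040 a minute later, `update` returns 5040 / 60 = 84 events per second for that interval. One `CounterRate` instance would be kept per counter (for example one for `get_misses` and one for `bytes_read`).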

Listed below are metrics that are widely used in server health monitoring tools. We recommend using them in the Monitis Memcached internal monitoring agent.

| Metric | Calculation formula | Weight | Graph | Warning | Critical |
|---|---|---|---|---|---|
| Percent of open connections to max connections | conn = curr_connections / maxconns | 3 | Y | >90% | >95% |
| Percent of keys that have been requested and found present to total number of get commands | get_hit = get_hits / (get_hits + get_misses) | 1 | | <99% | <95% |
| Percent of items that have been requested and not found to total number of get commands | get_miss = get_misses / (get_hits + get_misses) | 3 | Y | >0% | >5% |
| Percent of keys that have been requested to delete and found present to total number of delete commands | delete_hit = delete_hits / (delete_hits + delete_misses) | 1 | | <99% | <95% |
| Percent of items that have been requested to delete and not found to total number of delete commands | delete_miss = delete_misses / (delete_hits + delete_misses) | 3 | Y | >0% | >5% |
| Percent of keys that have been requested to increase and found present to total number of increase commands | incr_hit = incr_hits / (incr_hits + incr_misses) | 1 | | <99% | <95% |
| Percent of items that have been requested to increase and not found to total number of increase commands | incr_miss = incr_misses / (incr_hits + incr_misses) | 3 | Y | >0% | >5% |
| Percent of keys that have been requested to decrease and found present to total number of decrease commands | decr_hit = decr_hits / (decr_hits + decr_misses) | 1 | | <99% | <95% |
| Percent of items that have been requested to decrease and not found to total number of decrease commands | decr_miss = decr_misses / (decr_hits + decr_misses) | 3 | Y | >0% | >5% |
| Percent of current number of bytes used to store items to the max accessible bytes | mem_usage = bytes / limit_maxbytes | 3 | Y | >85% | >95% |
| Percent of valid items removed from cache to free memory to current number of items stored | 1 - evictions / curr_items | 3 | Y | <99% | <98% |
| Memcached process CPU usage | process_cpu_usage / (process_cpu_usage + cpu_idle) | 2 | Y | >85% | >95% |
| Memcached process RAM usage | 1 / (1 + free_memory / (process_memory_usage_percent * total_memory)) or process_memory_usage / (process_memory_usage + free_memory) | 2 | Y | >85% | >95% |
| Current number of items stored | curr_items | 1 | Y | | |
| Number of worker threads requested | threads | 1 | Y | | |
| Number of bytes read | bytes_read | 2 | Y | | |
| Number of bytes written | bytes_written | 2 | Y | | |

NOTE: The default WARNING and CRITICAL thresholds for every metric are approximations, based on web searches and published articles.
NOTE: A cache may take a while to warm up to a state that is representative of normal operation. It is therefore usually necessary to wait until the system has stabilized before the results can be trusted.

Legend:

Weight shows the importance of the metric when evaluating Memcached health status
(3 – high, 2 – medium, 1 – low).

The Graph column indicates whether it is desirable to calculate and graph the dynamic behavior of the metric instead of showing the accumulated value.

Threshold default values specify the default boundaries for generating warning and critical alerts. When the threshold is crossed, the corresponding warning or critical alert should be generated.
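To make the ratio metrics and thresholds concrete, here is a minimal sketch of how a hit percentage could be calculated and compared with the default warning and critical boundaries. The class `HealthCheck`, its `Status` enum, and the method names are hypothetical and exist only for this example. Two helpers are used because some metrics alert when the value rises above a threshold (conn, mem_usage) and others when it falls below one (get_hit, delete_hit).

```java
// Hypothetical helper for this article; not part of the Monitis Java SDK.
final class HealthCheck {

    enum Status { OK, WARNING, CRITICAL }

    /**
     * Returns 100 * hits / (hits + misses), e.g. get_hit as a percentage.
     * Returns NaN when no commands have been issued yet, which avoids
     * dividing by zero on a freshly started server.
     */
    static double hitPercent(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? Double.NaN : 100.0 * hits / total;
    }

    /** For metrics where larger values are worse, e.g. conn or mem_usage. */
    static Status whenAbove(double value, double warning, double critical) {
        if (value > critical) return Status.CRITICAL;
        if (value > warning) return Status.WARNING;
        return Status.OK; // also reached for NaN, since NaN comparisons are false
    }

    /** For metrics where smaller values are worse, e.g. get_hit or delete_hit. */
    static Status whenBelow(double value, double warning, double critical) {
        if (value < critical) return Status.CRITICAL;
        if (value < warning) return Status.WARNING;
        return Status.OK; // also reached for NaN, since NaN comparisons are false
    }
}
```

With the default get_hit thresholds from the table, `HealthCheck.whenBelow(HealthCheck.hitPercent(97, 3), 99.0, 95.0)` yields `WARNING`, because 97% is below the 99% warning boundary but not below the 95% critical one. Treating NaN as `OK` is an assumption of this sketch; a real monitor might prefer to report "no data" instead.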

Memcached Monitis Custom Monitor Implementation

The Memcached custom monitor was built on the concept described above and is named “MemcachedMonitor.” You can find the source code for this class in the open source GitHub repository under /monitisexchange/Monitis-Java-Plugins.

Testing the Custom Monitor

We ran the Memcached custom monitor for 15 minutes, sending data to Monitis once a minute. During the test, the Memcached server was under load (records sent and received at a rate of about 84 messages per second) generated by a Java simulator application. The test ran on Ubuntu Linux 11.04 on a machine with an Intel Pentium Dual Core 2.4 GHz CPU and 2 GB of RAM.

To view the results, the Custom Monitor widget was added to the Monitis account page by using the Monitis menu (Manage Monitors -> Custom Monitors). We next selected the newly created Custom Monitor and clicked the “Add to window” button. The results of the test are depicted below.

Testing results

The Screenshot we grabbed from Monitis looks like this:

Of course, it is possible also to show any of the measured columns in a graphical view.

The measurements listed above show that the monitored Memcached server is in perfect working order (missed and evicted items are at the minimum possible value, 0), even though it is under a fairly heavy load of up to 160 requests per second.

The sample above shows the Memcached server working under quite a heavy load (up to 500 req/sec), although it still has some trouble finding keys (get_miss, the number of keys not found, should be as close to zero as possible). However, this can sometimes be normal, for example when a consumer waits for keys by periodically polling the server for them.

We at Monitis hope this article helps you build a monitor with the Monitis Open API and the Monitis Java SDK. Look for more articles on this blog, and sign up for a Monitis free trial.
