Choosing the Right Metrics for Monitoring your Application

When it comes to monitoring applications, it is critical to define the key metrics to watch in order to get an accurate picture of how your application is performing. That sounds easy enough, but the “simple” idea of defining key metrics is hard to implement in practice, because:

  1. The absolute number of metrics may be quite large (e.g. 250+)
  2. It can be hard to evaluate the importance of each metric just by looking at its name
  3. This process can be extremely time-consuming

Here at Monitis, we’ve spent a lot of time and effort endeavoring to come up with the right metrics for whatever application needs to be monitored. That’s why, when you set up your Monitis account, you don’t have to choose from dozens of metrics for Apache or MySQL. Instead, our research team has pre-selected the most important metrics for you.

Today we would like to share with you the method we use for defining the right metrics for every application. This research is an ongoing experiment, and we’d like to ask you to contribute by discussing the approach and suggesting ideas of your own.

The Problem

We have an application written in Java, and we need to define which metrics we should monitor in order to get the most realistic picture of the application’s performance.

The Goal

Our goal is to define a minimal set of descriptive metrics which will fully describe the state of the system.

The Preconditions

  • The monitored system is treated as a black box with one input and several outputs.


  • The input parameter of the system is the generated load. Configuration settings could also be treated as input parameters, but for simplicity we will build our example on a fixed configuration and environment.


  • Observational parameters – all numeric outputs are available through Monitis. In our example, we used a simple Java application to collect all available data exposed through JMX; the output is a CSV file of measurements taken at one-minute intervals.


  • In order to capture as much of the system’s behavior pattern as possible, observations will be made under a stress test.


We used a simple load simulator that increases the load in fixed steps at equal intervals. You can, of course, use any load-testing tool instead, e.g. JMeter, httperf, etc.
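The staircase profile described above can be sketched in a few lines. This is only an illustration, not the simulator we actually used; `apply_load` is a hypothetical callback standing in for whatever tool sustains the given request rate (JMeter, httperf, or your own client):

```python
def step_load(apply_load, start_rps=10, step_rps=10, steps=10, hold_seconds=60):
    """Drive a target with a staircase load profile: start at start_rps
    requests/sec and raise the rate by step_rps after each hold period."""
    for i in range(steps):
        rate = start_rps + i * step_rps
        # apply_load is expected to sustain `rate` for `hold_seconds`.
        apply_load(rate, hold_seconds)
```

Equal hold periods matter: they give each load level the same number of one-minute measurement windows in the resulting CSV.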


The Proposed Solution

We suggest using a combination of two methods for defining the minimal set of descriptive metrics. Both methods are described below, along with the core idea behind each.

In the first step we need to separate the unnecessary metrics from the necessary ones, by applying a method that might be familiar to those of you who studied mathematical analysis at university. The magic method is called Principal Component Analysis.

Principal Component Analysis is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components.

We will use this method to find the optimal k components whose cumulative variance is close to a chosen threshold (95–99%).
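To make this step concrete, here is a minimal sketch in plain NumPy (not the exact Monitis tooling) that finds the smallest k whose cumulative explained variance reaches the threshold. Rows of `X` are one-minute observations, columns are metrics:

```python
import numpy as np

def components_for_threshold(X, threshold=0.95):
    """Smallest number of principal components whose cumulative
    explained variance reaches `threshold`.

    X: observations in rows, metrics in columns."""
    # Standardize each metric so large-magnitude metrics don't dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Singular values of the standardized data give component variances.
    s = np.linalg.svd(Z, compute_uv=False)
    explained = (s ** 2) / (s ** 2).sum()
    # First index where the running total crosses the threshold.
    return int(np.searchsorted(np.cumsum(explained), threshold) + 1)
```

If, say, 250 metrics are really driven by a handful of underlying factors (load, GC pressure, thread contention), k comes out far smaller than 250.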

The second step is to group the necessary metrics into clusters, so that the metrics within each cluster correlate with one another to some degree.

For this we will use another mathematical method, called «k-medoids clustering». As Wikipedia explains, this is a clustering algorithm related to the k-means algorithm and the medoidshift algorithm. Both the k-means and k-medoids algorithms are partitional (breaking the dataset up into groups) and both attempt to minimize the distance between points labeled to be in a cluster and a point designated as the center of that cluster. In contrast to the k-means algorithm, k-medoids chooses datapoints as centers (medoids or exemplars) and works with an arbitrary matrix of distances between datapoints, instead of requiring squared Euclidean distances as k-means does.


The Outcome

Although the total count of metrics reaches 250 (in our case), the outcome is 7 distinct clusters of metrics, where the metrics within a given cluster are correlated but the metrics of two distinct clusters are not correlated with each other. This means there are at least 7 metrics which we absolutely must use in order to get an idea of how the application works.

The challenge we face here is that we now need to figure out which metrics to use from each cluster. This can be a difficult task, since there might be dozens of metrics in each cluster.

One way of filtering out the important ones is to pick those with the smallest distance from the «center» of the cluster.
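In distance-matrix terms, "closest to the center" is exactly the medoid of the cluster: the member with the smallest total distance to the other members. A minimal sketch (the JMX-style metric names and the labels below are hypothetical placeholders):

```python
import numpy as np

def representatives(D, labels, names):
    """For each cluster label, return the metric whose total distance
    to the other cluster members is smallest (the cluster's medoid)."""
    reps = {}
    for c in sorted(set(labels)):
        members = np.flatnonzero(np.asarray(labels) == c)
        sub = D[np.ix_(members, members)]
        reps[c] = names[members[int(np.argmin(sub.sum(axis=0)))]]
    return reps
```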

Another way is to dive one layer deeper and create sub-clusters in each cluster and find even more correlated metrics. This way we would have a bigger cluster that has three sub-clusters, for example. That would mean that the bigger cluster can be described by monitoring the metrics from the sub-clusters.