Cassandra is a robust and highly scalable NoSQL datastore that usually consists of multiple nodes spread out across multiple datacenters. If you are the system administrator for a large Cassandra deployment, then you are probably curious about how your cluster is doing. In fact, your job probably depends on it! So how can you combine a great service like Monitis with Cassandra to make sure your cluster is buzzing along smoothly?
We have done a little bit of the work for you and created an open source Monitis-Cassandra project that can help you monitor your Cassandra clusters in style. Let’s get started. First, you need to grab the code:
git clone git://github.com/monitisexchange/Monitis-Linux-Scripts.git
Next, open the file “settings.py” and insert your API key and secret (read up on the excellent Monitis API [here]). Then add your column families to the column families list, and you are ready to create your custom Monitis monitors by running the following command:
After the monitors are set up, you can start sending data by running:
Set that to run on cron or any other scheduling agent, and you have yourself a fully functioning monitoring system for your Cassandra cluster! Check back in to the Monitis Web site to see all your fresh Cassandra metrics rolling in.
You can see how many nodes are currently running and how many are currently active. This is important, and you might want to set up a Monitis alert to let you know whenever one of your nodes goes down. (Keep in mind that Cassandra should still be running fine with a single node failure, but it is definitely important to stay in the know.)
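To make the node alert concrete, here is a minimal Python sketch of the rule you might configure. The function names and the `tolerated_failures` parameter are hypothetical and are not part of the Monitis-Cassandra project; the sketch only assumes you have the two node counts described above.

```python
def nodes_down(total_nodes: int, active_nodes: int) -> int:
    """Return how many nodes are running in the cluster but not active."""
    if total_nodes < 0 or active_nodes < 0:
        raise ValueError("node counts must be non-negative")
    if active_nodes > total_nodes:
        raise ValueError("active nodes cannot exceed total nodes")
    return total_nodes - active_nodes


def should_alert(total_nodes: int, active_nodes: int,
                 tolerated_failures: int = 0) -> bool:
    """Alert when more nodes are down than we are willing to tolerate.

    tolerated_failures is a hypothetical knob: 0 means "tell me about
    every failure", which matches the advice to stay in the know.
    """
    return nodes_down(total_nodes, active_nodes) > tolerated_failures
```

With the default of zero tolerated failures, a six-node cluster reporting five active nodes would trigger the alert, even though Cassandra itself should keep serving requests.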
Since Cassandra is a Java application, you can also get information about the size of the heap space. You might want to set an alert to go off if your Cassandra node is running out of memory, because once it does it will start paging to disk. Now there is nothing wrong with paging to disk if that is what you have to do, but keep in mind that it can be a significant performance hit.
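A heap alert usually boils down to comparing used heap against the maximum heap. The sketch below illustrates that comparison; the 85% threshold and the function names are assumptions chosen for illustration, not values taken from the Monitis-Cassandra project.

```python
def heap_usage_ratio(used_bytes: int, max_bytes: int) -> float:
    """Return the fraction of the maximum heap that is currently in use."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    if used_bytes < 0:
        raise ValueError("used_bytes must be non-negative")
    return used_bytes / max_bytes


def heap_alert(used_bytes: int, max_bytes: int,
               threshold: float = 0.85) -> bool:
    """Return True when heap usage reaches the (assumed) alert threshold."""
    return heap_usage_ratio(used_bytes, max_bytes) >= threshold


GIB = 1024 ** 3
# A node using 7 GiB of an 8 GiB heap is at 87.5%, above the 85% threshold.
print(heap_alert(7 * GIB, 8 * GIB))
```

Picking a threshold below 100% gives you time to react before the node actually runs out of memory.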
Finally, there are some really fine-grained per-column-family metrics that come in through this open source tool. You can get read and write latency for each of your column families! This is great! If your read latency is moving above 200ms, then set an alert in Monitis to stay on top of the problem. You can set alerts on the write latency as well. Fine-grained monitoring of individual column families is a great way to make sure that your cluster is operating at peak performance.
Also, if you are Netflix, then you may want to set an alert in case your performance ever drops below [1 million writes per second](http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html)! Until next time.
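The 200ms rule of thumb can be sketched as a small Python check over per-column-family latencies. The dictionary layout, the function name, and the column family names below are all hypothetical; the tool's real data format may differ.

```python
from typing import Dict, List

READ_LATENCY_LIMIT_MS = 200.0  # threshold suggested in the text above


def slow_column_families(read_latency_ms: Dict[str, float],
                         limit_ms: float = READ_LATENCY_LIMIT_MS) -> List[str]:
    """Return the names of column families whose read latency exceeds the limit.

    read_latency_ms maps a column family name to its latest read latency
    in milliseconds (an assumed layout, for illustration only).
    """
    return sorted(name for name, latency in read_latency_ms.items()
                  if latency > limit_ms)


# Made-up sample values: only "Events" is above the 200ms limit.
sample = {"Users": 35.0, "Events": 250.5, "Sessions": 200.0}
print(slow_column_families(sample))
```

The same function works for write latency if you pass in write measurements and a limit that suits your workload.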