Over the last couple of months, Monitis has published a series of blog posts offering guidance on picking the NoSQL database that best meets your company’s needs. In our previous posts, we offered a comprehensive overview of why NoSQL technology is important and how it compares with Relational Database Management Systems (RDBMSs).
We have also reviewed various products, in the hope that this information will make it easier to choose among NoSQL databases such as Apache Cassandra, MongoDB, CouchDB, Redis, Riak, HBase and others.
In this, our last post on the subject, let’s take a look at two more popular brands: Apache’s CouchDB and Membase.
If you’re looking for a database that is similar to Lotus Notes, CouchDB is it. That’s no surprise, because its creator, Damien Katz, worked on Lotus Notes at IBM before starting this project in 2005 to build a web-based database. (It became associated with Apache in 2008.)
Here is some more detail about CouchDB that may be useful to you:
- Orientation: Document
- Implementation language: Erlang
- Distributed: Yes. Data can be read and updated by users and by the server while disconnected, and any changes can later be replicated bi-directionally.
- Schema: Good news — none required. CouchDB documents are stored using JSON, and each document is assigned a unique ID.
- Client: CouchDB exposes a RESTful JSON API, which means users can access it from any language capable of making HTTP requests.
- CAP: CouchDB favors availability and partition tolerance (AP) and is eventually consistent; replication is used to synchronize multiple copies of data on different nodes.
- Production stage: CouchDB reached its 1.0 release in July 2010 and is used in production in a variety of social websites and software applications.
- Additional features: CouchDB supports MapReduce, incremental replication, and fault-tolerance. A bonus feature is a web console.
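To make the "any language capable of making HTTP requests" point concrete, here is a minimal Python sketch that builds the request for storing a JSON document under a chosen ID. The server address (a local instance on port 5984), the database name `blog`, and the helper names are assumptions made for illustration; the request is only sent if you call `send` against a running server.

```python
import json
from urllib.request import Request, urlopen

# Assumption: a CouchDB instance listening locally on port 5984.
COUCH_URL = "http://localhost:5984"


def build_put_document(db: str, doc_id: str, doc: dict) -> Request:
    """Build (but do not send) the HTTP request that stores a JSON document."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        url=f"{COUCH_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request: Request) -> dict:
    """Send a request and decode the JSON reply; needs a running server."""
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # "blog" and "post-1" are hypothetical database and document names.
    req = build_put_document("blog", "post-1", {"title": "Hello", "tags": ["nosql"]})
    print(req.get_method(), req.full_url)
    # send(req) would perform the write against a live CouchDB instance.
```

Because the document is plain JSON and the ID is part of the URL, no schema or driver is involved; the same request could be issued from any HTTP client.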
Membase is an open-source, distributed technology that was first created in January 2009 and released in 2010. (As of this writing, Membase, the company, has merged with CouchOne, the company behind CouchDB.) Membase builds on Memcached, a widely deployed distributed data caching technology, and is also offered in a commercial version. Membase’s reason for existence is really quite simple: the tool adds a key-value store to replace the RDBMS in use today for the overwhelming majority of web applications.
Membase is available in a community edition and an Enterprise Edition (for businesses). The two versions differ technically only in the number of nodes they support: the open-source community version has licensing restrictions (clusters are limited to two nodes).
Here is some more detail about Membase:
- Orientation: Key/Value store (based on Memcached)
- Implementation language: C, C++ (data manager); Erlang/OTP (cluster manager)
- Replication: Replication (sharding) is set at the virtual bucket (vBucket) level. Membase currently supports up to 4096 vBuckets, each of which owns a subset of the key space. Membase hashes keys to “buckets”, not servers, and the number of buckets remains fixed. Buckets are a way to divide the entire dataset into equal chunks.
- Schema: Membase doesn’t enforce a specific schema; the application can store any “value” it likes against a specific key.
- Client: An embedded CLI utility that leverages the REST interface.
- CAP: Membase is a CP-type system.
- Customers using Membase: Comcast, Mochi Media.
- Extra features: Membase is generally memory-oriented; however, the data held in memory at any given point in time is only the hot data. Membase persists data to disk asynchronously, based on rules in the system.
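The idea of hashing keys to a fixed number of buckets rather than directly to servers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: CRC32 is used as a stand-in hash, the round-robin assignment of buckets to servers is a hypothetical policy, and none of the function names come from Membase itself.

```python
import zlib
from typing import List

# The upper bound quoted above; the bucket count stays fixed.
NUM_VBUCKETS = 4096


def vbucket_for(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Hash a key to a vBucket id (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(servers: List[str], num_vbuckets: int = NUM_VBUCKETS) -> List[str]:
    """Assign every vBucket to a server, round-robin (hypothetical policy)."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def server_for(key: str, vbucket_map: List[str]) -> str:
    """Look up the server that currently owns the key's vBucket."""
    return vbucket_map[vbucket_for(key, len(vbucket_map))]


if __name__ == "__main__":
    two_nodes = build_vbucket_map(["node-a", "node-b"])
    three_nodes = build_vbucket_map(["node-a", "node-b", "node-c"])
    key = "user:42"
    # The vBucket id depends only on the key, never on the server list.
    print(vbucket_for(key), server_for(key, two_nodes), server_for(key, three_nodes))
```

The design point is the extra level of indirection: when servers are added or removed, only the bucket-to-server map changes, while every key keeps the same bucket.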
Now that we have reviewed the major NoSQL systems, our judgment is that all of them are useful, but sysadmins should investigate each candidate in more detail before making a final decision. For now, our tentative view is that Apache Cassandra is probably preferable.
In any case, the final decision should take the user’s perspective into account. For example, you might ask about the following details:
|Important Functional Features||Questions|
|Prepared Statements||Is there support for server-side prepared statements?|
|Built-in functionality||Is there embedded functionality (like sort, search, etc.)?|
|Monitoring and analysis||Are there embedded modules for monitoring and analysis?|
|Map/Reduce support||Does the product support the Map/Reduce functionality? How well?|
|Auto-compacting||Does the product support auto-compacting of data (garbage collection)?|
|Caching||Does the product support a caching mechanism?|
|Backup/Restore||Does the product support backup/restore functionality?|
Consider these factors, too:
|Online support||Online support lets you get faster responses and answers to your questions.|
|Development community||High activity is important because it shows that the product is widely used and that answers to many questions can be found without much effort.|
|Easy learning curve||How easy is it to learn the main features and peculiarities of the product, and how much time is required to start using it?|
No matter which NoSQL database you choose, remember that, like any critical application that keeps your business running smoothly, it should be independently monitored. When you keep an eye on your database server 24/7, you are informed immediately if something goes wrong with its functionality or if users can’t access it.
Good luck, and happy NoSQL hunting!