In Part 3 of this series we turned our attention to NoSQL as another alternative to the traditional relational database, one which, like Hadoop, provides a powerful solution for accessing and managing Big Data insights. Let’s move forward with our discussion by taking a deeper look at how most NoSQL systems work.
There are different kinds of NoSQL databases – Key-Value Pairs, Column Family Stores, Graph Databases, Document Databases – but Key-Value is the best known.
The Key-Value Pair (KVP) framework is one in which each record has a primary key and a collection of values (bins) associated with that record. The sample chart below shows how key-value pairs work:
Key      Value
Title    The Brown Dog
Here you see a ‘key’ in the left column and a ‘value’ in the right . . . and notice the value can be a string, an integer, or the like. Most KVP stores allow you to put any object on the right, because it’s just a value. Under this schema there is always a unique key for each object that needs to be returned. Querying the database for that unique key returns the result from whichever node holds the object.
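To make the lookup idea concrete, here is a minimal sketch in Python. It is not how any particular NoSQL product is implemented: the class name `TinyKVStore`, the three in-memory "nodes", and the hash-based placement are all assumptions made for illustration, and the `Pages` entry is made-up sample data.

```python
import hashlib


class TinyKVStore:
    """Toy key-value store spread across a fixed number of in-memory 'nodes'."""

    def __init__(self, node_count=3):
        # Each node is just a dict here; a real system would use separate machines.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so that the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def put(self, key, value):
        # The value can be any object: a string, an integer, a list, and so on.
        self.nodes[self._node_for(key)][key] = value

    def get(self, key, default=None):
        # Only the node that owns the key is consulted.
        return self.nodes[self._node_for(key)].get(key, default)


store = TinyKVStore()
store.put("Title", "The Brown Dog")
store.put("Pages", 212)  # hypothetical integer value, not from the chart

print(store.get("Title"))  # The Brown Dog
```

The point of the sketch is that the caller only ever supplies the key; working out which node holds the object is the store's job.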
Since the sample chart above is quite simple, let’s look at a more involved example of the KVP relationship:
Key                Value
user1923_height    6′ 0″
As you can see, in this case the key is built from the user’s unique number, an underscore, and then the name of the attribute being stored. Again, this is a fairly simple variation, but hopefully it is clear that as long as you define the key on the left and format it consistently, you can extract the corresponding value.
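The key convention described above can be sketched in a few lines of Python. The helper names `make_key` and `parse_key` are hypothetical, and the plain dict simply stands in for a real key-value store.

```python
def make_key(user_id, attribute):
    """Build a key in the form user<id>_<attribute>, e.g. user1923_height."""
    return f"user{user_id}_{attribute}"


def parse_key(key):
    """Split a user<id>_<attribute> key back into (user_id, attribute)."""
    prefix, _, attribute = key.partition("_")
    if not prefix.startswith("user") or not attribute:
        raise ValueError(f"unexpected key format: {key!r}")
    # int() raises ValueError as well if the id part is not numeric.
    return int(prefix[len("user"):]), attribute


# A plain dict stands in for the key-value store in this sketch.
profile = {make_key(1923, "height"): "6′ 0″"}

print(profile[make_key(1923, "height")])  # 6′ 0″
print(parse_key("user1923_height"))       # (1923, 'height')
```

Because every writer and reader goes through the same two helpers, the key format stays consistent, which is what makes the value retrievable later.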
Key-value pairs can also become more complex, as the next example shows:
Key                Value
user1923_height    6′ 0″
error_msg_457      There is no file %1 here
error_message_1    There is no user with %1 name
user1923_name      Jim Smith
(Key value pair examples taken from http://dba.stackexchange.com/questions/607/what-is-a-key-value-store-database)
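As a rough illustration of why consistent key formatting matters in a mixed store like this one, the sketch below loads the four pairs into a Python dict and collects entries by key prefix. The `with_prefix` helper is hypothetical, and scanning every key is only for demonstration; many real key-value stores only look values up by their exact key.

```python
# The four pairs from the example, held in a dict that stands in for the store.
pairs = {
    "user1923_height": "6′ 0″",
    "error_msg_457": "There is no file %1 here",
    "error_message_1": "There is no user with %1 name",
    "user1923_name": "Jim Smith",
}


def with_prefix(store, prefix):
    """Return the entries whose key starts with the given prefix, sorted by key."""
    return {k: v for k, v in sorted(store.items()) if k.startswith(prefix)}


# Everything stored for user 1923 shares the same key prefix.
user_record = with_prefix(pairs, "user1923_")
print(user_record)

# The two error keys use different prefixes, so only one of them matches.
print(list(with_prefix(pairs, "error_msg_")))  # ['error_msg_457']
```

The second lookup misses `error_message_1` because its key does not follow the `error_msg_` pattern, which is exactly the kind of inconsistency a fixed key format is meant to prevent.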
Wrapping it up
While most enterprise data warehouses today are still based on the traditional structured model, the realities of Big Data have led architects to consider alternative approaches to data management. Like Hadoop, the NoSQL database is another variation on this theme. As discussed, one of the most common approaches to NoSQL uses Key-Value Pairs (KVP). In this architecture, each record has a primary key and a collection of values (bins) associated with that record. NoSQL databases are optimized primarily for retrieve and append operations, which makes them best suited for web applications that need large data stores with simple, read-heavy access patterns.
The NoSQL ecosystem has grown significantly over the last several years as venture capital has flowed in and job listings and inquiries have increased. While many NoSQL databases still lack mature management and monitoring tools, the situation is improving as the Big Data market continues to surge ahead. Some of the top NoSQL databases on the market today include HBase, MarkLogic, Cassandra, MongoDB, CouchDB, and DynamoDB.
Hopefully, this series on Hadoop and NoSQL has offered some clarity about what these tools are and how they provide important solutions for handling massive amounts of real-time data. As we’ve learned, Big Data is an opportunity to turn more data into deeper insights on a scale rarely seen before. Big Data is the new asset in the global digital universe, and mining this asset means gaining access to, and familiarity with, the best tools on the market today. To be a successful business owner you’ll want to make sure that you have a Big Data strategy in place . . . and that Hadoop and NoSQL are part of the solution.