Big Data Tools that You Need to Know About – Hadoop & NoSQL – Part 4


In Part 3 of this series we turned our attention to NoSQL as another alternative to the traditional relational database, one that, like Hadoop, provides a powerful solution for accessing and managing Big Data insights. Let’s move forward with our discussion by taking a deeper look at how most NoSQL systems work.


Key-Value Pairs are the ‘Key’




There are different kinds of NoSQL databases (Key-Value Pairs, Column Family Stores, Graph Databases, Document Databases), but Key-Value is the most well-known.


The Key-Value Pair (KVP) framework is one in which each record has a primary key and a collection of values (bins) associated with that record. The way key-value pairs work can be seen in the sample chart below:


Color        Red

Age          18

Size         Large

Name         Smith

Title        The Brown Dog


Here you see a ‘key’ in the left column and a ‘value’ in the right. Notice that the value can be a string, an integer, or the like. Most KVP stores allow you to store any object on the right, because it’s just a value. Under this scheme, every object that needs to be returned has its own unique key. Querying the database for that key returns the object from whichever node holds it.
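To make the lookup idea concrete, here is a minimal Python sketch of a key-value store spread across a few nodes. The class name `TinyKVStore`, the node count, and the hash-then-modulo placement are all assumptions made for illustration; they are not taken from any particular NoSQL product, and real systems use more sophisticated placement schemes.

```python
import hashlib


class TinyKVStore:
    """Hypothetical in-memory key-value store split across several 'nodes'."""

    def __init__(self, node_count=3):
        # Each node is just a dict here; a real system would use separate servers.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so the same key always maps to the same node.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def put(self, key, value):
        # The value can be any object: a string, an integer, a list, and so on.
        self.nodes[self._node_for(key)][key] = value

    def get(self, key, default=None):
        # Only the node that owns the key is consulted.
        return self.nodes[self._node_for(key)].get(key, default)


store = TinyKVStore()
store.put("Color", "Red")
store.put("Age", 18)
store.put("Title", "The Brown Dog")

print(store.get("Color"))
print(store.get("Age"))
```

Because the key alone decides which node is consulted, a lookup never has to search the other nodes. That is the property the paragraph above describes.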


Since the sample chart above is quite simple, let’s look at a more involved example of the KVP relationship:


user1923_color    Red

user1923_age      18

user3371_color    Blue

user4344_color    Brackish

user1923_height   6′ 0″

user3371_age      34


As you can see, in this case the key is built from the user’s unique number, an underscore, and then the attribute name. Again, this is a fairly simple variation, but hopefully what is clear is that as long as you define the key on the left and format it consistently, it’s possible to extract the corresponding value.
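The key convention described here can be sketched in a few lines of Python. The helper names `make_key` and `parse_key` are hypothetical, and a plain dictionary stands in for the database:

```python
def make_key(user_id, attribute):
    # Build a key in the form user<number>_<attribute>, e.g. user1923_color.
    return f"user{user_id}_{attribute}"


def parse_key(key):
    # Split on the first underscore only, then strip the "user" prefix.
    prefix, attribute = key.split("_", 1)
    return int(prefix[len("user"):]), attribute


# A plain dict stands in for the key-value store.
store = {
    make_key(1923, "color"): "Red",
    make_key(1923, "age"): 18,
    make_key(3371, "color"): "Blue",
    make_key(4344, "color"): "Brackish",
    make_key(1923, "height"): "6′ 0″",
    make_key(3371, "age"): 34,
}

print(store[make_key(3371, "age")])
print(parse_key("user1923_height"))
```

As long as every writer and reader goes through the same key-building function, the value for any user and attribute can be fetched with a single lookup.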


The key value pairs can also become more complex as seen in the next example:


app_setting_width      450

user1923_color         Red

user1923_age           18

user3371_color         Blue

user4344_color         Brackish

user1923_height        6′ 0″

user3371_age           34

error_msg_457          There is no file %1 here

error_message_1        There is no user with %1 name

1923_name              Jim

user1923_name          Jim Smith

user1923_lname         Smith

Application_Installed  true

log_errors             1

install_path           C:\Windows\System32\Restricted

ServerName             localhost

test                   test

test1                  test

test123                Brackish

value                  key


(Key value pair examples taken from
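The larger example also shows why consistent key formatting matters once unrelated entries share one flat namespace. The sketch below uses a subset of the keys from the chart and a hypothetical `scan_prefix` helper over a plain Python dictionary. Note that not every key-value store offers prefix scans, so treat this as an illustration rather than a feature you can rely on:

```python
# A subset of the chart above, held in a plain dict for illustration.
store = {
    "app_setting_width": 450,
    "user1923_color": "Red",
    "user1923_age": 18,
    "user3371_color": "Blue",
    "user1923_height": "6′ 0″",
    "1923_name": "Jim",
    "user1923_name": "Jim Smith",
    "user1923_lname": "Smith",
}


def scan_prefix(data, prefix):
    # Return every entry whose key starts with the given prefix.
    return {key: value for key, value in data.items() if key.startswith(prefix)}


user_1923 = scan_prefix(store, "user1923_")

for key in sorted(user_1923):
    print(key, user_1923[key])
```

The entry stored under `1923_name` does not follow the `user<number>_<attribute>` pattern, so it is missed when collecting everything that belongs to user 1923. Nothing in the store enforces the convention; it is up to the application to apply it every time.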


Wrapping it up


While most enterprise data warehouses today are still based on the traditional structured model, the realities of Big Data have led architects to consider alternative approaches to data management. Like Hadoop, the NoSQL database is another variation on this theme. As discussed, one of the most common approaches to NoSQL employs Key-Value Pairs (KVP). According to this architecture, each record has a primary key and a collection of values (bins) associated with that record. NoSQL databases are optimized primarily for retrieve and append operations, which makes them best suited for web applications that need large data stores with mostly-read workloads and simple lookups rather than complex queries.


The NoSQL ecosystem has grown significantly over the last several years as more venture capital has flowed in and job listings and inquiries have increased. While many NoSQL databases lack mature management and monitoring tools, the situation is improving as the Big Data market continues to surge ahead. Some of the top NoSQL databases on the market today include HBase, MarkLogic, Cassandra, MongoDB, CouchDB, and DynamoDB.


Hopefully, this series on Hadoop and NoSQL has offered some clarity about what these tools are and how they provide important solutions for handling massive amounts of real-time data. As we’ve learned, Big Data is a limitless opportunity to turn more data into deeper insights on a scale rarely seen before. Big Data is the new asset in the global digital universe, and mining this asset means gaining access to, and familiarity with, the best tools on the market today. To be a successful business owner you’ll want to make sure that you have a Big Data strategy in place . . . and that Hadoop and NoSQL are part of the solution.


