How to get started with NoSQL

Experts say that the world’s data is doubling every two years. This epic increase in Big Data has highlighted the limitations of relying on traditional forms of data storage and management, and has focused attention on new methods for addressing the volume, variety, and veracity of structured and unstructured data. In these discussions, one term you’ve undoubtedly heard lots of buzz around is “NoSQL.” The market is a formidable one, forecast to reach $3.4 billion in 2020, which represents a compound annual growth rate (CAGR) of 21% for the period 2015 to 2020.

NoSQL is a technical term that means either “No SQL” or “Not only SQL” and represents a paradigm shift away from sole reliance on relational databases. Relational database management systems (RDBMS) emerged in the 1970s and store data in a set of tables that can be queried and joined using languages like SQL. These database architectures are ‘structured,’ meaning that the data is organized in a uniform format that varies little over time.

The premise behind many NoSQL systems is the key-value pair (KVP), a model in which each record has a primary key and a collection of values (sometimes called bins) associated with that record. The sample chart below shows how key-value pairs work:

image

Here you see a ‘key’ in the left column and a ‘value’ in the right column; notice that the value can be a string, an integer, or the like. Most key-value stores allow you to store any object on the right, because to the store it’s just a value. Under this scheme, every object that needs to be returned has a unique key. Querying the database for that key returns the object from whichever node holds it.
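To make the idea concrete, here is a minimal Python sketch of a key-value store that spreads its keys over several nodes. It is only an illustration: the class name `TinyKeyValueStore`, the node names, and the sample keys are all made up for this example, and real products use far more sophisticated placement and replication schemes.

```python
import hashlib


class TinyKeyValueStore:
    """Toy key-value store that spreads keys across a fixed set of nodes."""

    def __init__(self, node_names):
        # Each "node" is just a dict here; a real cluster would use separate servers.
        self._names = sorted(node_names)
        self.nodes = {name: {} for name in self._names}

    def _node_for(self, key):
        # Hash the key so that the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self._names[int(digest, 16) % len(self._names)]

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        # Returns None when the key has never been stored.
        return self.nodes[self._node_for(key)].get(key)


# Hypothetical sample data: the value can be a string, an integer, or an object.
store = TinyKeyValueStore(["node-a", "node-b", "node-c"])
store.put("user:1001", "Ada Lovelace")
store.put("user:1001:visits", 42)
store.put("user:1001:profile", {"plan": "free", "tags": ["beta"]})

print(store.get("user:1001:visits"))
```

The caller never needs to know which node holds a record; the key alone is enough to find it.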

There are any number of applications for NoSQL data storage and processing solutions today, ranging from user profile stores to ecommerce sites to mobile applications. The ability to deliver high-volume, high-variety online applications for a fraction of what traditional methods cost is also one of the reasons why NoSQL technologies are an appealing solution for smaller organizations with limited budgets.

With this brief overview in mind, we can now begin to explore some of the top NoSQL solutions on the market today. Below we’ve compiled a list that startups and small businesses should seriously look into if they wish to start implementing NoSQL solutions and earning maximum ROI.

MongoLab

Chances are good that you’ve heard of MongoDB. Named after the word “humongous”, MongoDB is an open-source document database that has been around since 2007 and has since gained a reputation as the world’s most popular NoSQL database. MongoLab is a MongoDB startup that began in 2011 as a fully managed cloud database service, hosting MongoDB databases on the cloud providers Amazon, Google, Joyent, Rackspace and Windows Azure. The value of MongoLab is that it has combined the right business (MongoDB) and the right delivery model (PaaS) in a manner that has really appealed to developers in the NoSQL community.

The extensibility and scalability of the MongoDB and MongoLab offerings are especially impressive as measured by their growing footprint in the data storage market. MongoDB Inc., in collaboration with Microsoft and MongoLab, announced last summer a fully-managed MongoDB-as-a-Service Add-On offering on the Microsoft Azure store.

MarkLogic

MarkLogic is a Silicon Valley-based private software firm founded in 2001 that builds what it calls “the Only Enterprise NoSQL Database” on the market. Starting with its roots in XML databases, the firm has leveraged over a decade of experience in developing solutions for unstructured data, leading it to embrace the “enterprise NoSQL” label. The experience has paid off and the firm is well-funded with a total of $73.6 Million received in 6 rounds from 5 investors – the most recent being over $25 Million in April 2013.

MarkLogic defines its solution as “a document-centric, transactional, search-centric, structure-aware, schema-agnostic, programmatic, high-performance, clustered, database server,” and has received considerable accolades throughout the industry. In 2013 MarkLogic released a new semantics platform called MarkLogic 7, which can store billions of RDF triples that can be queried with SPARQL to provide richer, deeper insight into your data in ways not possible with NoSQL or relational models.
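If RDF triples are new to you, the short Python sketch below shows the general idea: every fact is stored as a (subject, predicate, object) triple, and a query is a pattern with variables in it. This is not MarkLogic's API or a SPARQL engine; the `match` function and the sample facts are invented here purely for illustration.

```python
# Each fact is a (subject, predicate, object) triple. Sample data is illustrative.
triples = [
    ("MarkLogic", "type", "Database"),
    ("MarkLogic", "foundedIn", "2001"),
    ("MongoDB", "type", "Database"),
    ("MongoDB", "storesDataAs", "Documents"),
]


def match(pattern, facts):
    """Return the variable bindings for every triple that fits the pattern.

    Terms starting with '?' are variables; any other term must match exactly.
    """
    results = []
    for fact in facts:
        bindings = {}
        for term, value in zip(pattern, fact):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No term failed to match, so keep this triple's bindings.
            results.append(bindings)
    return results


# Roughly the spirit of a SPARQL query asking "which things are databases?"
databases = match(("?name", "type", "Database"), triples)
print(databases)
```

A real triple store adds indexes, joins across several patterns, and inference, but the pattern-with-variables style of querying is the same basic shape.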

Couchbase

A recent funding round of $60 million makes Couchbase one of the emerging and newly popular NoSQL databases on the market today, rivaling MongoDB. One key area where Couchbase has innovated is in providing streamlined scalability and performance for interactive applications at the intersection of what it calls “three interrelated megatrends – Big Data, Big Users, and Cloud Computing.” Couchbase has also pushed data management for mobile offerings by enabling users to easily synchronize data between mobile devices and the cloud.

Couchbase’s open source technology is available in two versions: a Community Edition that comes without recent bug fixes, and the stable Enterprise Edition for commercial use. Couchbase builds are available for Ubuntu, Red Hat, Windows and Mac OS X platforms.

CouchDB

CouchDB is an open source Apache NoSQL database that focuses on ease of use and on being “a database that completely embraces the web.” CouchDB has achieved popularity in the NoSQL community by leveraging the latest web technologies to create easy data storage and access. For example, with CouchDB you can store your data as JSON documents, and access your documents and query your indexes with your web browser, via HTTP. You can also index, combine, and transform your documents with JavaScript. These features have made CouchDB particularly well-suited to modern web and mobile app development. It’s even possible to serve up web apps directly out of CouchDB.
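Because documents are plain JSON sent over HTTP, storing one takes very little code. The Python sketch below builds (but deliberately does not send) the kind of HTTP PUT request you would use to save a document. The database name `customers`, the document ID, and the document contents are hypothetical, and the URL assumes a local server on CouchDB's default port of 5984.

```python
import json
import urllib.request

# Assumed values for illustration only.
BASE_URL = "http://localhost:5984"  # assumes a local server on the default port
DATABASE = "customers"              # hypothetical database name


def build_put_document(doc_id, document):
    """Build (but do not send) an HTTP PUT that would store one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{DATABASE}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_put_document("cust-001", {"name": "Ada", "city": "London"})
print(request.get_method(), request.full_url)

# To send it to a running server you would call:
# urllib.request.urlopen(request)
```

Reading the document back is a plain GET on the same URL, which is why a web browser is enough to inspect your data.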

CouchDB’s features are more easily accessible through its built-in administration web interface called Futon, which allows users to manage their databases, view and edit documents, compose and run MapReduce views, and trigger replication between databases.
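For readers who have not met MapReduce views before, the following Python sketch mimics the idea in memory. CouchDB views themselves are written in JavaScript, so treat this purely as an illustration of the map and reduce steps; the function names and the sample order documents are invented for the example.

```python
from collections import defaultdict

# Hypothetical JSON-style documents, as they might sit in a database.
documents = [
    {"_id": "order-1", "customer": "ada", "total": 30},
    {"_id": "order-2", "customer": "bob", "total": 12},
    {"_id": "order-3", "customer": "ada", "total": 8},
]


def map_order(doc):
    """Map step: emit (key, value) pairs for one document."""
    if "customer" in doc:
        yield doc["customer"], doc["total"]


def reduce_sum(values):
    """Reduce step: collapse all values emitted under one key."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Apply the map step to every document, then reduce the values per key."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Present the results ordered by key.
    return {key: reduce_fn(grouped[key]) for key in sorted(grouped)}


totals = run_view(documents, map_order, reduce_sum)
print(totals)
```

The map step decides what each document contributes, and the reduce step summarizes those contributions, here as a running order total per customer.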

DynamoDB

Amazon’s DynamoDB is a fully proprietary NoSQL database service that leverages Amazon’s immense cloud-computing infrastructure. DynamoDB is the culmination of Amazon’s 15 years of experience in building non-relational databases for its own internal needs, and represents the cloud-based version of this technology designed for external customers.

With DynamoDB all you have to do is create the database table and the service does the rest. As you scale up there is no need for hardware or software provisioning, setup and configuration, software patching, operation of a distributed database cluster, or partitioning of data over multiple instances. DynamoDB is unique in that it works on the principle of “throughput” rather than storage. Based on this model, the Amazon service will ensure that DynamoDB allocates the machine resources to meet your throughput needs along with the guarantee of consistent, low-latency performance.
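To get a feel for what planning by throughput looks like, here is a rough Python sketch of the arithmetic. It assumes the unit sizes from AWS's published provisioned-capacity definitions (one read unit per strongly consistent read of up to 4 KB per second, one write unit per write of up to 1 KB per second); check the current documentation before relying on them. The function names and the workload figures are hypothetical.

```python
import math

# Assumed unit sizes, based on AWS's published provisioned-capacity definitions.
READ_UNIT_BYTES = 4 * 1024   # one read unit covers an item of up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write unit covers an item of up to 1 KB


def read_capacity_units(item_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the read units needed for a steady read rate."""
    units_per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    total = units_per_read * reads_per_second
    # Eventually consistent reads are assumed to cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_bytes, writes_per_second):
    """Estimate the write units needed for a steady write rate."""
    return math.ceil(item_bytes / WRITE_UNIT_BYTES) * writes_per_second


# Hypothetical workload: 6 KB items, 100 reads and 20 writes per second.
rcu = read_capacity_units(6 * 1024, 100)
wcu = write_capacity_units(6 * 1024, 20)
print(rcu, wcu)
```

In this example a 6 KB item needs two read units per read and six write units per write, so the hypothetical workload comes to 200 read units and 120 write units.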

With the massive growth expected in the Internet of Things market and the enormous Big Data sets this will produce, startups and small businesses would be well advised to start looking seriously at real world NoSQL use cases and taking measures to adopt the latest benefits of this technology. The alternative will mean getting left behind in the market and swallowed up by your competition. So go ahead and get onboard with NoSQL today; you’ll be glad you did!

 
