NoSQL Data Stores

Relational databases store the vast majority of persistent data for web applications. However, there are several alternative categories of storage representation.

  1. Key-value pair
  2. Document-oriented
  3. Column-family table
  4. Graph

These persistent data storage representations are commonly used to augment, rather than completely replace, relational databases. The storage model a NoSQL database uses often gives it different performance characteristics than a relational database: better results on some types of reads and writes, and worse performance on others.

Key-value Pair

Key-value pair data stores are based on hash map data structures.
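As a rough illustration of the hash map idea, the sketch below uses a plain Python dict as a stand-in for a key-value pair data store. The TinyKeyValueStore class and its method names are hypothetical and are not the API of any real data store; they only show setting, getting, and deleting values by key, with an optional expiry time of the kind often used for cached or session data.

```python
import time


class TinyKeyValueStore:
    """Hypothetical in-memory key-value store backed by a dict (a hash map)."""

    def __init__(self):
        # Maps key -> (value, expiry timestamp or None).
        self._data = {}

    def set(self, key, value, ttl_seconds=None):
        expires_at = None
        if ttl_seconds is not None:
            expires_at = time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            # Expired entries are removed lazily, when they are next read.
            del self._data[key]
            return default
        return value

    def delete(self, key):
        # Returns True if the key existed, False otherwise.
        return self._data.pop(key, None) is not None


store = TinyKeyValueStore()
store.set("session:42", {"user_id": 7})
print(store.get("session:42"))
```

Every operation goes through a single key, which is what keeps lookups fast and also why this model is a poor fit for queries that need to search by value.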

Key-value pair data stores

  • Redis is an open source in-memory key-value pair data store. Redis is often called "the Swiss Army Knife of web application development." It can be used for caching, queuing, and storing session data for faster access than a traditional relational database, among many other use cases. Learn more on the Redis page.

  • Memcached is another widely used in-memory key-value pair storage system.

Key-value pair resources

Redis resources

Document-oriented

A document-oriented database provides a semi-structured representation for nested data.
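To make "semi-structured" and "nested" more concrete, here is a minimal sketch that models documents as Python dicts and looks up a nested field. The sample documents and the get_path and find helpers are made up for illustration; real document stores provide their own query languages and indexes.

```python
import json

# Hypothetical "user" documents: fields can nest and can vary between documents.
documents = [
    {
        "_id": 1,
        "name": "Ada",
        "address": {"city": "London", "country": "UK"},
        "tags": ["admin", "beta"],
    },
    {
        "_id": 2,
        "name": "Grace",
        "address": {"city": "New York"},  # no "country" field in this one
    },
]


def get_path(document, dotted_path, default=None):
    """Look up a nested field such as "address.city" in a dict."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def find(docs, dotted_path, value):
    """Return the documents whose nested field equals the given value."""
    return [doc for doc in docs if get_path(doc, dotted_path) == value]


matches = find(documents, "address.city", "London")
print(json.dumps(matches, indent=2))
```

Note that the second document is missing a field the first one has. A relational table would need a nullable column or a schema change for that, while a document store simply accepts both shapes.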

Document-oriented data stores

  • MongoDB is an open source document-oriented data store with a Binary JSON (BSON) storage format that is JSON-style and familiar to web developers. PyMongo is a commonly used client for interfacing with one or more MongoDB instances through Python code. MongoEngine is a Python object-document mapper (similar in spirit to an ORM) written specifically for MongoDB and built on top of PyMongo.

  • Riak is an open source distributed data store focused on availability, fault tolerance and large scale deployments.

  • Apache CouchDB is also an open source project. Its focus is on RESTful-style HTTP access for working with stored JSON data.

Document-oriented data store resources

Column-family table

The column-family table class of NoSQL data stores builds on the key-value pair type. Each key-value pair is considered a row in the store, while the column family is similar to a table in the relational database model.
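One way to picture this layout is as nested hash maps: a column family maps each row key to that row's own set of columns. The ColumnFamily class below is a hypothetical, in-memory sketch of that idea only; it does not reflect how any particular column-family data store is implemented or queried.

```python
from collections import defaultdict


class ColumnFamily:
    """Hypothetical column family: row key -> {column name: value}."""

    def __init__(self, name):
        self.name = name
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        self._rows[row_key][column] = value

    def get_row(self, row_key):
        # Return a copy; .get() avoids creating an empty row on a missing key.
        return dict(self._rows.get(row_key, {}))

    def get_column(self, row_key, column, default=None):
        return self._rows.get(row_key, {}).get(column, default)


users = ColumnFamily("users")
users.put("user:1", "name", "Ada")
users.put("user:1", "email", "ada@example.com")
users.put("user:2", "name", "Grace")  # rows need not share the same columns

print(users.get_row("user:1"))
```

Unlike a relational table, each row here carries only the columns that were actually written to it.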

Column-family table data stores

Graph

A graph database represents and stores data in three aspects: nodes, edges and properties.

A node is an entity, such as a person or business.

An edge is the relationship between two entities. For example, an edge could represent that a node for a person entity is an employee of a business entity.

A property represents information about a node. For example, a node representing a person could have a gender property with a value of "female" or "male".
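The three aspects above can be sketched with a couple of small Python dataclasses. The Node and Edge classes, the sample data, and the neighbors helper are all hypothetical and exist only to show how nodes, edges, and properties relate; graph databases expose their own query languages for traversals like this.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str  # node_id of the node the edge starts from
    relationship: str
    target: str  # node_id of the node the edge points to


nodes = {
    "p1": Node("p1", "Person", {"name": "Ada"}),
    "b1": Node("b1", "Business", {"name": "Example Corp"}),
}
edges = [Edge("p1", "EMPLOYEE_OF", "b1")]


def neighbors(node_id, relationship):
    """Return nodes reachable from node_id over edges of one relationship type."""
    return [
        nodes[edge.target]
        for edge in edges
        if edge.source == node_id and edge.relationship == relationship
    ]


employers = neighbors("p1", "EMPLOYEE_OF")
print([node.properties["name"] for node in employers])
```

Here the person node is connected to the business node by an "employee of" edge, matching the example in the text.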

Graph data stores

  • Neo4j is one of the most widely used graph databases and runs on the Java Virtual Machine (JVM).

  • Cayley is an open source graph data store from Google that is written primarily in Go.

  • Titan is a distributed graph database built for multi-node clusters.

Graph data store resources

NoSQL third-party services

  • Compose provides MongoDB as a service. It's easy to set up with either a standard LAMP stack or on Heroku.

NoSQL data store resources

  • NoSQL databases: an overview explains what NoSQL means, how data is stored differently than in relational systems and what the Consistency, Availability and Partition-Tolerance (CAP) Theorem means.

  • NoSQL Explained is a good high-level overview of considerations and features when choosing a type of NoSQL database compared to a relational database.

  • CAP Theorem overview presents the basic constraints all databases must trade off in operation.

  • The CAP Theorem series explains concepts related to NoSQL such as what is ACID compared to CAP, CP versus CA and high availability in large scale deployments.

  • NoSQL Weekly is a free curated email newsletter that aggregates articles, tutorials, and videos about non-relational data stores.

  • NoSQL comparison is a large list of popular, BigTable-based, special purpose, and other datastores with attributes and the best use cases for each one.

  • Relational databases such as MySQL and PostgreSQL have added features in more recent versions that mimic some of the capabilities of NoSQL data stores. For example, check out this blog post on storing JSON data in PostgreSQL.

NoSQL data stores learning checklist

  1. Understand why NoSQL data stores are better for some use cases than relational databases. In general these benefits are only seen at large scale so they may not be applicable to your web application.

  2. Integrate Redis into your project for a speed boost over slower persistent storage. Storing session data in memory is generally much faster than saving that data in a traditional relational database that uses persistent storage. Note that data held only in memory is lost when memory is flushed, so anything that needs to be persistent must still be backed up to disk on a regular basis.

  3. Evaluate other use cases such as storing transient logs in a document-oriented data store such as MongoDB.


Matt Makai 2012-2022