Introduction
What are NoSQL databases?
NoSQL stands for "Not only SQL." NoSQL databases provide an alternate way to store data that differs from relational tables.
NoSQL databases provide the flexibility to store structured, semi-structured, and unstructured data. They are good to use when you need to store huge amounts of data, iterate quickly with changing requirements, and scale out.
There are multiple types of NoSQL databases. Four of the most common types are:
- Key-value databases: Key-value stores are similar to Python dictionaries. You query either by key or by searching through the entire database. Key-value stores tend to be held in memory, with a backing store behind them.
- Document databases: A collection of documents, where each document is in JSON or a JSON-like format. Each document contains pairs of fields and values. The primary copy of the data lives in the storage layer and is cached in memory.
- Wide column databases: Similar to relational database tables, except that the backend storage is different. We can put a SQL layer on top of a wide column database, which makes querying it very similar to querying a relational database.
- Graph databases: Store data as nodes (vertices) and relationships (edges). Vertices typically store object information, while edges represent the relationships between nodes. Graph databases can also offer a SQL-like query language.
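To make the four models concrete, here is a minimal Python sketch that holds the same made-up records in each shape using plain dictionaries and lists. The keys, field names, and the `FOLLOWS` relationship are illustrative assumptions, not the API of any particular database.

```python
# Key-value: an opaque value looked up by its key.
key_value = {"user:1": "Ada", "user:2": "Grace"}

# Document: each document holds field/value pairs and may nest.
documents = [
    {"_id": 1, "name": "Ada", "languages": ["Python", "SQL"]},
    {"_id": 2, "name": "Grace", "languages": ["COBOL"]},
]

# Wide column: row key -> columns; rows need not share the same columns.
wide_column = {
    "row1": {"name": "Ada", "city": "London"},
    "row2": {"name": "Grace"},
}

# Graph: nodes plus relationships that connect them.
nodes = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
relationships = [(1, "FOLLOWS", 2)]

# Look up by key, filter documents by a field, and follow a relationship.
by_key = key_value["user:1"]
python_users = [d["name"] for d in documents if "Python" in d["languages"]]
followed_by_ada = [nodes[dst]["name"] for src, _, dst in relationships if src == 1]
```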
This article will focus on providing an overview of NoSQL graph databases.
Benefits of graph databases
NoSQL graphs have nodes (vertices) and relationships (edges) that allow us to model all kinds of scenarios – from a system of train paths to mapping social media connections, to a network of devices, and more.
We can assign labels to nodes and classify nodes according to those labels. We can also assign attributes, such as weights, in the form of key-value pairs. Relationships can have labels and attributes too, and they can also have direction. Direction gives meaning to a relationship: it can be undirected, one-way, or two-way.
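As a rough illustration of labels, attributes, and direction, the sketch below models a tiny train network with Python dataclasses. The class names, the `Station` and `Depot` labels, and the `km` attribute are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    source: str
    target: str
    rel_type: str
    directed: bool = True
    properties: dict = field(default_factory=dict)


stations = [
    Node("A", {"Station"}, {"name": "Central"}),
    Node("B", {"Station"}, {"name": "Harbor"}),
    Node("C", {"Depot"}, {"name": "Yard"}),
]
tracks = [
    Relationship("A", "B", "TRACK", directed=False, properties={"km": 12}),
    Relationship("B", "C", "SERVICE_LINE", directed=True, properties={"km": 3}),
]


def nodes_with_label(nodes, label):
    """Classify nodes by label."""
    return [n.node_id for n in nodes if label in n.labels]


def can_travel(rel, start, end):
    """Directed: source -> target only. Undirected: both ways."""
    if rel.source == start and rel.target == end:
        return True
    return not rel.directed and rel.source == end and rel.target == start
```

Here `can_travel(tracks[0], "B", "A")` is true because the track is undirected, while `can_travel(tracks[1], "C", "B")` is false because the service line only runs from B to C.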
With the increase in the volume of data, fast-paced Agile iterations, and the need to scale out, graph databases play a key role in meeting these needs.
- Query performance for relationship traversals tends to stay steady even as your data grows over time, because a traversal only touches the connected part of the graph.
- Graph database queries can return results in real time.
- They can perform real-time updates on big data while supporting queries at the same time.
- Graph databases provide the flexibility to rapidly adapt to changing requirements. Quick iterations with changing requirements require the ability to make changes to the existing graph structure without endangering current functionality.
Graph types and structures
It is important to know the different graph types and structures, including shapes, characteristics, and density.
Graph shapes
The main shapes we’ll cover are random, small-world, and scale-free.
Random: A flat shape with no patterns. All nodes have the same probability of being attached to each other.
- Example: Social Security Numbers, births/deaths, retirement
Small-world: This shape has a high degree of local clustering together with short average path lengths. No node is more than a few relationships away from any other. Think of the saying "hey, it’s a small world!"
- Example: LinkedIn, where most people you meet are a second- or third-degree connection
Scale-free: A "hub and spoke" structure that repeats at multiple scales. The number of relationships per node follows a power-law distribution: a few hub nodes have very many relationships, while most nodes have only a few. In a power law, a relative change in one quantity produces a proportional relative change in another quantity.
- Example: A group of zip codes related to each other makes up a county
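One way to see the difference between a random shape and a scale-free shape is to compare the largest node degree in each. The sketch below uses only the standard library and two simplified generators: uniformly random node pairs, and preferential attachment (new nodes prefer nodes that already have many relationships), a swapped-in technique commonly used to produce hub-and-spoke structure. The sizes and the seed are arbitrary assumptions.

```python
import random
from collections import Counter


def random_graph(n, m_edges, rng):
    """Pick m_edges distinct node pairs uniformly at random."""
    edges = set()
    while len(edges) < m_edges:
        a, b = rng.sample(range(n), 2)
        edges.add((min(a, b), max(a, b)))
    return edges


def preferential_attachment_graph(n, m, rng):
    """Each new node links to m existing nodes, favoring well-connected ones."""
    edges = {(0, 1)}
    endpoints = [0, 1]  # a node appears once per relationship it has
    for new in range(2, n):
        chosen = set()
        while len(chosen) < min(m, new):
            chosen.add(rng.choice(endpoints))
        for old in chosen:
            edges.add((old, new))
            endpoints += [old, new]
    return edges


def degrees(edges):
    """Count relationships per node."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


rng = random.Random(42)
hub_edges = preferential_attachment_graph(1000, 2, rng)
flat_edges = random_graph(1000, len(hub_edges), rng)

hub_max = max(degrees(hub_edges).values())
flat_max = max(degrees(flat_edges).values())
print("largest degree, preferential attachment:", hub_max)
print("largest degree, uniform random:", flat_max)
```

Both graphs have the same number of nodes and relationships, so any gap between the two printed values comes from how the relationships are distributed, not from how many there are.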
Graph properties
Connected & Disconnected: A graph is connected when there is a path between every pair of nodes, regardless of distance. One issue to watch for is that most graph algorithms may not analyze disconnected nodes. A similar issue can arise if your graph contains an island of nodes that are connected to each other but disconnected from the main graph.
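A quick way to spot islands and isolated nodes before running an algorithm is to list the connected components. The following is a minimal sketch using breadth-first search over an undirected edge list; the node names are made up for the example.

```python
from collections import deque


def connected_components(nodes, edges):
    """Group nodes into components using breadth-first search."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        component, queue = [], deque([start])
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in sorted(neighbors[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components


nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("D", "E")]  # D-E is an island, F is isolated
components = connected_components(nodes, edges)
```

More than one component in the result means the graph is disconnected, and an algorithm started from the main graph would never reach the island or the isolated node.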

Directed & Undirected: A directed graph means that the graph relationships (edges) have direction. This direction further defines the node relationship by designating a source node and a destination node. If a graph algorithm requires direction, undirected graphs are inappropriate.
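The effect of direction shows up as soon as the relationships are turned into an adjacency structure. In this small sketch (the function name and the sample "follows" data are assumptions), the same edge list is read once as directed and once as undirected.

```python
def build_adjacency(edges, directed):
    """Map each node to the set of nodes reachable in one step."""
    adjacency = {}
    for source, destination in edges:
        adjacency.setdefault(source, set()).add(destination)
        adjacency.setdefault(destination, set())
        if not directed:
            # Without direction, the relationship can be followed both ways.
            adjacency[destination].add(source)
    return adjacency


follows = [("ann", "bob"), ("bob", "cat")]
directed = build_adjacency(follows, directed=True)
undirected = build_adjacency(follows, directed=False)
```

In the directed version `bob` only reaches `cat`; in the undirected version `bob` reaches both `ann` and `cat`.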

Weighted & Unweighted: A weight is a numeric value placed on a relationship. Weighted relationships can be directed or undirected. If a graph algorithm requires weights, relationships without a weight are left out of the analysis.
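Weights can change the answer an algorithm gives. The sketch below compares a breadth-first search, which ignores weights and counts hops, with Dijkstra's algorithm, which needs a non-negative weight on every relationship. The road network and its travel times are hypothetical.

```python
import heapq
from collections import deque


def fewest_hops(graph, start, goal):
    """Breadth-first search: ignores weights and counts relationships."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def lowest_cost(graph, start, goal):
    """Dijkstra's algorithm: sums the weights along the cheapest path."""
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None


# Hypothetical road network; weights are travel minutes.
roads = {
    "A": {"B": 50, "C": 10},
    "B": {"A": 50, "D": 5},
    "C": {"A": 10, "D": 10},
    "D": {"B": 5, "C": 10},
}
hops = fewest_hops(roads, "A", "B")
cost = lowest_cost(roads, "A", "B")
```

The direct road from A to B is a single hop but costs 50 minutes, while the three-hop route through C and D costs 25, so the two functions prefer different paths.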

Acyclic & Cyclic: A cyclic graph has cycles, that is, paths that lead from a node back to itself. Many common graph algorithms require acyclic graphs; cycles can cause these algorithms to get stuck and repeat forever.
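Checking for cycles up front is one way to avoid an algorithm looping forever. Here is a minimal sketch of cycle detection in a directed graph using depth-first search with three node states. It assumes every node appears as a key in the adjacency dictionary, and the sample graphs are made up.

```python
def has_cycle(graph):
    """Return True if the directed graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, in progress, finished
    state = {node: WHITE for node in graph}

    def visit(node):
        state[node] = GRAY
        for nxt in graph[node]:
            if state[nxt] == GRAY:
                return True  # reached a node we are still exploring
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[node] == WHITE and visit(node) for node in graph)


acyclic = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
```

With these inputs, `has_cycle(acyclic)` is false even though D can be reached by two different paths, and `has_cycle(cyclic)` is true because C leads back to A.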

Trees & Spanning Trees: A tree is a connected acyclic graph that can be either directed or undirected. A spanning tree is a tree that includes every node of the graph, obtained by removing relationships until no cycles remain. Since there are multiple options for which relationship to remove from a cycle, one graph can have multiple spanning trees.
A minimum spanning tree is the spanning tree with the lowest total cost. If the graph is weighted, the cost is the sum of the weights of the relationships in the tree. If the graph is unweighted, every relationship counts as one hop, so every spanning tree has the same cost: one fewer relationship than there are nodes.
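For a weighted, undirected graph, one well-known way to find a minimum spanning tree is Kruskal's algorithm: take relationships in order of increasing weight and skip any that would close a cycle. The sketch below is a simplified version with a small union-find helper; the graph and its weights are invented for the example.

```python
def minimum_spanning_tree(nodes, weighted_edges):
    """Kruskal's algorithm; edges are (weight, a, b) tuples."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root of n's group, shortening the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for weight, a, b in sorted(weighted_edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # skip edges that would close a cycle
            parent[root_a] = root_b
            tree.append((weight, a, b))
    return tree


nodes = ["A", "B", "C", "D"]
edges = [(1, "A", "B"), (4, "A", "C"), (3, "B", "C"), (2, "C", "D"), (5, "B", "D")]
tree = minimum_spanning_tree(nodes, edges)
total_cost = sum(weight for weight, _, _ in tree)
```

The resulting tree keeps three of the five relationships (one fewer than the number of nodes) for a total cost of 6.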

Graph density
The ratio between the number of edges (relationships) and the maximum number of edges that the graph can contain is called graph density. If a graph has many edges, then the graph is considered more dense. If a graph does not have many edges, it is considered sparse.
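The maximum density of a graph is reached when every node is connected to every other node. If we know the number of nodes N, we can calculate the maximum number of relationships, and together with the actual number of relationships R, the actual density. For an undirected graph with no self-loops, these can be calculated as follows:
$$\text{maximum relationships} = \frac{N(N-1)}{2}$$
$$\text{density} = \frac{2R}{N(N-1)}$$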
More often than not, we see graphs that are extremely dense (for example, when analyzing network traffic or social media). The issue we face with graph algorithms is that at a high level of density, we need to identify and peel off the layers. On the other hand, when we are working with a high level of sparsity, we want to see if we can add relationships based on inferences that we make.
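The density ratio is easy to compute directly. This sketch assumes an undirected graph with no self-loops and no duplicate relationships, where N nodes allow at most N(N-1)/2 relationships; the function names and sample counts are illustrative.

```python
def max_relationships(node_count):
    """Relationships in a complete undirected graph with no self-loops."""
    return node_count * (node_count - 1) // 2


def density(node_count, relationship_count):
    """Actual relationships divided by the maximum possible number."""
    if node_count < 2:
        return 0.0  # no relationships are possible
    return relationship_count / max_relationships(node_count)


sparse = density(5, 4)  # 4 of 10 possible relationships
dense = density(5, 9)   # 9 of 10 possible relationships
```

A value near 0 indicates a sparse graph and a value near 1 indicates a dense one; a directed graph would allow twice as many relationships, so the formula would need adjusting.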
Summary
In this introductory article, we learned that NoSQL graph databases play a key role in handling the increased volume of data, fast-paced Agile iterations, and the need to scale out. We looked at the main types of graphs and structures along with graph shapes, density, and characteristics like connectedness, direction, weights, cyclic/acyclic, and trees. Graph properties are important to understand in order to implement the best structure and algorithms for the purpose of your work.