Similarity Search, Part 4: Hierarchical Navigable Small World (HNSW)
Discover how to construct efficient multi-layered graphs to boost search speed in massive volumes of data
Similarity search is a problem where given a query the goal is to find the most similar documents to it among all the database documents.
Introduction
In data science, similarity search often appears in the NLP domain, search engines or recommender systems where the most relevant documents or items need to be retrieved for a query. There exists a large variety of different ways to improve search performance in massive volumes of data.
Hierarchical Navigable Small World (HNSW) is a state-of-the-art algorithm used for an approximate search of nearest neighbours. Under the hood, HNSW constructs optimized graph structures making it very different from other approaches that were discussed in previous parts of this article series.
The main idea of HNSW is to construct such a graph where a path between any pair of vertices could be traversed in a small number of steps.