Similarity Search, Part 4: Hierarchical Navigable Small World (HNSW)

Discover how to construct efficient multi-layered graphs to boost search speed in massive volumes of data

Published in

Towards Data Science

13 min readJun 16, 2023

Similarity search is a problem where given a query the goal is to find the most similar documents to it among all the database documents.

Introduction

In data science, similarity search often appears in the NLP domain, search engines or recommender systems where the most relevant documents or items need to be retrieved for a query. There exists a large variety of different ways to improve search performance in massive volumes of data.

Hierarchical Navigable Small World (HNSW) is a state-of-the-art algorithm used for an approximate search of nearest neighbours. Under the hood, HNSW constructs optimized graph structures making it very different from other approaches that were discussed in previous parts of this article series.

The main idea of HNSW is to construct such a graph where a path between any pair of vertices could be traversed in a small number of steps.

Hi Vyacheslav, really appreciate the way you explained HNSW with such simplicity and great detail.

One doubt: You mentioned that the algorithm descends to the next layer and each of found efConstruction nodes acts as an entry point; but in the…

Hi Vyacheslav, this is such a great blog. I tried to do research on FAISS but I couldn't get as deep as you, and explain as simple as you! Thank you for writing!

A small detail regarding FAISS is that, they fix all the random seeds, so that everytime you do search, its deterministic :)

Hi Vyacheslav, thank you for your excellent article. Should this be "smaller" instead of "greater"? Since in your example, we are trying to find the closest node to X compared to the distance from neighbour node B to other vertices.

Similarity Search, Part 4: Hierarchical Navigable Small World (HNSW)

Discover how to construct efficient multi-layered graphs to boost search speed in massive volumes of data

Introduction

Create an account to read the full story.

Published in Towards Data Science

Written by Vyacheslav Efimov

Responses (7)