Vector Search Is Not All You Need
Introduction
Retrieval-Augmented Generation (RAG) has transformed open-domain question answering, enabling systems to produce fluent, human-like responses to a wide array of queries. At the heart of RAG lies a retrieval module that searches a vast corpus for relevant context passages, which are then consumed by a neural generative module — often a pre-trained language model like GPT-3 — to formulate a final answer.
While this approach has been highly effective, it is not without limitations.
One of its most critical components, the vector search over embedded passages, has inherent constraints that can hamper the system's ability to reason in a nuanced way. This is particularly evident when questions require complex multi-hop reasoning across multiple documents.
Vector search refers to searching for information using vector representations of data. It involves two key steps:
- Encoding data into vectors
First, the data being searched is encoded into numeric vector representations. For text data like passages or documents, this is done using embedding models like BERT or RoBERTa. These models convert text into dense vectors of continuous numbers that…
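The encoding step can be sketched with a minimal, self-contained toy. Note the `embed` function below is a hypothetical stand-in that hashes tokens into buckets; a real system would use a trained embedding model such as BERT, which produces far richer semantic vectors:

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash each token into one of `dim` buckets,
    count occurrences, and L2-normalize. Purely illustrative --
    a real system would call a trained model like BERT here."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Encode a small corpus of passages into a matrix of dense vectors,
# one row per passage.
passages = [
    "The Eiffel Tower is located in Paris.",
    "Photosynthesis converts light into chemical energy.",
]
vectors = np.stack([embed(p) for p in passages])
print(vectors.shape)  # one dense vector per passage
```

Because every passage is reduced to a single fixed-length vector, search then becomes a nearest-neighbor lookup in this space, which is exactly where the constraints discussed above arise.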