Learn how to set up a hybrid search system with OpenSearch so you can benefit from the advantages of both text and vector search

Text databases play a critical role in many business workloads, especially in e-commerce, where customers rely on product descriptions and reviews to make informed purchasing decisions. Vector search, a method that uses text embeddings to find semantically similar documents, is another powerful tool. However, some businesses may be hesitant to try vector search due to concerns about the complexity of adding it to their current workflow. But what if I told you that it could be done easily and with significant benefits?
In this blog post, I’ll show you how to easily create a hybrid setup that combines the power of text and vector search. This setup will give you the most comprehensive and accurate search results. I’ll be using OpenSearch as the search engine and Hugging Face’s Sentence Transformers for generating embeddings. The dataset I chose for this task is the "XMarket" dataset (which is described in greater depth here); we will embed its title field into a vector representation during the indexing process.
Preparing the dataset
First, let’s start by indexing our documents using Sentence Transformers. This library has pre-trained models that can generate embeddings for sentences or paragraphs. These embeddings act as a unique fingerprint for a piece of text. During the indexing process, I converted the title field to a vector representation and indexed it in OpenSearch. You can do this by simply importing the model and encoding any textual field.
The model can be imported and applied in just a few lines:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embedding = model.encode(text_field)
It’s that simple!
We will create an index named "products" by passing the following mapping:
{
    "products": {
        "mappings": {
            "properties": {
                "asin": {
                    "type": "keyword"
                },
                "description_vector": {
                    "type": "knn_vector",
                    "dimension": 384
                },
                "item_image": {
                    "type": "keyword"
                },
                "text_field": {
                    "type": "text",
                    "fields": {
                        "keyword_field": {
                            "type": "keyword"
                        }
                    },
                    "analyzer": "standard"
                }
            }
        }
    }
}
asin – the unique document ID, taken from the product metadata.
description_vector – this is where we will store the encoded product title.
item_image – an image URL for the product.
text_field – the title of the product.
Note that we are using the standard OpenSearch analyzer, which tokenizes each word in a field into individual keywords. OpenSearch takes these keywords and uses them for the Okapi BM25 ranking algorithm. I also saved the title field twice in each document: once in its raw text form and once as a vector representation.
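For reference, here’s a minimal sketch of creating that index with the opensearch-py client; the connection details are placeholders, and whether you need extra settings depends on which k-NN search method you use:
from opensearchpy import OpenSearch

# Placeholder connection details; point this at your own cluster.
os_client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

index_body = {
    # Note: approximate k-NN search would additionally require
    # "settings": {"index": {"knn": True}}; the exact script-based
    # scoring used later in this post works without it.
    "mappings": {
        "properties": {
            "asin": {"type": "keyword"},
            "description_vector": {"type": "knn_vector", "dimension": 384},
            "item_image": {"type": "keyword"},
            "text_field": {
                "type": "text",
                "fields": {"keyword_field": {"type": "keyword"}},
                "analyzer": "standard"
            }
        }
    }
}

os_client.indices.create(index="products", body=index_body)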
I will then use the model to encode the title field and create documents that will be bulk-indexed into OpenSearch:
import numpy as np
from opensearchpy import OpenSearch, helpers

def store_index(index_name: str, data: np.ndarray, metadata: list, os_client: OpenSearch):
    documents = []
    for index_num, vector in enumerate(data):
        metadata_line = metadata[index_num]
        text_field = metadata_line["title"]
        embedding = model.encode(text_field)
        norm_text_vector_np = normalize_data(embedding)
        document = {
            "_index": index_name,
            "_id": index_num,
            "asin": metadata_line["asin"],
            "description_vector": norm_text_vector_np.tolist(),
            "item_image": metadata_line["imgUrl"],
            "text_field": text_field
        }
        documents.append(document)
        # Flush a batch every 1,000 documents to keep memory usage bounded.
        if len(documents) == 1000:
            helpers.bulk(os_client, documents, request_timeout=1800)
            documents = []
            print(f"bulk {index_num} indexed successfully")
    # Index whatever remains in the final partial batch.
    if documents:
        helpers.bulk(os_client, documents, request_timeout=1800)
    os_client.indices.refresh(index=index_name)
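The normalize_data helper isn’t shown in the post; a minimal sketch, assuming plain L2 normalization with NumPy, could look like this:
import numpy as np

def normalize_data(vector: np.ndarray) -> np.ndarray:
    # Assumed implementation: scale the embedding to unit length (L2 norm)
    # so that cosine-similarity comparisons behave consistently.
    norm = np.linalg.norm(vector)
    return vector / norm if norm > 0 else vector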
Hybrid search implementation
The plan is to create a client that takes input from the user, generates an embedding using the Sentence Transformers model, and performs our hybrid search. The user will also be asked to provide a boost level, which is the amount of significance they want to give to either text or vector search. This way, the user can choose to prioritize one type of search over the other. So if, for example, the user wants the semantic meaning of their query to be weighted more heavily than the simple textual appearance in the description, they would give vector search a higher boost than text search.
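To make the moving parts concrete before we walk through them, here is a rough sketch of how such a client could be wired together; run_bm25_search and run_vector_search are assumed wrapper names (sketched later in this post), not functions from the original code:
def hybrid_search(query: str, vector_boost_level: float, bm25_boost_level: float):
    # Run both searches against the same index (assumed wrappers around os_client.search).
    bm25_hits = run_bm25_search(query)
    vector_hits = run_vector_search(query)
    # Merge the two result sets and weight them with the user-supplied boosts.
    combined_results = interpolate_results(vector_hits, bm25_hits)
    return apply_boost(combined_results, vector_boost_level, bm25_boost_level)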
Search
We’ll first run a text search on the index using OpenSearch’s search method, which takes in a query string and returns a list of documents that match the query. OpenSearch ranks the text search results using the Okapi BM25 algorithm. The text search is performed by sending the following request body:
bm25_query = {
    "size": 20,
    "query": {
        "match": {
            "text_field": query
        }
    },
    "_source": ["asin", "text_field", "item_image"],
}
Here, query is the text entered by the user. To keep the results clean, I added "_source" so that OpenSearch returns only the specific fields I am interested in seeing.
Since text and vector search use different ranking algorithms, we need to bring their scores to the same scale before combining the results. To do that, we’ll normalize the scores of the text search results. The maximum BM25 score is the highest score that can be assigned to a document in a collection for a given query, and it represents the maximum relevance of a document for that query. Its value depends on the parameters of the BM25 formula, such as the average document length, the term frequency, and the inverse document frequency. For that reason, I took the max score received from OpenSearch for each query and divided each result’s score by it, giving us scores on a scale between 0 and 1. The following function demonstrates our normalization algorithm:
def normalize_bm25_formula(score, max_score):
    return score / max_score
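Tying the query and the normalization together, the BM25 leg might look like this sketch (run_bm25_search is an assumed name; bm25_query is the request body shown above):
def run_bm25_search(query: str):
    response = os_client.search(index="products", body=bm25_query)
    hits = response["hits"]["hits"]
    max_score = response["hits"]["max_score"]
    # Rescale every BM25 score to the 0-1 range so it can later be
    # combined with the vector search scores.
    for hit in hits:
        hit["_score"] = normalize_bm25_formula(hit["_score"], max_score)
    return hits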
Next, we’ll conduct a vector search. This search takes the query’s embedding and returns a list of documents that are semantically similar to it.
The search query for OpenSearch looks like the following:
cpu_request_body = {
    "size": 20,
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                "source": "knn_score",
                "lang": "knn",
                "params": {
                    "field": "description_vector",
                    "query_value": get_vector_sentence_transformers(query).tolist(),
                    "space_type": "cosinesimil"
                }
            }
        }
    },
    "_source": ["asin", "text_field", "item_image"],
}
Here, get_vector_sentence_transformers sends the text to model.encode(text_input), which returns a vector representation of the text. Also note that the higher you set size (the number of top-k results), the more accurate your results will be, but latency will increase as well.
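A minimal sketch of that encoding step, plus an assumed run_vector_search wrapper to match the flow outlined earlier (both reuse the model and normalize_data from the indexing phase):
def get_vector_sentence_transformers(text_input: str):
    # Encode the query with the same model used at indexing time so the
    # query vector lives in the same embedding space as the documents.
    return normalize_data(model.encode(text_input))

def run_vector_search(query: str):
    # cpu_request_body is the request body shown above.
    response = os_client.search(index="products", body=cpu_request_body)
    return response["hits"]["hits"]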
Interpolate results and apply boost

Now we’ll need to combine the two search results. To do that, we’ll interpolate the results so every document that occurred in both searches will appear higher in the hybrid results list. This way, we can take advantage of the strengths of both text and vector search to get the most comprehensive results.
The following function is used to interpolate the results of keyword search and vector search. It returns a dictionary containing the common elements between the two sets of hits as well as the scores for each document. If the document appears in only one of the search results, then we will assign it the lowest score that was retrieved.
def interpolate_results(vector_hits, bm25_hits):
    # gather all product ids
    bm25_ids_list = []
    vector_ids_list = []
    for hit in bm25_hits:
        bm25_ids_list.append(hit["_source"]["asin"])
    for hit in vector_hits:
        vector_ids_list.append(hit["_source"]["asin"])
    # find common product ids
    common_results = set(bm25_ids_list) & set(vector_ids_list)
    results_dictionary = {key: [] for key in common_results}
    # store the vector score first, then the normalized BM25 score
    for common_result in common_results:
        for vector_hit in vector_hits:
            if vector_hit["_source"]["asin"] == common_result:
                results_dictionary[common_result].append(vector_hit["_score"])
        for BM_hit in bm25_hits:
            if BM_hit["_source"]["asin"] == common_result:
                results_dictionary[common_result].append(BM_hit["_score"])
    min_value = get_min_score(common_results, results_dictionary)
    # assign minimum value scores for all unique results
    for vector_hit in vector_hits:
        if vector_hit["_source"]["asin"] not in common_results:
            results_dictionary[vector_hit["_source"]["asin"]] = [min_value]
    for BM_hit in bm25_hits:
        if BM_hit["_source"]["asin"] not in common_results:
            results_dictionary[BM_hit["_source"]["asin"]] = [min_value]
    return results_dictionary
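The get_min_score helper isn’t shown in the post; a straightforward sketch, assuming it returns the smallest score observed among the common results (with 0 as a fallback), could be:
def get_min_score(common_results, results_dictionary):
    # Assumed implementation: the lowest score seen across all common results.
    if common_results:
        return min(min(scores) for scores in results_dictionary.values())
    # Fallback floor when the two result sets share no documents.
    return 0.0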
Ultimately we will have a dictionary with the document ID as a key and an array of score values as a value. The first element in the array is the vector search score and the second element is the text search normalized score.
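For illustration (the ASINs and scores below are made up), the interpolated dictionary might look like this:
combined_results = {
    "B01ABCD123": [0.83, 0.91],  # in both searches: [vector score, normalized BM25 score]
    "B07EFGH456": [0.42],        # in only one search: assigned the minimum score
}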
Finally, we apply a boost to our search results. We will iterate over the scores of the results and multiply the first element by the vector boost level and the second element by the text boost level.
def apply_boost(combined_results, vector_boost_level, bm25_boost_level):
    for element in combined_results:
        if len(combined_results[element]) == 1:
            # document appeared in only one search; apply both boosts to its single score
            combined_results[element] = (combined_results[element][0] * vector_boost_level +
                                         combined_results[element][0] * bm25_boost_level)
        else:
            combined_results[element] = (combined_results[element][0] * vector_boost_level +
                                         combined_results[element][1] * bm25_boost_level)
    # sort the results based on the new scores
    sorted_results = [k for k, v in sorted(combined_results.items(), key=lambda item: item[1], reverse=True)]
    return sorted_results
It’s time to see what we have! This is what the complete workflow looks like:
[Diagram: the complete hybrid search workflow]
I searched for the sentence "an ice cream scoop" with a 0.5 boost for vector search and a 0.5 boost for text search, and this is what I got in the top few results:
Vector search returned –
[Image: top vector search results]
Text search returned –
[Image: top text search results]
Hybrid search returned –
[Image: top hybrid search results]
In this example, we searched for "an ice cream scoop" using both text and vector search. The text search returned documents containing the keywords "an", "ice", "cream", and "scoop". The fourth text search result is an ice cream machine, which is certainly not a scoop. It ranked so high because its title, "Breville BCI600XL Smart Scoop Ice Cream Maker", contains three of the query keywords ("Scoop", "Ice", "Cream") and therefore scored highly on BM25 even though it did not match our intent. Vector search, on the other hand, returns results that are semantically similar to the query, regardless of whether the exact keywords appear in the document; it captures that "scoop" appearing before "ice cream" changes the meaning, so the machine is not as good a match. Thus, we get a more comprehensive set of results that goes beyond documents that literally mention "an ice cream scoop".
Clearly, if you were to use only one type of search, you would miss out on valuable results or display inaccurate ones and frustrate your customers. By combining the advantages of both worlds, we receive more accurate results. So I believe the answer to our question is yes: better together has proven itself to be true.
But wait, can better become even better? One way to improve the search experience is by utilizing the power of the APU (Associative Processing Unit) in OpenSearch. By running the vector search on the APU using Searchium.ai's plugin, we can take advantage of advanced algorithms and processing capabilities to further improve latency and significantly cut costs (for example, $0.23 vs. $8.76) while still getting similar vector search results.
We can install the plugin, upload the index to the APU and search by sending a slightly modified request body:
apu_request_body = {
    "size": 20,
    "query": {
        "gsi_knn": {
            "field": "description_vector",
            "vector": get_vector_sentence_transformers(query).tolist(),
        }
    },
    "_source": ["asin", "text_field", "item_image"],
}
All the other steps are identical!
In conclusion, by combining text and vector search using OpenSearch and Sentence Transformers, businesses can easily improve their search results. And by utilizing the APU, they can take those results to the next level while also cutting infrastructure costs. Don’t let concerns about complexity hold you back. Give it a try and see for yourself the benefits it can bring. Happy searching!
The full code can be found here
A huge thanks to Yaniv Vaknin and Daphna Idelson for all of their help!