Hybrid search blends lexical precision with semantic understanding, delivering markedly better relevance at web‑scale speeds—an essential capability for businesses that rely on fast, accurate information retrieval.
The video walks viewers through constructing a scalable hybrid search engine on the Vespa platform, merging traditional BM25 lexical matching with modern semantic vector search. By extending a prior BM25‑only implementation that handled ten million documents with sub‑100‑millisecond latency, the presenter upgrades the system to incorporate Hierarchical Navigable Small World (HNSW) indexing and a two‑phase ranking pipeline, enabling fast approximate nearest‑neighbor (ANN) queries over text embeddings.
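In Vespa, this kind of upgrade is expressed in the schema definition. The following sketch is illustrative only — the schema name, field names, and HNSW tuning parameters are assumptions, not taken from the video:

```
schema doc {
    document doc {
        field text type string {
            indexing: index | summary
            index: enable-bm25          # expose the bm25(text) rank feature
        }
        # 384-dimensional embedding vector, indexed for ANN search
        field text_embedding type tensor<float>(x[384]) {
            indexing: attribute | index
            attribute {
                distance-metric: angular
            }
            index {
                hnsw {
                    max-links-per-node: 16
                    neighbors-to-explore-at-insert: 200
                }
            }
        }
    }
}
```

The `attribute | index` indexing statement is what turns exact nearest-neighbor scans into HNSW-accelerated approximate search.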
Key technical steps include adding a tensor‑type "text_embedding" field (384‑dimensional vectors from a Sentence‑Transformers model), configuring HNSW with angular distance, and defining two rank profiles: one for pure semantic search and a "fusion" profile that combines BM25 scores with embedding closeness via reciprocal rank fusion (RRF). A global‑phase ranking step then re‑scores the top 1,000 candidates, balancing lexical relevance against semantic similarity while keeping computational cost low.
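The two rank profiles described above might be declared roughly as follows. This is a hedged sketch, assuming a recent Vespa version that ships the built-in `reciprocal_rank_fusion` ranking feature; the profile and field names are assumptions:

```
rank-profile semantic {
    inputs {
        query(q) tensor<float>(x[384])   # query embedding passed at search time
    }
    first-phase {
        expression: closeness(field, text_embedding)
    }
}

rank-profile fusion inherits semantic {
    first-phase {
        expression: bm25(text)           # cheap lexical score over all matches
    }
    global-phase {
        # re-score the lexical top candidates by fusing BM25 rank
        # with embedding-closeness rank
        expression: reciprocal_rank_fusion(bm25(text), closeness(field, text_embedding))
        rerank-count: 1000
    }
}
```

Keeping the expensive fusion in the global phase, limited to 1,000 candidates, is what keeps the added semantic layer cheap.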
The presenter demonstrates the impact using a corpus of 1.3 million documents. For queries like "pineapple pizza" or "difference between ChatGPT and Gemini," the BM25‑only results return loosely related items, whereas the hybrid fusion consistently surfaces more accurate, context‑aware results—e.g., a direct comparison of Google Gemini and ChatGPT. The UI showcases sub‑10‑millisecond response times even with the added semantic layer.
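The fusion behind these results is simple to state in isolation: each document's fused score is the sum, over the rankings it appears in, of 1/(k + rank). A generic standalone sketch in Python (not Vespa's implementation; k = 60 is the conventional constant from the RRF literature):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each doc scores sum of 1/(k + rank) over all lists.

    A doc ranked well in either list (or moderately in both) rises to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical example: a lexical (BM25) ordering and a semantic ordering
bm25_order = ["a", "b", "c"]
semantic_order = ["c", "a", "d"]
print(reciprocal_rank_fusion([bm25_order, semantic_order]))
# → ['a', 'c', 'b', 'd']  — "a" wins by placing high in both lists
```

Because only ranks (not raw scores) are combined, RRF sidesteps the problem that BM25 scores and cosine closeness live on incomparable scales.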
The approach illustrates that hybrid search can deliver both speed and relevance at scale, making it viable for enterprise search, e‑commerce, and knowledge‑base applications where pure keyword or pure vector search falls short. By leveraging Vespa’s modular ranking phases and ANN indexing, organizations can improve user satisfaction and reduce infrastructure overhead while handling millions of documents.