Combining vector and keyword (“scalar”) search in RAG pipelines // BM42 & beyond

RAG is currently the most popular generative AI application for enterprises

2 min read · Jul 4, 2024

https://github.com/qdrant/workshop-ultimate-hybrid-search

Here’s a summary of the key points from Qdrant’s article introducing BM42:

1. BM25 has been the standard algorithm for search engines for 40 years, but its effectiveness is being challenged in the context of modern Retrieval-Augmented Generation (RAG) systems.

2. The article introduces BM42 as a potential evolution of lexical search, combining elements of BM25 and transformer models.

3. BM25’s longevity is attributed to its Inverse Document Frequency (IDF) component, which effectively selects important terms relative to the document collection.

4. However, BM25’s estimate of term importance within a document becomes less useful in RAG systems, where documents are split into short, fixed-length chunks.

5. SPLADE, a current alternative to BM25, has limitations including tokenization issues, expensive token expansion, domain dependency, and slow inference time.

6. BM42 aims to combine the simplicity of BM25 with the intelligence of transformers while avoiding SPLADE’s pitfalls.

7. BM42 uses transformer attention matrices to determine term importance within documents, replacing BM25’s statistics-based approach.

8. It addresses tokenization issues through WordPiece retokenization and applies traditional NLP techniques to reduce token count.

9. BM42 offers advantages in interpretability, inference speed, memory footprint, and multilingual support compared to BM25 and SPLADE.

10. Benchmarks on the Quora dataset show BM42 slightly outperforming BM25 in precision.

11. The article suggests that optimal results are achieved by combining sparse (BM42) and dense embeddings in a hybrid approach.

12. Qdrant, the company behind BM42, encourages further experimentation and development in this area of search technology.
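To make points 3 and 7 concrete, here is a minimal sketch of a BM42-style score: each query term that appears in a document contributes its IDF multiplied by an in-document weight. In BM42 that weight is derived from transformer attention; in this sketch the weights are hypothetical hand-picked numbers, and the function names (`idf`, `bm42_like_score`) and the toy corpus are assumptions made for illustration, not Qdrant's implementation.

```python
import math

def idf(term, corpus):
    """BM25-style inverse document frequency: rarer terms get a higher value."""
    n_docs = len(corpus)
    n_with_term = sum(1 for doc in corpus if term in doc)
    return math.log((n_docs - n_with_term + 0.5) / (n_with_term + 0.5) + 1.0)

def bm42_like_score(query_tokens, doc_weights, corpus):
    """Sum IDF(term) * in-document weight over query terms found in the document.

    doc_weights maps token -> importance weight. In BM42 this would come from
    transformer attention; here the values are hypothetical placeholders.
    """
    return sum(
        idf(term, corpus) * doc_weights[term]
        for term in query_tokens
        if term in doc_weights
    )

# Toy corpus: each chunk is a dict of token -> hypothetical attention weight.
corpus = [
    {"vector": 0.6, "search": 0.4},
    {"keyword": 0.7, "search": 0.3},
    {"hybrid": 0.5, "search": 0.2, "vector": 0.3},
]

query = ["vector", "search"]
scores = [bm42_like_score(query, doc, corpus) for doc in corpus]
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Because "search" occurs in every chunk, its IDF is low and it contributes little; the ranking is driven mostly by "vector" and by how much weight each chunk assigns to it.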

https://blocksandfiles.com/2024/07/02/qdrant-launches-combined-vector-and-keyword-search-for-rag-and-ai-apps/

Open source vector database supplier Qdrant has developed its own BM42 search algorithm combining vector and standard BM25 keyword search methods to get better RAG results, claiming the method lowers cost.
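The text above does not say how keyword (sparse) and vector (dense) results are merged. One common choice is reciprocal rank fusion (RRF), sketched below as an illustration rather than as the method used by Qdrant. The function name, the document ids, and the constant `k=60` are assumptions for this example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by more than one retriever rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a sparse (keyword) and a dense (vector) retriever.
sparse_hits = ["d1", "d3", "d2"]
dense_hits = ["d3", "d4", "d1"]

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)  # ['d3', 'd1', 'd4', 'd2']
```

RRF only needs ranks, not raw scores, which sidesteps the problem that sparse and dense scores live on different scales.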

Written by sbagency
