Agentic RAG // HyDE, self-query, etc.
Agents are an abstraction that wraps multi-hop reasoning // for better results
RAG uses an LLM to answer user queries based on information retrieved from a knowledge base. This approach has advantages over using a standalone or fine-tuned LLM: it grounds answers in factual information, reduces confabulation, provides domain-specific knowledge, and allows fine-grained control over access to the knowledge base.
However, vanilla RAG has limitations, notably that it only performs one retrieval step, which can lead to poor results if the initial retrieval is suboptimal. Additionally, semantic similarity is computed using the user query as a reference, which can be suboptimal if the query’s format differs from that of the relevant documents.
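One mitigation for the query/document format mismatch is HyDE (Hypothetical Document Embeddings): instead of embedding the raw query, embed a hypothetical answer, which tends to look more like the documents you want to retrieve. A minimal sketch, using a toy bag-of-words similarity in place of a real embedding model; `generate_hypothetical_answer` is a stand-in for an LLM call, and the documents and its canned output are made up for illustration:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def generate_hypothetical_answer(query: str) -> str:
    # Placeholder for an LLM call that drafts a document-style answer.
    return "quarterly revenue grew driven by strong subscription sales"

documents = [
    "quarterly revenue grew 12% driven by subscription sales",
    "the office kitchen will be closed for maintenance on friday",
]

def retrieve(query: str, use_hyde: bool = True) -> str:
    # With HyDE, similarity is computed against the hypothetical answer,
    # whose phrasing is closer to the target documents than the query's.
    reference = generate_hypothetical_answer(query) if use_hyde else query
    ref_vec = embed(reference)
    return max(documents, key=lambda d: cosine(embed(d), ref_vec))

print(retrieve("how did we do last quarter?"))
```

The raw query shares almost no vocabulary with the revenue document, but the hypothetical answer does, so the right document wins the similarity comparison.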
To address these limitations, a RAG agent can be used. This agent formulates queries itself and can critique and re-retrieve information if necessary, thereby improving retrieval performance. It uses techniques such as generating hypothetical reference sentences closer to the targeted documents (as in HyDE) and re-retrieving based on generated snippets to improve accuracy.
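The critique-and-re-retrieve loop can be sketched as follows. This is an illustrative sketch, not a specific framework's API: `llm_answer`, `llm_critique`, and `reformulate` are keyword heuristics standing in for LLM calls, and a dictionary lookup stands in for a vector store:

```python
# Sketch of an agentic retrieval loop: retrieve, answer, critique, and
# reformulate the query until the critique passes or a step budget runs out.

KNOWLEDGE_BASE = {
    "pricing": "The pro plan costs $20 per seat per month.",
    "refunds": "Refunds are issued within 14 days of purchase.",
}

def retrieve(query: str) -> str:
    # Naive keyword retrieval standing in for a vector search.
    for key, doc in KNOWLEDGE_BASE.items():
        if key in query.lower():
            return doc
    return ""

def llm_answer(query: str, context: str) -> str:
    # Stand-in for an LLM answering from retrieved context.
    return context if context else "I don't know."

def llm_critique(query: str, answer: str) -> bool:
    # A real agent would ask the LLM whether the answer addresses the query.
    return answer != "I don't know."

def reformulate(query: str) -> str:
    # Stand-in for LLM query rewriting: map user phrasing to KB vocabulary.
    synonyms = {"pay back": "refunds", "cancel": "refunds", "cost": "pricing"}
    for phrase, topic in synonyms.items():
        if phrase in query.lower():
            return topic
    return query

def agentic_rag(query: str, max_steps: int = 3) -> str:
    q = query
    answer = "I don't know."
    for _ in range(max_steps):
        answer = llm_answer(q, retrieve(q))
        if llm_critique(q, answer):
            return answer
        q = reformulate(q)  # critique failed: rewrite the query and retry
    return answer

print(agentic_rag("how much do I pay back if I cancel?"))
```

The key difference from vanilla RAG is the loop: a failed first retrieval triggers a query rewrite and a second attempt instead of a final wrong answer.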
Agentic AI adds a “reasoner” to RAG. The reasoner doesn’t just pull data; it understands who is asking, what they are asking, and the context in which they are asking it.
In a basic RAG system, there’s no clear plan for why certain data is retrieved. Agentic RAG creates a purpose-driven approach to retrieval.
Here’s what the reasoner does:
- It infers what the user really wants based on who they are
- It makes a plan to find and use the right information
- It uses context to weigh the relative importance and reliability of different data sources
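The three responsibilities above can be sketched as a small planning step. The user roles, intents, and source ordering here are illustrative assumptions, and `infer_intent` is a keyword heuristic standing in for an LLM call:

```python
from dataclasses import dataclass, field

@dataclass
class RetrievalPlan:
    intent: str                                  # what we think the user wants
    sources: list = field(default_factory=list)  # data sources, best first
    notes: str = ""

def infer_intent(query: str, role: str) -> str:
    # Stand-in for an LLM call; the same question maps to different
    # intents depending on who is asking.
    if "sales" in query.lower():
        return "pipeline detail" if role == "account_exec" else "revenue summary"
    return "general lookup"

def make_plan(query: str, role: str) -> RetrievalPlan:
    intent = infer_intent(query, role)
    # Order sources by reliability for this intent: high-level summaries
    # live in the CRM, fast-moving detail lives in chat tools.
    if intent == "pipeline detail":
        sources = ["slack", "crm"]
    else:
        sources = ["crm", "slack"]
    return RetrievalPlan(intent=intent, sources=sources,
                         notes=f"role={role}; query {sources[0]} first")

plan = make_plan("any sales update?", role="account_exec")
print(plan.intent, plan.sources)
```

Retrieval then executes the plan, consulting sources in the order the reasoner chose rather than blindly searching one index.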
For example, if someone asks about a recent sales update, the reasoner might prioritize real-time communication tools like Slack or Teams over more static sources like CRM entries. This helps find newer, more useful information.
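That recency bias can be made concrete by scoring candidate results with a freshness term that is only switched on when the query asks for something recent. The sources, timestamps, and the flat relevance score below are made up for illustration:

```python
from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)

# Candidate hits with the source they came from and when they were written.
hits = [
    {"source": "crm",   "text": "Q2 forecast entered last month",
     "ts": now - timedelta(days=30)},
    {"source": "slack", "text": "deal with Acme closed this morning",
     "ts": now - timedelta(hours=2)},
]

def score(hit: dict, recency_weight: float) -> float:
    age_days = (now - hit["ts"]).total_seconds() / 86400
    freshness = 1.0 / (1.0 + age_days)  # decays as the document ages
    relevance = 1.0                     # stand-in for a real similarity score
    return relevance + recency_weight * freshness

def rank(hits: list, query: str) -> list:
    # Boost fresh sources only when the query asks for something recent.
    w = 2.0 if any(k in query.lower() for k in ("recent", "update")) else 0.0
    return sorted(hits, key=lambda h: score(h, w), reverse=True)

print(rank(hits, "any recent sales update?")[0]["source"])
```

For the recency query the two-hour-old Slack message outranks the month-old CRM entry; for a neutral query the freshness boost is off and ordering falls back to relevance alone.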