Retrieval-Augmented Agents (RAA) // Advanced RAG + Agents == Better Agents
Agents can use vector stores (unstructured), knowledge graphs (semi-structured), and databases (structured)
Here’s a summary of the key points from the presentation:
1. Jerry, co-founder and CEO of LlamaIndex, discusses the future of knowledge assistants and common use cases for LLMs in enterprises.
2. He explains the evolution from basic RAG (Retrieval-Augmented Generation) to more advanced knowledge assistants.
3. Key challenges with basic RAG include naive data processing, limited query understanding, and lack of memory or stateful interactions.
4. Jerry outlines three steps to improve knowledge assistants:
a) Advanced data and retrieval modules
b) Advanced single agent query flows
c) General multi-agent task solvers
5. For data processing, he emphasizes the importance of good parsing, chunking, and indexing. Llama Parse is introduced as a solution for enterprise developers.
6. Advanced single agent flows incorporate agent components like function calling, tool use, query planning, and conversation memory.
7. Multi-agent systems are presented as the next step, offering benefits such as specialization, parallelization, and potential cost savings.
8. Jerry announces the alpha release of Llama Agents, a framework for representing agents as microservices that can work together to solve complex tasks.
9. The presentation concludes with a demo of Llama Agents in a RAG pipeline and an invitation for community feedback on the project.
10. Llama Cloud is mentioned as a solution for enterprises needing high-quality data processing for documents like PDFs with embedded charts, tables, and images.
The overall theme is the progression from basic RAG systems to more sophisticated, multi-agent knowledge assistants that can handle complex queries and tasks in production environments.
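The single-agent flow described above (function calling, tool use, query routing, conversation memory) can be sketched in plain Python. This is an illustrative toy, not the LlamaIndex API: keyword-overlap routing stands in for LLM function calling, and all names (`Tool`, `Agent`, `route`, etc.) are made up for this sketch.

```python
# Minimal sketch of an agentic RAG loop: the agent routes each query to a
# retrieval tool and keeps stateful conversation memory. Hypothetical names;
# keyword overlap stands in for LLM-driven function calling.
from dataclasses import dataclass, field

@dataclass
class Tool:
    name: str
    keywords: set   # terms this tool specializes in (drives routing)
    corpus: dict    # doc_id -> text

    def retrieve(self, query: str) -> list:
        terms = set(query.lower().split())
        return [text for text in self.corpus.values()
                if terms & set(text.lower().split())]

@dataclass
class Agent:
    tools: list
    memory: list = field(default_factory=list)  # conversation memory

    def route(self, query: str) -> Tool:
        # pick the tool whose keywords best overlap the query
        terms = set(query.lower().split())
        return max(self.tools, key=lambda t: len(terms & t.keywords))

    def answer(self, query: str) -> str:
        tool = self.route(query)
        hits = tool.retrieve(query)
        self.memory.append((query, tool.name, len(hits)))
        return f"[{tool.name}] {hits[0] if hits else 'no match'}"

sql_tool = Tool("sql", {"revenue", "table"}, {"d1": "revenue table for Q3"})
vec_tool = Tool("vector", {"summary", "report"}, {"d2": "summary of the annual report"})
agent = Agent([sql_tool, vec_tool])
print(agent.answer("show the revenue table"))  # routed to the sql tool
```

A production version would replace `route` with LLM function calling and `retrieve` with an actual vector or SQL query, but the control flow is the same.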
Data quality is important
Agentic RAG
Long-context capabilities are essential for large language models (LLMs) to tackle complex and long-input tasks. Despite numerous efforts made to optimize LLMs for long contexts, challenges persist in robustly processing long inputs. In this paper, we introduce GraphReader, a graph-based agent system designed to handle long texts by structuring them into a graph and employing an agent to explore this graph autonomously. Upon receiving a question, the agent first undertakes a step-by-step analysis and devises a rational plan. It then invokes a set of predefined functions to read node content and neighbors, facilitating a coarse-to-fine exploration of the graph. Throughout the exploration, the agent continuously records new insights and reflects on current circumstances to optimize the process until it has gathered sufficient information to generate an answer. Experimental results on the LV-Eval dataset reveal that GraphReader, using a 4k context window, consistently outperforms GPT-4-128k across context lengths from 16k to 256k by a large margin. Additionally, our approach demonstrates superior performance on four challenging single-hop and multi-hop benchmarks.
3.2 Graph Construction

To extract nodes from a document D within the LLM’s context limit, we first split D into chunks of maximum length L while preserving paragraph structure. For each chunk, we prompt the LLM to summarize it into atomic facts, the smallest indivisible facts that simplify the original text. We also prompt the LLM to extract key elements from each atomic fact like essential nouns, verbs, and adjectives. After processing all chunks, we normalize the key elements as described by Lu et al. (2023) to handle lexical noise and granularity issues, creating a final set of key elements. We then construct each node vi = (ki, Ai), where ki is a key element and Ai is the set of atomic facts corresponding to ki. Finally, we link two nodes vi and vj if key element ki appears in Aj and vice versa.
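The node/edge construction above can be made concrete. In the paper an LLM produces the atomic facts and key elements; in this sketch a trivial extractor (capitalized words) stands in, so only the graph-building logic is faithful. The linking rule is read here as: connect vi and vj when either key element appears in the other node's facts.

```python
# Sketch of GraphReader-style graph construction (Sec. 3.2).
# An LLM would supply atomic facts and key elements; here a toy
# extractor stands in so the node/edge logic is runnable.
from collections import defaultdict

def build_graph(atomic_facts, extract_keys):
    # node v_i = (k_i, A_i): key element -> set of atomic facts mentioning it
    nodes = defaultdict(set)
    for fact in atomic_facts:
        for key in extract_keys(fact):
            nodes[key].add(fact)
    # link v_i -- v_j if k_i appears in a fact of A_j, or k_j in a fact of A_i
    edges = set()
    keys = list(nodes)
    for i, ki in enumerate(keys):
        for kj in keys[i + 1:]:
            if any(ki in f for f in nodes[kj]) or any(kj in f for f in nodes[ki]):
                edges.add((ki, kj))
    return dict(nodes), edges

facts = ["Paris is the capital of France.", "France borders Spain."]
# toy key-element extractor: capitalized words (an LLM prompt in the paper)
extract = lambda fact: sorted({w.strip(".") for w in fact.split() if w[0].isupper()})
nodes, edges = build_graph(facts, extract)
```

Here "France" ends up linked to both "Paris" and "Spain", while "Paris" and "Spain" never co-occur in a fact and so stay unlinked, which is the behavior the paper's exploration step relies on.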
---
3.3 Graph Exploration

3.3.1 Agent Initialization

Given a graph G and a question Q, our goal is to design an agent that can autonomously explore the graph using predefined functions. The agent begins by maintaining a notebook to record supporting facts, which are eventually used to derive the final answer. Then the agent performs two key initializations: defining the rational plan and selecting the initial node.

Rational Plan: To tackle complex real-world multi-hop questions, pre-planning the solution is crucial. The agent breaks down the original question step-by-step, identifies the key information needed, and forms a rational plan.
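The initialization step can be sketched as follows. In GraphReader the rational plan is written by the LLM; this toy version extracts content words from the question and scores graph nodes by overlap with them. All class and method names are hypothetical.

```python
# Sketch of GraphReader's agent initialization (Sec. 3.3.1): a notebook of
# supporting facts, a rational plan derived from the question, and initial
# node selection by overlap between plan terms and node key elements.
class GraphAgent:
    def __init__(self, graph):
        self.graph = graph      # key element -> set of atomic facts
        self.notebook = []      # supporting facts recorded during exploration

    def rational_plan(self, question):
        # stand-in for the LLM's step-by-step breakdown: the content words
        # of the question become the information the agent needs to find
        stop = {"what", "is", "the", "of", "which", "does", "a"}
        return [w.strip("?") for w in question.lower().split() if w not in stop]

    def initial_nodes(self, question, k=2):
        plan = self.rational_plan(question)
        scored = sorted(self.graph,
                        key=lambda key: -sum(p in key.lower() for p in plan))
        return scored[:k]

    def record(self, fact):
        # notebook entries are deduplicated supporting facts
        if fact not in self.notebook:
            self.notebook.append(fact)

graph = {"France": {"Paris is the capital of France."},
         "Spain": {"France borders Spain."}}
agent = GraphAgent(graph)
print(agent.initial_nodes("What is the capital of France?", k=1))
```

From the selected start node, the real agent would then call the predefined read-node / read-neighbor functions for the coarse-to-fine exploration described in the abstract.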