Forget RAG, welcome Agentic RAG

𝗡𝗮𝘁𝗶𝘃𝗲 𝗥𝗔𝗚
In Native RAG, the most common implementation today, the user query is processed through a pipeline of retrieval, reranking, synthesis, and response generation. This combines retrieval-based and generation-based methods to provide accurate and contextually relevant answers.

𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗥𝗔𝗚
Agentic RAG is an advanced, agent-based approach to question answering over multiple documents in a coordinated manner. It supports comparing different documents, summarizing specific documents, or comparing various summaries. Agentic RAG is a flexible framework for complex tasks that require planning, multi-step reasoning, tool use, and learning over time.

𝗞𝗲𝘆 𝗖𝗼𝗺𝗽𝗼𝗻𝗲𝗻𝘁𝘀 𝗮𝗻𝗱 𝗔𝗿𝗰𝗵𝗶𝘁𝗲𝗰𝘁𝘂𝗿𝗲
- Document Agents: Each document is assigned a dedicated agent capable of answering questions about and summarizing its own document.
- Meta-Agent: A top-level agent manages all the document agents, orchestrating their interactions and integrating their outputs into a coherent, comprehensive response.

𝗙𝗲𝗮𝘁𝘂𝗿𝗲𝘀 𝗮𝗻𝗱 𝗕𝗲𝗻𝗲𝗳𝗶𝘁𝘀
- Autonomy: Agents act independently to retrieve, process, and generate information.
- Adaptability: The system can adjust strategies based on new data and changing contexts.
- Proactivity: Agents can anticipate needs and take preemptive actions to achieve goals.

Applications
Agentic RAG is particularly useful in scenarios requiring thorough, nuanced information processing and decision-making. A few days ago, I discussed how the future of AI lies in AI Agents. RAG is currently the most popular use case, and with an agentic architecture, you will supercharge RAG!
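To make the document-agent / meta-agent split concrete, here is a minimal, framework-agnostic sketch in Python. The `llm` function is a placeholder for whatever model call you use, and the keyword-overlap retrieval stands in for a real vector search; both are assumptions for illustration, not part of any specific framework.

```python
# Minimal sketch of the document-agent / meta-agent pattern.
# `llm` is a stand-in for a real model call; retrieval is naive keyword overlap.

from dataclasses import dataclass


def llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. an API request)."""
    return f"[LLM answer based on a prompt of {len(prompt)} chars]"


@dataclass
class DocumentAgent:
    """One agent per document: answers questions and summarizes within its own text."""
    name: str
    chunks: list[str]

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Rank chunks by word overlap with the query (a stand-in for vector search).
        scored = sorted(
            self.chunks,
            key=lambda c: len(set(query.lower().split()) & set(c.lower().split())),
            reverse=True,
        )
        return scored[:k]

    def answer(self, query: str) -> str:
        context = "\n".join(self.retrieve(query))
        return llm(f"Answer using '{self.name}' only.\nContext:\n{context}\nQ: {query}")


class MetaAgent:
    """Top-level agent: routes the query to document agents and merges their outputs."""

    def __init__(self, agents: list[DocumentAgent]):
        self.agents = agents

    def answer(self, query: str) -> str:
        partials = [f"{a.name}: {a.answer(query)}" for a in self.agents]
        return llm("Synthesize a single answer from:\n" + "\n".join(partials))


if __name__ == "__main__":
    meta = MetaAgent([
        DocumentAgent("10-K 2023", ["Revenue grew 12%...", "Risks include supply chain..."]),
        DocumentAgent("10-K 2022", ["Revenue grew 8%...", "Risks include FX exposure..."]),
    ])
    print(meta.answer("Compare revenue growth across both filings."))
```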
RAG bothers me in principle. First, consider how much investment and effort goes into developing the foundational model/LLM. You need RAG because the core AI system is inadequate -- it cannot correctly or comprehensively solve the problem we want it to solve. Then ask how much human effort, knowledge of the correct answer, and additional developer work it takes to build, test, and evaluate a good RAG system -- and how generic that effort really is, so that you are not constantly tinkering to keep up with changing circumstances, retraining, and shifts in the data and knowledge/world model. The more sophisticated the RAG system, the more costly and complicated the effort.
This is a good example, BUT it is missing a crucial point: branching out to tools and functions, because that is the ultimate superpower in these workflows. Let the semantic search kernels and LLMs do what they do best, but branch out to tools and function calls whenever computation is required that exceeds their capabilities: calling business APIs of ERPs, initiating processes in other systems, calculations, anything like that. That is the real superpower of the agentic approach: getting the best of both worlds, AI power plus integration into the existing IT architecture landscape and business applications, beyond mere text processing.
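One way to picture that branching is a tool-dispatch loop: the model proposes a structured call, the orchestrator executes real code (an ERP lookup, a calculation), and the result is fed back for the final answer. The sketch below is only an illustration of the idea; the tool names, the JSON format, and the `fake_model_step` stub are invented for the example and are not any specific product API.

```python
# Sketch of branching out to tools/functions from an agentic workflow.
# The "model" is stubbed; in practice the structured tool call would come
# from an LLM with function calling. Tool names are illustrative only.

import json


def lookup_erp_order(order_id: str) -> dict:
    # Stand-in for a real ERP / business API call.
    return {"order_id": order_id, "status": "shipped", "total": 1499.00}


def calculate_vat(amount: float, rate: float = 0.19) -> float:
    # Exact arithmetic instead of asking the LLM to "compute" it.
    return round(amount * rate, 2)


TOOLS = {"lookup_erp_order": lookup_erp_order, "calculate_vat": calculate_vat}


def fake_model_step(question: str) -> str:
    # Stub: a real model would decide which tool to call and with what arguments.
    return json.dumps({"tool": "lookup_erp_order", "args": {"order_id": "SO-1042"}})


def run_agent(question: str) -> dict:
    tool_call = json.loads(fake_model_step(question))
    result = TOOLS[tool_call["tool"]](**tool_call["args"])
    # Normally the result would be handed back to the model to compose the answer.
    return result


if __name__ == "__main__":
    print(run_agent("What is the status of order SO-1042?"))
```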
My opinion is that, generally, each of these is an anti-pattern given the availability of SOTA LLMs with large context windows. A network of autonomous actors, each with its own tools, is my preference. An advantage of this is that search operations fall under the same abstractions as any other tool, and the actors that perform them become fungible. I wonder: if Python had better support for easy asynchronous parallelism, would these chain-based designs be so common?
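For what it's worth, Python's asyncio already supports this kind of fan-out reasonably well. Below is a small sketch of a network of actors, each wrapping its own tool and queried in parallel; the actor and tool names are made up for illustration.

```python
# Sketch of a network of autonomous actors, each with its own tool,
# queried in parallel with asyncio. All names are illustrative.

import asyncio


async def search_tool(query: str) -> str:
    await asyncio.sleep(0.1)  # simulate I/O (vector store, web search, ...)
    return f"search results for {query!r}"


async def sql_tool(query: str) -> str:
    await asyncio.sleep(0.1)  # simulate a database round trip
    return f"rows matching {query!r}"


class Actor:
    """An autonomous actor is just a tool plus whatever policy wraps it."""

    def __init__(self, name, tool):
        self.name, self.tool = name, tool

    async def handle(self, task: str) -> str:
        return f"{self.name}: {await self.tool(task)}"


async def main() -> None:
    actors = [Actor("searcher", search_tool), Actor("analyst", sql_tool)]
    # Search is just another tool, so the actors are interchangeable ("fungible").
    results = await asyncio.gather(*(a.handle("Q3 revenue by region") for a in actors))
    print("\n".join(results))


if __name__ == "__main__":
    asyncio.run(main())
```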
Agentic RAG systems offer significant advancements over traditional RAG architectures, but they also present certain challenges in production environments:
1. Increased system complexity
2. Higher computational costs
3. Latency and performance issues
4. Error management challenges
5. Development and maintenance difficulties
6. Increased costs for enterprises
7. Potential delays for end-users
While Agentic RAG offers enhanced capabilities, it is essential to consider these factors when implementing it in production settings.
Why do I always see things like "features and benefits" and never "features and dangers"? I mean, I think agentic RAG is awesome and it is the future (already the present for some applications), but why do we never publicize the dangers of new technologies as widely? Responsible implementation is pivotal. Knowing a technology's pros and cons is a must.
Quite soon all workflows will be agentic. There is no reason that any process needs to be linear. Each step will get smarter, process more deeply, and be self analytical. It’s time to start upgrading.
Adding knowledge graphs (GraphRAG) is another line of work. In our experience, pure RAG alone is only about 80% reliable for information retrieval. We use it for machine translation systems, and terminology was always the issue (i.e., consistently retrieving and applying the same term every time). In time, we settled on RAG agents in a hybrid approach: agents query, verify, and re-check that terminology is applied consistently (see our BYD use case: https://pangeanic.com/use-cases/byd-auto-japan). The same applies to document/information retrieval.
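To illustrate the terminology-consistency angle, here is a rough sketch of a verification pass: a glossary lookup plus a check that the mandated target term actually appears in the draft translation, retrying otherwise. The glossary entry and the `retranslate` stub are invented for the example; they are not taken from the linked BYD use case.

```python
# Sketch of a terminology-verification step for machine translation.
# The glossary entry and retranslate() stub are illustrative only.

GLOSSARY = {"charging station": "充電ステーション"}  # source term -> mandated target term


def retranslate(draft: str, source_term: str, target_term: str) -> str:
    # Stand-in for re-prompting the MT model with the glossary constraint applied.
    return draft + f" [enforce: {source_term} -> {target_term}]"


def verify_terminology(source: str, draft: str) -> str:
    """Ensure every glossary term present in the source uses the mandated target term."""
    for src_term, tgt_term in GLOSSARY.items():
        if src_term in source.lower() and tgt_term not in draft:
            draft = retranslate(draft, src_term, tgt_term)
    return draft


if __name__ == "__main__":
    print(verify_terminology(
        "Locate the nearest charging station.",
        "最寄りの充電スタンドを探してください。",  # uses a variant term, so the check triggers
    ))
```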
Thanks! While promising, I believe it’s important to acknowledge that deploying this kind of agentic RAG in production would be a total nightmare for 99% of enterprises. Each step has a non-negligible potential for failure, and monitoring and debugging present significant challenges—not to mention the latency. Furthermore, I believe a well-executed naive RAG is generally OK for most use cases.
VP of Product - AI Platform @IBM
Here’s a tutorial in Python on how to build a LangChain agentic RAG system using the Granite-3.0-8B-Instruct model: https://github.com/ibm-granite-community/granite-snack-cookbook/blob/main/recipes/AI-Agents/Agentic_RAG.ipynb