Most Companies Use LLMs Wrong. Here’s Why
Large language models (LLMs) are a key artificial intelligence technology powering many natural language processing applications. The goal is to build bots that can answer user questions in various contexts by cross-referencing numerous knowledge sources. Unfortunately, the nature of LLM technology makes their responses unpredictable.
Known challenges of LLMs include:
LLM training data is static and quickly becomes out of date
An LLM may draw on non-authoritative sources included in its training data, and it often cannot point to which sources it used
LLMs extrapolate when facts aren't available, confidently making false but plausible-sounding statements wherever there is a gap in their knowledge (so-called hallucination)
LLMs can produce inaccurate responses due to terminology confusion, where different training sources use the same terms to talk about different things.
The Retrieval-Augmented Generation (RAG) technique was introduced to overcome these problems. The method was originally developed by the Meta Research team and introduced in this paper [1]. The authors claim that the approach significantly outperforms traditional LLMs:
We fine-tune and evaluate our models on a wide range of knowledge-intensive NLP tasks and set the state of the art on three open domain QA tasks, outperforming parametric seq2seq models and task-specific retrieve-and-extract architectures. For language generation tasks, we find that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
In short, RAG is an architectural approach that can improve the effectiveness of large language model (LLM) applications by leveraging custom data. It works by retrieving data and documents relevant to a question or task and providing them to the LLM as context.
This means that the RAG technique augments an LLM's pretrained knowledge with relevant, current information retrieved from external knowledge bases. This dynamic augmentation lets LLMs overcome the limitations of static knowledge and generate responses that are more informed, accurate, and contextually relevant.
The high-level RAG process can be represented by the following diagram [2]:
Here we have a new component called the orchestrator, which uses particular tools to access external knowledge sources, obtain information relevant to the current context, and enrich the context with that information. The LLM is therefore fed not only the initial prompt but an extended context, and it can use its reasoning to generate an authoritative, complete answer.
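To make that retrieve-then-augment-then-generate loop concrete, here is a minimal sketch in Python. Both `search_knowledge_base` and `call_llm` are hypothetical stand-ins for a real retriever and a real model API; the point is the orchestration flow, not the toy keyword matching.

```python
def search_knowledge_base(query: str, top_k: int = 3) -> list[str]:
    """Hypothetical retriever: return up to top_k passages relevant to the query."""
    corpus = {
        "vacation": "Employees accrue 1.5 vacation days per month.",
        "remote work": "Remote work requires manager approval.",
        "benefits": "Health insurance enrollment opens every November.",
    }
    # Toy keyword match standing in for a real vector search.
    hits = [text for key, text in corpus.items() if key in query.lower()]
    return hits[:top_k]


def call_llm(prompt: str) -> str:
    """Hypothetical LLM client; in practice this would call a model API."""
    return "[model answer grounded in the context above]"


def answer(question: str) -> str:
    # 1. The orchestrator retrieves passages relevant to the user's question.
    passages = search_knowledge_base(question)
    # 2. It enriches the initial prompt with the retrieved context.
    context = "\n".join(f"- {p}" for p in passages)
    prompt = (
        "Answer using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # 3. The LLM reasons over the extended context.
    return call_llm(prompt)


print(answer("What is the vacation policy?"))
```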
To quote an expert [3]:
It’s like the difference between an open-book and a closed-book exam. In a RAG system, you are asking the model to respond to a question by browsing through the content in a book, as opposed to trying to remember facts from memory.
The tools used in RAG can be of any kind. For example, a tool could connect to an external PostgreSQL database, provide the database schema so the model can determine which table contains the required information, and then execute an LLM-prepared SQL statement against the database to retrieve that data (a sketch of this pattern follows below). Or a tool could hook into a hotel booking system and make reservations while chatting with customers. Popular RAG-related frameworks support hundreds of ready-to-use tools, ranging from web search to code interpreters and MS Office integrations [4], and make it easy to create new ones.
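Here is a hedged sketch of the database-tool idea. SQLite stands in for the PostgreSQL example so the snippet runs self-contained; the `orders` table and the read-only guard are hypothetical, and a production tool would add proper sandboxing and query parameterization.

```python
import sqlite3


def get_schema(conn: sqlite3.Connection) -> str:
    """Expose the schema so the LLM can decide which table holds the answer."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(r[0] for r in rows)


def run_readonly_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Execute an LLM-prepared statement, restricted to read-only SELECTs."""
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Only SELECT statements are allowed.")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 'Acme', 1200.50)")

print(get_schema(conn))  # fed to the LLM so it can write the query
print(run_readonly_sql(conn, "SELECT total FROM orders WHERE customer = 'Acme'"))
```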
From an architectural point of view, the RAG pattern is one of several methods for customizing LLM applications to a specific domain:
Crafting specialized prompts to guide LLM behavior (prompt engineering)
Combining an LLM with external knowledge retrieval (RAG)
Adapting a pretrained LLM to specific datasets or domains (fine-tuning)
Training an LLM from scratch (pre-training).
The relative complexity of these methods can be presented like this [5]:
This means that RAG can be considered one of the most appropriate ways to adapt LLM applications to a particular organization's needs.
There are many different use cases for RAG. The most common ones are:
Chatbots: incorporating RAG into LLM-powered chatbots lets them automatically derive more accurate answers from company documents and knowledge bases. Such chatbots automate customer support and website lead follow-up, answering questions and resolving issues quickly.
Search engines: search engines that already use LLMs can gain significantly from RAG-based augmentation tools. This is currently one of the top developments underway at almost all popular search providers.
Knowledge engines: ask questions about your own data (e.g., HR and compliance documents). Company data serves as context for the LLM, letting employees easily get answers to their questions, including HR questions about benefits and policies as well as security and compliance questions.
Here is a list of practical RAG implementations in different industries [6]:
Finance: https://arxiv.org/abs/2311.11944
Legal: https://arxiv.org/abs/2309.11325
Medicine: https://arxiv.org/html/2402.13178v2
Healthcare: https://arxiv.org/abs/2403.09226
Technology: https://arxiv.org/html/2404.00657v1
Agriculture: https://arxiv.org/abs/2401.08406
Pharmaceuticals: https://arxiv.org/html/2402.01717v1
Telecommunications: https://arxiv.org/abs/2404.15939
Energy: https://arxiv.org/abs/2406.06566
Science: https://arxiv.org/abs/2403.15729
Education: https://arxiv.org/abs/2406.07796
Construction: https://arxiv.org/abs/2402.09939
Sport: https://arxiv.org/abs/2406.01280
Real estate: https://arxiv.org/abs/2310.01429
RAG is currently the best-known technique for grounding LLMs in up-to-date, verifiable information while lowering the cost of constantly retraining and updating them. RAG typically depends on enriching prompts with relevant information retrieved via vectors, mathematical representations of data that make semantic search possible. But RAG is imperfect, and many interesting challenges remain in getting it right.
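To illustrate what "vectors as mathematical representations of data" means in practice, here is a minimal sketch. The `embed` function is a hypothetical word-hashing stand-in for a real embedding model, but the cosine-similarity lookup is the same mechanism real vector stores use.

```python
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical embedding: hash words into a fixed-size unit vector.
    A real system would use a trained embedding model instead."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


def most_similar(query: str, documents: list[str]) -> str:
    """Return the document with the highest cosine similarity to the query."""
    q = embed(query)
    scores = [float(q @ embed(d)) for d in documents]  # unit vectors: dot = cosine
    return documents[int(np.argmax(scores))]


docs = [
    "Employees accrue 1.5 vacation days per month.",
    "Health insurance enrollment opens every November.",
]
print(most_similar("how many vacation days do I get", docs))
```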
Your turn: How could RAG transform your business? Share your most significant pain points or missed opportunities, and let’s explore how RAG can solve them. The best insights will be featured in my next post. Let’s brainstorm and innovate together 👇
Ready for digital excellence? WislaCode Solutions' software development expertise has empowered leading companies. Let’s collaborate. 🚀 DM me.
[1] Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, Meta (Facebook) AI Research, 2020, https://arxiv.org/abs/2005.11401
[2] Source: https://aws.amazon.com/what-is/retrieval-augmented-generation/
[3] Source: What is retrieval-augmented generation?
[4] List of LangChain tools: https://python.langchain.com/docs/integrations/tools/#all-tools
[5] Source: Retrieval Augmented Generation
[6] Source: Retrieval-Augmented Generation (RAG) Examples and Use Cases