Improving Retrieval Augmented Generation with CRAG
Retrieval Augmented Generation (RAG) is a technique that integrates external knowledge sources into large language models (LLMs) to enhance their response generation capabilities. However, one limitation of vanilla RAG systems is that if the initial retrieval of documents is not accurate or relevant, the final response can suffer.
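In code, vanilla RAG is essentially retrieve-then-generate. The sketch below is a minimal illustration, where `retrieve` and `generate` are hypothetical stand-ins for a vector-store lookup and an LLM call (not any specific library's API):

```python
from typing import Callable, List

def vanilla_rag(
    query: str,
    retrieve: Callable[[str], List[str]],  # vector-store lookup (assumed helper)
    generate: Callable[[str], str],        # LLM call (assumed helper)
    k: int = 3,
) -> str:
    """Answer a query by stuffing the top-k retrieved documents into the prompt."""
    docs = retrieve(query)[:k]
    context = "\n\n".join(docs)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    # Note: if retrieval returns irrelevant documents, they go into the
    # prompt unchecked. This is exactly the weakness CRAG targets.
    return generate(prompt)
```

Nothing in this loop checks whether the retrieved documents are actually relevant, which is the gap CRAG fills.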
To address this, researchers have proposed a novel method called Corrective Retrieval Augmented Generation (CRAG). The core idea behind CRAG is to introduce a separate "evaluator" component to assess the quality and relevance of the initially retrieved documents before passing them to the LLM for response generation.
How CRAG Works
1. The system retrieves potentially relevant documents from a vector database based on the user's query.
2. These retrieved documents are passed through a lightweight evaluator model (e.g., T5) that classifies each document into one of three categories: correct, ambiguous, or incorrect.
3. Correct documents are kept as-is, while ambiguous ones are supplemented with additional web search results. Incorrect documents are discarded and replaced with web search results.
4. The retained documents then go through a "decompose and recompose" step, where they are broken into smaller text strips, filtered again for relevance, and only the most pertinent strips are passed to the LLM.
5. Finally, the LLM uses these highly relevant text strips to generate the final response to the user's query.
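The steps above can be sketched as a single control flow. This is a simplified illustration, not the paper's implementation: the `retrieve`, `evaluate`, and `web_search` functions are hypothetical placeholders, and the score thresholds and strip-splitting heuristic are assumptions for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ScoredDoc:
    text: str
    score: float  # evaluator's relevance score in [0, 1]

def crag_pipeline(
    query: str,
    retrieve: Callable[[str], List[str]],   # vector-DB retrieval (assumed)
    evaluate: Callable[[str, str], float],  # lightweight evaluator, e.g. fine-tuned T5
    web_search: Callable[[str], List[str]], # web-search fallback (assumed)
    upper: float = 0.7,                     # thresholds chosen for illustration
    lower: float = 0.3,
) -> List[str]:
    """Return the knowledge strips to hand to the LLM for generation."""
    docs = [ScoredDoc(d, evaluate(query, d)) for d in retrieve(query)]

    kept: List[str] = []
    for doc in docs:
        if doc.score >= upper:        # "correct": keep as-is
            kept.append(doc.text)
        elif doc.score >= lower:      # "ambiguous": keep, supplement with web results
            kept.append(doc.text)
            kept.extend(web_search(query))
        else:                         # "incorrect": discard, replace with web results
            kept.extend(web_search(query))

    # Decompose-and-recompose: split into strips, re-score, keep the most relevant.
    strips = [s.strip() for d in kept for s in d.split(". ") if s.strip()]
    ranked = sorted(strips, key=lambda s: evaluate(query, s), reverse=True)
    return ranked[:5]  # only the top strips reach the LLM
```

Because the evaluator gates every document before generation, a bad retrieval round degrades into a web search rather than a hallucinated answer.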
Benefits of CRAG
• Improved response quality by filtering out irrelevant or incorrect information before generation.
• Reduced redundancy in the generated outputs.
• More flexibility compared to approaches like Self-RAG that involve fine-tuning the LLM itself.
• Easy integration into existing RAG pipelines by adding the evaluator component.
• No need to retrain the entire LLM when tasks or datasets change; only the evaluator may need fine-tuning.
Experiments have shown that CRAG outperforms other state-of-the-art methods, including Self-RAG and proprietary models like ChatGPT, on various question-answering benchmarks.
Curious to dig deeper into this?
Join Professor Mehdi as he walks through Corrective RAG, covering the technique and its pros and cons for production, in the video below! 👇
As language models continue to grow in size and capabilities, techniques like CRAG will become increasingly important to ensure that external knowledge is effectively integrated and filtered, resulting in more accurate and relevant responses.
Subscribe to Our YouTube Channel!
We are kicking off our YouTube channel in the new year, and we invite you on board as we walk you through the intricacies of AI, fueled by feedback from our readers, friends, and colleagues!
We want to make our channel about AI for everyone. As in this newsletter, we'll talk about new AI products, the latest trends, the nitty-gritty engineering details, career insights for AI enthusiasts, and, of course, one of our favorite topics: the entrepreneurial side of AI. 🥳 We're here to show you how you can ride the AI wave and become your own entrepreneur using the cool tools available in the market.
🛠️✨ Happy practicing and happy building! 🚀🌟
Thanks for reading our newsletter. You can follow us here: Angelina (LinkedIn or Twitter) and Mehdi (LinkedIn or Twitter).
Source of images/quotes:
• SELF-RAG Explained: Intuitive Guide &...
🗞️ Paper: Corrective Retrieval Augmented Generation https://arxiv.org/pdf/2401.15884
📚 Also, if you'd like to learn more about RAG systems, check out our book on the topic:
📬 Don't miss out on the latest updates - Subscribe to our newsletter: