What is RAG (Retrieval-Augmented Generation)?
A technique that grounds AI responses in retrieved documents rather than relying solely on training data.
Definition
Retrieval-Augmented Generation (RAG) is a technique that enhances AI responses by first retrieving relevant documents from a knowledge base, then providing them as context when generating an answer. This grounds the AI's output in specific, current information rather than general training data, which reduces hallucinations and lets the model answer questions about private or recent information it wasn't trained on.
Example
A customer support RAG system retrieves the three most relevant help articles for a user's question before generating a response, so the answer draws on your specific product's documentation rather than generic advice.
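The retrieve-then-generate flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the articles and the names ARTICLES, retrieve, and build_prompt are hypothetical, and a simple word-count cosine similarity stands in for the embedding search a real system would more likely use. The generation step is left out; the prompt returned here is what would be passed to the model.

```python
import math
import re
from collections import Counter

# Hypothetical help articles; a real system would load these from a knowledge base.
ARTICLES = {
    "reset-password": "To reset your password open Settings and choose Reset password.",
    "billing-cycle": "Invoices are issued on the first day of each billing cycle.",
    "export-data": "You can export your data as CSV from the Account page.",
    "two-factor": "Enable two-factor authentication in Settings under Security.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, articles, k=3):
    """Return the names of the k articles most similar to the question."""
    query = Counter(tokenize(question))
    scored = [
        (cosine(query, Counter(tokenize(body))), name)
        for name, body in articles.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def build_prompt(question, articles, k=3):
    """Place the retrieved articles in front of the question as context."""
    names = retrieve(question, articles, k)
    context = "\n\n".join(f"[{name}]\n{articles[name]}" for name in names)
    return (
        "Answer using only the articles below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


top_articles = retrieve("How do I reset my password?", ARTICLES)
prompt = build_prompt("How do I reset my password?", ARTICLES)
```

Here the password article ranks first because it shares the most words with the question. The model never needs to have seen these articles during training; it only reads them in the prompt.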
RAG (Retrieval-Augmented Generation) vs fine-tuning: What's the difference?
Fine-tuning bakes knowledge into the model weights, so changing that knowledge means retraining. RAG retrieves knowledge at inference time, so updating it only requires changing the knowledge base, which is typically easier and cheaper than retraining.
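The update path on the RAG side can be illustrated with a small sketch. Everything here is hypothetical (the KnowledgeBase class, the document IDs, and the keyword-overlap scoring are assumptions made for the example), but it shows the point: new knowledge becomes available by adding a document, with no change to any model.

```python
import re


class KnowledgeBase:
    """A toy in-memory document store with keyword-overlap search."""

    def __init__(self):
        self.docs = {}

    def add(self, doc_id, text):
        # Updating knowledge is a plain write to the store; no model is retrained.
        self.docs[doc_id] = text

    def search(self, question, k=1):
        """Return up to k document IDs sharing the most words with the question."""
        query = set(re.findall(r"[a-z0-9]+", question.lower()))
        scored = []
        for doc_id, text in self.docs.items():
            overlap = len(query & set(re.findall(r"[a-z0-9]+", text.lower())))
            if overlap:
                scored.append((overlap, doc_id))
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:k]]


kb = KnowledgeBase()
kb.add("pricing-2023", "The Pro plan costs 20 dollars per month.")

question = "Does the Pro plan include single sign-on?"
before = kb.search(question)  # only the older pricing article is available

# New information arrives: add it to the store and it is retrievable at once.
kb.add("sso-launch", "Single sign-on is now included in the Pro plan.")
after = kb.search(question)
```

Before the second add call, the best match is the older pricing article; afterwards the new article ranks first. Getting the same fact into a fine-tuned model would require another training run.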