
RAG (Retrieval-Augmented Generation)

Techniques

A technique that improves AI responses by first searching a knowledge base for relevant information, then using that information to generate a more accurate answer.

Think of RAG like an open-book exam. Instead of relying on memory alone (which might be incomplete or wrong), the AI gets to look up relevant information in a reference library before answering your question. The answer is better because it is based on actual sources.

RAG, which stands for Retrieval-Augmented Generation, is a technique that gives an AI model access to external information it was not trained on. Instead of relying only on what the model memorized during training, RAG first searches a database or collection of documents to find relevant facts, then feeds those facts to the model along with your question so it can give a more accurate, up-to-date, and grounded response.

Here is the problem RAG solves: language models have a knowledge cutoff -- they only know things from their training data, which might be months or years old. They can also "hallucinate," confidently stating things that are not true. RAG addresses both problems by retrieving real, specific documents and using them as a source of truth.

The process works in two steps. First, the "retrieval" step: your question is used to search a knowledge base (which could be a company's internal documents, a product manual, a collection of research papers, or any other information source) and pull out the most relevant pieces. Second, the "generation" step: those retrieved pieces are included in the prompt along with your question, and the model generates an answer based on that specific information.
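The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the knowledge base, the keyword-overlap scoring, and the prompt template are all assumptions made for the example, and real systems typically replace the keyword matching with embeddings and a vector database.

```python
def retrieve(question, knowledge_base, top_k=2):
    """Step 1 (retrieval): rank documents by word overlap with the question."""
    q_words = set(question.lower().split())
    return sorted(
        knowledge_base,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )[:top_k]

def build_prompt(question, documents):
    """Step 2 (generation): put the retrieved passages into the prompt."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# A toy knowledge base standing in for help articles or internal docs.
knowledge_base = [
    "Refunds are processed within 5 business days.",
    "Support hours are 9am to 5pm on weekdays.",
    "Shipping is free on orders over $50.",
]

question = "When are refunds processed?"
docs = retrieve(question, knowledge_base)
prompt = build_prompt(question, docs)
# The prompt now contains the refund policy, so a language model receiving it
# can answer from that text instead of from memory.
```

The final `prompt` string is what gets sent to the model; grounding the answer in the retrieved context is what makes the response accurate and checkable.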

RAG is extremely popular in business applications. A customer support bot might use RAG to search through help articles before answering a question. A legal assistant might use RAG to find relevant case law. A company's internal chatbot might use RAG to answer questions about HR policies. It is one of the most practical ways to make AI tools useful for specific organizations without needing to fine-tune a whole model.

Real-World Examples

  • Perplexity searching the web for current information before generating an answer
  • A company chatbot searching internal knowledge bases to answer employee questions
  • NotebookLM analyzing uploaded documents and answering questions about them with citations

Tools That Use This

  • Perplexity (Freemium)
  • NotebookLM (Free)

Related Terms

  • Vector Database
  • Embeddings
  • Hallucination
  • Large Language Model