Legacy Concept Lab

Retrieval-Augmented Generation (RAG)

RAG is the dominant paradigm for grounding LLMs in external/updated knowledge

Concept 43 of 100 | Representations | Phase 9: Advanced architectures & generation
Key equation: $p(y|x) = \sum_{d} p(d|x) \cdot p(y|x, d)$
Migrated: view the updated version in /domains. This /foundations page is legacy during migration.

Why It Matters for Modern Models

  • RAG is the dominant paradigm for grounding LLMs in external/updated knowledge
  • Explains why vector databases and embedding search became critical infrastructure
  • Separates "what the model knows" from "what the model can access", which enables knowledge updates without retraining

What Tutorials Skip

What is still poorly explained in textbooks and papers:

  • RAG trades model capacity for external memory: a smaller model with good retrieval can match a larger model on knowledge-intensive tasks
  • Retrieval quality is the bottleneck: irrelevant documents can hurt more than no documents at all, because they inject noise into the context
  • The "lost in the middle" problem: LLMs struggle to use information from the middle of long contexts, so retrieval ranking and the order of passages in the prompt matter
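The last two points suggest a simple post-retrieval step, sketched here in Python: drop passages whose relevance score falls under a threshold, then order the survivors so the strongest sit at the start and end of the context and the weakest land in the middle. The function name `filter_and_reorder`, the `min_score` default of 0.3, and the example scores are illustrative assumptions, not values from any paper or library; whether edge placement helps depends on the model.

```python
def filter_and_reorder(scored_docs, min_score=0.3):
    """Filter low-relevance passages, then push the best ones to the edges.

    scored_docs: list of (doc, score) pairs, higher score = more relevant.
    min_score:   hypothetical relevance cutoff (an assumption for this sketch).
    """
    # Noise control: keep only passages at or above the cutoff, best first.
    kept = sorted(
        (pair for pair in scored_docs if pair[1] >= min_score),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Alternate ranks between the front and the back of the context.
    front, back = [], []
    for rank, (doc, _score) in enumerate(kept):
        (front if rank % 2 == 0 else back).append(doc)
    # Reversing `back` puts the second-best passage last and the
    # weakest surviving passages in the middle.
    return front + back[::-1]


scored = [("A", 0.9), ("B", 0.8), ("C", 0.6), ("D", 0.5), ("E", 0.1)]
ordered = filter_and_reorder(scored)
print(ordered)
```

In this toy run, passage E is discarded as noise, A opens the context, B closes it, and C and D occupy the middle positions.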

Interactive Visualization

Core Math (Optional Deep Dive)

If you want intuition first, start with the key equation and the visualization. Come back here for the full walkthrough.

Key Equation
$$p(y|x) = \sum_{d} p(d|x) \cdot p(y|x, d)$$

RAG augments generation with retrieved documents:

$$p(y|x) = \sum_{d \in \text{top-}k} p(d|x) \cdot p(y|x, d)$$
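As a worked instance of this sum, here is a minimal Python sketch that marginalizes over two retrieved documents. The document ids, the candidate answers, and all probabilities are made-up numbers chosen for illustration; in a real system they would come from the retriever and the generator.

```python
def marginal_answer_probs(p_doc, p_ans_given_doc):
    """Compute p(y|x) = sum over d of p(d|x) * p(y|x, d).

    p_doc:           {doc_id: p(d|x)} over the top-k documents.
    p_ans_given_doc: {doc_id: {answer: p(y|x, d)}}.
    """
    p_y = {}
    for doc_id, p_d in p_doc.items():
        for answer, p_y_given_d in p_ans_given_doc[doc_id].items():
            p_y[answer] = p_y.get(answer, 0.0) + p_d * p_y_given_d
    return p_y


# Toy values (assumptions): two retrieved documents, two candidate answers.
p_doc = {"d1": 0.7, "d2": 0.3}
p_ans = {
    "d1": {"Paris": 0.9, "Lyon": 0.1},
    "d2": {"Paris": 0.2, "Lyon": 0.8},
}
p_y = marginal_answer_probs(p_doc, p_ans)
print(p_y)
```

Here p(Paris|x) = 0.7 * 0.9 + 0.3 * 0.2 = 0.69 and p(Lyon|x) = 0.7 * 0.1 + 0.3 * 0.8 = 0.31: a document that is retrieved with low probability still contributes, but with proportionally less weight.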

Retrieval uses embedding similarity:

$$p(d|x) \propto \exp(\text{sim}(E_q(x), E_d(d)) / \tau)$$

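where $E_q$ and $E_d$ are the query and document encoders (often shared, e.g., BERT, Contriever), $\text{sim}$ is a similarity score such as a dot product or cosine similarity, and $\tau$ is a temperature.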
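The retrieval distribution can be sketched in a few lines of Python, assuming the embeddings have already been computed. The hand-written vectors stand in for encoder outputs, cosine similarity is one possible choice of sim, and the names `retrieval_probs`, `k`, and `tau` are illustrative assumptions rather than any library's API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieval_probs(query_vec, doc_vecs, k=2, tau=0.1):
    """Return {doc_index: p(d|x)}, a softmax of sim / tau over the top-k docs."""
    sims = [cosine(query_vec, d) for d in doc_vecs]
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    # Subtract the max similarity before exponentiating for numerical stability.
    best = max(sims[i] for i in top)
    weights = {i: math.exp((sims[i] - best) / tau) for i in top}
    z = sum(weights.values())
    return {i: w / z for i, w in weights.items()}


# Toy embeddings (assumptions standing in for E_q(x) and E_d(d)).
query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
probs = retrieval_probs(query, docs, k=2, tau=0.1)
print(probs)
```

A smaller `tau` sharpens the distribution toward the best match, while a larger one spreads weight more evenly across the top-k documents.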

Generation conditions on retrieved context:

$$p(y|x, d_1, \ldots, d_k) = \prod_t p(y_t | y_{<t}, x, d_1, \ldots, d_k)$$
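To make the factorization concrete, the sketch below concatenates the retrieved passages with the question and sums per-token log-probabilities. `toy_next_token_probs` is a hypothetical stand-in for a language model with hard-coded probabilities, and the prompt layout in `build_context` is likewise an assumption, not a prescribed format.

```python
import math


def build_context(question, docs):
    """Join the retrieved passages and the question into one prompt string."""
    passages = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"{passages}\nQuestion: {question}\nAnswer:"


def sequence_log_prob(next_token_probs, context, answer_tokens):
    """log p(y|x, d_1..d_k) = sum over t of log p(y_t | y_<t, x, d_1..d_k)."""
    total = 0.0
    prefix = []
    for token in answer_tokens:
        probs = next_token_probs(context, list(prefix))
        total += math.log(probs[token])
        prefix.append(token)
    return total


def toy_next_token_probs(context, prefix):
    """Hypothetical model: more confident when the context mentions the answer."""
    if not prefix:
        if "Paris" in context:
            return {"Paris": 0.8, "Lyon": 0.2}
        return {"Paris": 0.5, "Lyon": 0.5}
    return {"<eos>": 0.9, "Paris": 0.05, "Lyon": 0.05}


context = build_context(
    "What is the capital of France?",
    ["Paris is the capital of France."],
)
log_p = sequence_log_prob(toy_next_token_probs, context, ["Paris", "<eos>"])
print(math.exp(log_p))
```

With the passage in the context the toy model assigns the answer sequence probability 0.8 * 0.9 = 0.72; without any passage the same sequence would get 0.5 * 0.9 = 0.45.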

Canonical Papers

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Lewis et al., 2020, NeurIPS
