Legacy Concept Lab

Classifier-Free Guidance in Diffusion

CFG is a large part of why Stable Diffusion, DALL-E, and Midjourney produce high-quality, on-prompt images

Concept 42 of 100 · Generative Models · Phase 9: Advanced architectures & generation

Why It Matters for Modern Models

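  • CFG is a large part of why text-to-image systems such as Stable Diffusion, DALL-E, and Midjourney follow prompts closely while keeping image quality high
  • The guidance scale is the main user-facing knob for trading sample diversity against quality and prompt adherence in text-to-image generation
  • A single model handles both conditional and unconditional generation: during training, the conditioning is randomly dropped (replaced by a null input ∅) for a fraction of examples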
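The conditioning dropout mentioned in the last bullet can be sketched in a few lines. This is a minimal illustration, not any library's training code: `NULL_TOKEN`, `drop_conditioning`, and the default `p_uncond=0.1` are hypothetical names and values chosen for the example.

```python
import random

# Hypothetical stand-in for the "empty" conditioning (the null input).
NULL_TOKEN = None

def drop_conditioning(conds, p_uncond=0.1, rng=None):
    """Replace each conditioning entry with NULL_TOKEN with probability p_uncond.

    Training on the result teaches one network both the conditional
    prediction (entry kept) and the unconditional one (entry dropped).
    """
    if rng is None:
        rng = random.Random()
    return [NULL_TOKEN if rng.random() < p_uncond else c for c in conds]

batch = ["a red car", "a cat", "a mountain at dusk", "a bowl of fruit"]
# An exaggerated drop rate so the effect is visible in a tiny batch.
print(drop_conditioning(batch, p_uncond=0.5, rng=random.Random(0)))
```

In a real training loop the dropped entries would be mapped to a learned or fixed null embedding before being fed to the network; that step is omitted here.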

What Tutorials Skip

What is still poorly explained in textbooks and papers:

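  • With a guidance scale above 1, CFG extrapolates past the conditional prediction and beyond the data distribution, so high guidance can produce images that are less realistic yet more "prompt-adherent"
  • The guidance scale has a sweet spot that depends on the model and sampler: too low and the prompt is largely ignored, too high and artifacts and oversaturation appear
  • CFG is loosely analogous to temperature in LLMs: both reshape the model's distribution post hoc at inference time using a single scalar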
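The temperature analogy in the last bullet can be made concrete on a toy discrete distribution. The sketch below is an assumption-laden illustration, not a diffusion model: it applies the CFG combination rule to made-up logits and compares the result with temperature scaling. In the special case where the unconditional logits are uniform, a guidance scale of w behaves exactly like a temperature of 1/w.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(p):
    """Shannon entropy in nats; lower means a more peaked distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)

def temperature_scaled(logits, T):
    return softmax([l / T for l in logits])

def guided(uncond_logits, cond_logits, w):
    # Same combination rule as CFG, applied to log-probabilities.
    return softmax([u + w * (c - u) for u, c in zip(uncond_logits, cond_logits)])

uncond = [0.0, 0.0, 0.0, 0.0]   # uniform "unconditional" toy logits
cond = [2.0, 1.0, 0.0, -1.0]    # made-up "conditional" toy logits

for T in (2.0, 1.0, 0.5):
    print("T =", T, "entropy =", round(entropy(temperature_scaled(cond, T)), 3))
for w in (0.5, 1.0, 2.0):
    print("w =", w, "entropy =", round(entropy(guided(uncond, cond, w)), 3))
```

Both loops print the same entropies in the same order, since raising w and lowering T sharpen the distribution identically here. With non-uniform unconditional logits the two knobs no longer coincide, which is why the analogy is only loose.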

Interactive Visualization

Core Math (Optional Deep Dive)

If you want intuition first, start with the key equation and the visualization. Come back here for the full walkthrough.

Key Equation
$$\tilde{\epsilon}_\theta = \epsilon_\theta(\emptyset) + w \cdot (\epsilon_\theta(c) - \epsilon_\theta(\emptyset))$$

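CFG combines the conditional and unconditional noise predictions linearly:

$$\tilde{\epsilon}_\theta(x_t, c) = \epsilon_\theta(x_t, \emptyset) + w \cdot (\epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \emptyset))$$

where $w$ is the guidance scale (typically 3-15 for text-to-image). Setting $w = 0$ recovers the unconditional prediction and $w = 1$ the conditional one; values between 0 and 1 interpolate, while $w > 1$ extrapolates past the conditional prediction.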
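As a minimal sketch of the combination rule, assuming the two noise predictions are already available as flat lists of floats (the function name `cfg_combine` and the numbers are made up for illustration):

```python
def cfg_combine(eps_uncond, eps_cond, w):
    """Guided noise prediction: eps_uncond + w * (eps_cond - eps_uncond), per element."""
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# Made-up predictions for a 3-element "image".
eps_uncond = [0.10, -0.20, 0.05]
eps_cond = [0.30, -0.10, 0.00]

for w in (0.0, 1.0, 7.5):
    print(w, cfg_combine(eps_uncond, eps_cond, w))
```

At scale 0 the result is the unconditional prediction, at scale 1 it is the conditional one (up to floating-point rounding), and at 7.5 each element is pushed well past the conditional value along the direction from unconditional to conditional. In practice both predictions come from the same network, evaluated once with the prompt and once with the null conditioning.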

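Equivalently in score space, writing $s(x_t) = \nabla_{x_t} \log p(x_t)$ and using Bayes' rule, $\nabla_{x_t} \log p(c|x_t) = \nabla_{x_t} \log p(x_t|c) - \nabla_{x_t} \log p(x_t)$:

$$\tilde{s}(x_t, c) = s(x_t) + w \cdot \nabla_{x_t} \log p(c|x_t)$$

Higher $w$ amplifies the conditioning signal, trading diversity for fidelity.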
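The diversity-for-fidelity trade can be checked in closed form on a 1D Gaussian toy. The guided score above equals $(1-w)\,s(x_t) + w\,s(x_t|c)$, which at a single noise level is the score of a density proportional to $p(x_t)^{1-w}\,p(x_t|c)^{w}$. The sketch below assumes both densities are Gaussian, with made-up parameters; `guided_gaussian` is a hypothetical helper, not part of any library.

```python
def guided_gaussian(mu_u, var_u, mu_c, var_c, w):
    """Mean and variance of the density proportional to p(x)^(1-w) * p(x|c)^w,
    where p(x) = N(mu_u, var_u) and p(x|c) = N(mu_c, var_c)."""
    # Precisions (inverse variances) combine linearly in the exponent.
    precision = (1.0 - w) / var_u + w / var_c
    if precision <= 0:
        raise ValueError("guided density is not normalizable for these settings")
    mean = ((1.0 - w) * mu_u / var_u + w * mu_c / var_c) / precision
    return mean, 1.0 / precision

# Toy setup: broad unconditional N(0, 4), narrower conditional N(2, 1).
for w in (0.0, 1.0, 3.0):
    mean, var = guided_gaussian(0.0, 4.0, 2.0, 1.0, w)
    print(f"w={w}: mean={mean:.3f}, variance={var:.3f}")
```

In this toy the mean moves from 0 to 2 to 2.4 as the scale goes from 0 to 1 to 3, while the variance shrinks from 4 to 1 to 0.4: the guided distribution overshoots the conditional mean and becomes narrower than either input. Real diffusion sampling applies the guided score across many noise levels, so this single-level picture is only an approximation of what a sampler produces.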

Canonical Papers

Classifier-Free Diffusion Guidance

Ho & Salimans, 2022, NeurIPS Workshop
