Legacy Concept Lab
Score Matching & Score-Based Generative Models
#36 | Score Matching | Generative Models
Key equation: s(x) = \nabla_x \log p(x)
Phase 4: Generative modeling families
Concept 36 of 100
Why It Matters for Modern Models
- Score functions are the mathematical foundation of diffusion models—the denoiser learns the score at each noise level
- Explains why diffusion training is "just regression": the network predicts the noise ε, and the optimal prediction equals -σ × the score of the noised density
- Unifies VAEs, diffusion, and energy-based models through the lens of learning ∇log p(x)
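The "just regression" claim can be checked numerically on a toy problem. The sketch below assumes 1D Gaussian data N(mu, tau^2) and a single noise level sigma, chosen because the noised density is then N(mu, tau^2 + sigma^2) and its score is known in closed form. The names fit_noise_regressor and noisy_score are illustrative, not from any library. An ordinary least-squares fit of ε on the noisy input should land close to -σ × score.

```python
import random

def fit_noise_regressor(mu, tau, sigma, n=100_000, seed=0):
    """Least-squares fit of eps ~ slope * x_noisy + intercept on simulated pairs."""
    rng = random.Random(seed)
    noisy, eps = [], []
    for _ in range(n):
        x = rng.gauss(mu, tau)        # clean sample from the toy data distribution
        e = rng.gauss(0.0, 1.0)       # standard Gaussian noise
        noisy.append(x + sigma * e)   # x_noisy = x + sigma * eps
        eps.append(e)
    mean_noisy = sum(noisy) / n
    mean_eps = sum(eps) / n
    cov = sum((a - mean_noisy) * (b - mean_eps) for a, b in zip(noisy, eps)) / n
    var = sum((a - mean_noisy) ** 2 for a in noisy) / n
    slope = cov / var
    intercept = mean_eps - slope * mean_noisy
    return slope, intercept

def noisy_score(x_noisy, mu, tau, sigma):
    """Exact score of the noised marginal N(mu, tau^2 + sigma^2)."""
    return -(x_noisy - mu) / (tau ** 2 + sigma ** 2)

# Toy settings (assumptions for illustration only).
mu, tau, sigma = 1.0, 2.0, 0.5
slope, intercept = fit_noise_regressor(mu, tau, sigma)

x_query = 3.0
predicted_eps = slope * x_query + intercept                    # learned noise prediction
scaled_score = -sigma * noisy_score(x_query, mu, tau, sigma)   # -sigma * score
print(predicted_eps, scaled_score)
```

The two printed numbers should agree up to sampling error. In a real diffusion model the linear fit is replaced by a neural network and the single sigma by a schedule of noise levels, but the regression target is the same.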
What Tutorials Skip
What is still poorly explained in textbooks and papers:
- The score is a vector field pointing "uphill" toward higher density; sampling starts from pure noise and follows this field back toward the data as the noise level decreases
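- Why denoising works: for Gaussian noise of scale σ, the optimal denoiser predicts E[x|x̃], and by Tweedie's formula E[x|x̃] = x̃ + σ² ∇log p(x̃), so the score of the noised density is the denoising residual (E[x|x̃] - x̃) / σ²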
- Score matching avoids computing intractable partition functions—you only need gradients, not absolute probabilities
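Two of the points above (the score points uphill, and only gradients are needed) can be illustrated with unadjusted Langevin dynamics. This is a minimal sketch, assuming a 1D unnormalized Gaussian bump as the target and a finite-difference score. The constant 123.0 is a made-up stand-in for an unknown log-normalizer, and the function names are hypothetical. The constant cancels in the gradient, so the sampler never needs it.

```python
import math
import random

def unnormalized_log_density(x):
    """Log of an unnormalized Gaussian bump centered at 2.0 with std 0.5.
    The +123.0 stands in for an unknown log-normalizer (assumption)."""
    return -((x - 2.0) ** 2) / (2 * 0.5 ** 2) + 123.0

def score(x, h=1e-4):
    """Central finite difference of the log-density; additive constants cancel."""
    return (unnormalized_log_density(x + h) - unnormalized_log_density(x - h)) / (2 * h)

def langevin_sample(rng, steps=300, step_size=0.01):
    """Start from broad noise, then repeatedly step uphill along the score
    while injecting fresh Gaussian noise (unadjusted Langevin dynamics)."""
    x = rng.gauss(0.0, 3.0)
    for _ in range(steps):
        x = x + step_size * score(x) + math.sqrt(2 * step_size) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)
samples = [langevin_sample(rng) for _ in range(1000)]
sample_mean = sum(samples) / len(samples)
sample_std = math.sqrt(sum((s - sample_mean) ** 2 for s in samples) / len(samples))
print(sample_mean, sample_std)
```

The sample mean and standard deviation should come out near 2.0 and 0.5, with a small bias from the finite step size. Score-based generative models follow the same recipe, with a learned score network in place of the finite difference and a sequence of decreasing noise levels.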
Interactive Visualization
Core Math (Optional Deep Dive)
If you want intuition first, start with the key equation and the visualization. Come back here for the full walkthrough.
Key Equation
The score function is the gradient of log-density:
s(x) = \nabla_x \log p(x)
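Score matching learns without knowing the normalizing constant:
J(\theta) = \mathbb{E}_{p(x)} \left[ \operatorname{tr}\left( \nabla_x s_\theta(x) \right) + \tfrac{1}{2} \| s_\theta(x) \|^2 \right]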
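To make this concrete, here is a minimal sketch of the standard (Hyvärinen) score matching objective, E[tr(∇s) + ½‖s‖²], specialized to 1D with an assumed linear model s(x) = a·x + b. In 1D the trace term is just ds/dx = a, and the minimizer has a closed form that uses only sample moments. The function names and the Gaussian toy data are assumptions for illustration.

```python
import random

def score_matching_loss(a, b, xs):
    """Empirical Hyvarinen objective for s(x) = a*x + b in 1D:
    mean over samples of ds/dx + 0.5 * s(x)^2, where ds/dx = a."""
    return sum(a + 0.5 * (a * x + b) ** 2 for x in xs) / len(xs)

def fit_linear_score(xs):
    """Closed-form minimizer of the empirical loss above.
    Setting both partial derivatives to zero gives a = -1/var, b = mean/var."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return -1.0 / var, mean / var

# Toy data: samples from N(1, 2^2). Only samples are used, never the density.
rng = random.Random(0)
xs = [rng.gauss(1.0, 2.0) for _ in range(50_000)]
a, b = fit_linear_score(xs)
print(a, b)
```

For this data the true score is -(x - 1) / 4, so a should be close to -0.25 and b close to 0.25. The density and its normalizing constant are never evaluated, which is the point of the objective.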
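Denoising score matching (practical form):
J_{\mathrm{DSM}}(\theta) = \mathbb{E}_{p(x)\, q_\sigma(\tilde{x} \mid x)} \left[ \tfrac{1}{2} \| s_\theta(\tilde{x}) - \nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x) \|^2 \right]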
For Gaussian noise x̃ = x + σε with ε ~ N(0, I), the optimal score (the regression target) is ∇_x̃ log q_σ(x̃|x) = -(x̃ - x)/σ² = -ε/σ.
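One point that is easy to miss: the denoising objective is minimized by the score of the noise-perturbed density, not the score of the clean data. The sketch below evaluates a Monte Carlo estimate of the denoising score matching loss, ½(s(x̃) + (x̃ - x)/σ²)², for two candidate scores. It assumes 1D data from N(0, 1) and σ = 1, so the noised density is N(0, 2). The names dsm_loss, noised_score, and clean_score are illustrative.

```python
import random

def dsm_loss(score_fn, clean, sigma, rng):
    """Monte Carlo estimate of the denoising score matching loss in 1D."""
    total = 0.0
    for x in clean:
        eps = rng.gauss(0.0, 1.0)
        x_noisy = x + sigma * eps
        target = -(x_noisy - x) / sigma ** 2   # conditional score, equals -eps / sigma
        total += 0.5 * (score_fn(x_noisy) - target) ** 2
    return total / len(clean)

# Toy settings (assumptions for illustration only).
mu, tau, sigma = 0.0, 1.0, 1.0
data_rng = random.Random(0)
clean = [data_rng.gauss(mu, tau) for _ in range(50_000)]

def noised_score(x):
    """Score of the noised marginal N(mu, tau^2 + sigma^2)."""
    return -(x - mu) / (tau ** 2 + sigma ** 2)

def clean_score(x):
    """Score of the clean data density N(mu, tau^2)."""
    return -(x - mu) / tau ** 2

# Same noise seed for both candidates so the comparison is paired.
loss_noised = dsm_loss(noised_score, clean, sigma, random.Random(1))
loss_clean = dsm_loss(clean_score, clean, sigma, random.Random(1))
print(loss_noised, loss_clean)
```

The noised-marginal score should give the lower loss. The loss does not reach zero even at the optimum, because the per-sample target -ε/σ is itself noisy; only its conditional mean given x̃ is learnable.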