Domain Neighborhood
Information Theory
How we measure information and mismatch between distributions: entropy, cross-entropy, KL divergence, mutual information, and why they appear everywhere in ML.
Recommended Route
Start here, then follow the prerequisites forward.
This sequence is ordered for learning rather than inventory: lower difficulty, fewer prerequisites, and more central concepts come first.
- 01 KL Divergence (Relative Entropy)
KL divergence measures the directional mismatch between two distributions as an expected log-probability ratio, so swapping the two distributions generally changes the value; it explains cross-entropy training, variational inference, and KL-regularized alignment.
14 min | code | demo | after Distributions, Cross-Entropy
Check Distributions first if the symbols feel slippery.
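As a quick preview of what the KL notebook covers, here is a minimal sketch (not taken from the notebook itself) of the three quantities for discrete distributions. The function names and the example distributions `p` and `q` are illustrative assumptions, and the code assumes `q` assigns nonzero probability wherever `p` does.

```python
import math

def entropy(p):
    # H(p) = -sum_i p_i * log(p_i), in nats; terms with p_i = 0 contribute 0.
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i); assumes q_i > 0 wherever p_i > 0.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    # KL(p || q) = sum_i p_i * log(p_i / q_i); same assumption on q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

kl_pq = kl_divergence(p, q)  # mismatch measured under p
kl_qp = kl_divergence(q, p)  # mismatch measured under q

print(f"KL(p || q) = {kl_pq:.4f}")
print(f"KL(q || p) = {kl_qp:.4f}")
print(f"H(p, q) - H(p) = {cross_entropy(p, q) - entropy(p):.4f}")
```

The two KL values differ, which is what "directional" means here, and the last line matches KL(p || q) because cross-entropy decomposes as H(p, q) = H(p) + KL(p || q). That decomposition is why minimizing cross-entropy against a fixed target distribution is equivalent to minimizing KL divergence from it.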
All Published Notebooks