GANs & Adversarial Divergence Minimization
Canonical Papers
- Generative Adversarial Nets (Goodfellow et al., 2014)
- Wasserstein GAN (Arjovsky et al., 2017)

Core Mathematics
Original GAN objective:

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$
The optimal discriminator is $D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$; substituting it back, the generator's objective becomes $2\,\mathrm{JS}(p_{\text{data}} \,\|\, p_g) - \log 4$, so at the optimum training minimizes the Jensen–Shannon divergence between the model and data distributions.
WGAN replaces the JS divergence with the Earth-Mover (Wasserstein-1) distance, which requires a Lipschitz constraint on the critic $f$ (enforced by weight clipping in the original paper).
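A minimal sketch of one training step under the original objective, on 1-D toy data; the tiny MLPs, optimizer settings, and data distribution are illustrative assumptions, not the paper's setup. The generator uses the non-saturating variant (maximize $\log D(G(z))$) recommended in the original paper:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # generator: z -> x
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # discriminator logits
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

real = 2.0 + 0.5 * torch.randn(64, 1)   # stand-in for data samples (toy)
z = torch.randn(64, 8)                  # latent noise

# Discriminator step: ascend V(D, G) by pushing D(real) -> 1, D(fake) -> 0.
fake = G(z).detach()                    # detach so this step doesn't update G
loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: non-saturating loss, maximize log D(G(z))
# instead of minimizing log(1 - D(G(z))).
loss_g = bce(D(G(z)), torch.ones(64, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```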
Key Equation

By Kantorovich–Rubinstein duality, the Wasserstein-1 distance the WGAN critic estimates is

$$W_1(p_{\text{data}}, p_g) = \sup_{\|f\|_L \le 1} \; \mathbb{E}_{x \sim p_{\text{data}}}[f(x)] - \mathbb{E}_{x \sim p_g}[f(x)]$$
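A sketch of estimating this dual with a small weight-clipped critic, in the spirit of the original WGAN; the toy distributions, architecture, and clip value are assumptions. Since clipping only bounds the Lipschitz constant up to an architecture-dependent factor, the estimate recovers $W_1$ up to a scale:

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # critic
opt_f = torch.optim.RMSprop(f.parameters(), lr=5e-5)
c = 0.01  # clipping threshold, a crude Lipschitz bound (illustrative value)

real = 2.0 + 0.5 * torch.randn(256, 1)   # samples from p_data (toy)
fake = -2.0 + 0.5 * torch.randn(256, 1)  # samples from p_g (toy, held fixed)

for _ in range(200):
    # Critic ascends E_data[f(x)] - E_g[f(x)]; we minimize the negative.
    loss = -(f(real).mean() - f(fake).mean())
    opt_f.zero_grad(); loss.backward(); opt_f.step()
    for p in f.parameters():
        p.data.clamp_(-c, c)             # weight clipping after each step

w1_estimate = (f(real).mean() - f(fake).mean()).item()  # ~ W1 up to scale
```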
Why It Matters for Modern Models
- Adversarial min-max ideas appear in adversarial training and some alignment techniques
- GAN-like training still influential in high-fidelity image/video generation
Missing Intuition
What is still poorly explained in textbooks and papers:
- Why the JS divergence yields vanishing gradients when the data and model supports don't overlap, and how Wasserstein distances fix this (see the numerical sketch after this list)
- Geometric visualizations of discriminator decision surfaces over latent manifolds
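The first point can be made concrete numerically: for two point masses at distance $\theta$, the JS divergence is stuck at $\log 2$ for every $\theta > 0$ (so it carries no gradient in $\theta$), while $W_1 = \theta$ gives a useful gradient everywhere. A toy sketch of this, using SciPy's `wasserstein_distance` on a discretized support:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def js_divergence(p, q):
    """Jensen-Shannon divergence (in nats) between two discrete distributions."""
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

grid = np.arange(100, dtype=float)   # support points 0..99
p = np.zeros(100); p[0] = 1.0        # point mass at x = 0
for theta in (1, 10, 50):
    q = np.zeros(100); q[theta] = 1.0            # point mass at x = theta
    js = js_divergence(p, q)                     # = log 2 ~ 0.693 for any theta > 0
    w1 = wasserstein_distance(grid, grid, u_weights=p, v_weights=q)  # = theta
    print(f"theta={theta:2d}  JS={js:.3f}  W1={w1:.1f}")
```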