Legacy Concept Lab

Video World Models

Merges generative modeling with dynamics modeling

Concept 97 of 100 | Generative Models | Phase 13



Phase 13: Cutting-edge 2024-2025 research

Why It Matters for Modern Models

What is still poorly explained in textbooks and papers:

If you want intuition first, start with the key equation and the visualization. Come back here for the full walkthrough.

Key Equation

p_\theta(x_{1:T}|c) = \prod_{t=1}^T p_\theta(x_t | x_{<t}, c)

A video world model treats video as learned dynamics. Autoregressive factorization over frames $x_{1:T}$ given conditioning $c$:

p_\theta(x_{1:T}|c) = \prod_{t=1}^T p_\theta(x_t | x_{<t}, c)
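To make the factorization concrete, here is a minimal sketch in plain Python. It is a toy, not a real video model: each "frame" is one of three discrete states, the conditioning signal `c` is assumed to be a number in (0, 1) giving the probability that a frame repeats, and the names `conditional`, `log_prob`, `sample`, and `total_probability` are all hypothetical. The toy conditional only looks at the most recent frame, which is a special case of conditioning on the full history $x_{<t}$.

```python
import itertools
import math
import random

# Hypothetical toy setup: a "frame" is one of 3 discrete states.
NUM_STATES = 3


def conditional(prev_frames, c):
    """Return p(x_t | x_{<t}, c) as a list of probabilities over states.

    Assumption: c in (0, 1) is the probability of repeating the last frame.
    """
    if not prev_frames:
        return [1.0 / NUM_STATES] * NUM_STATES  # uniform first frame
    move = (1.0 - c) / (NUM_STATES - 1)
    return [c if s == prev_frames[-1] else move for s in range(NUM_STATES)]


def log_prob(frames, c):
    """log p(x_{1:T} | c) = sum over t of log p(x_t | x_{<t}, c)."""
    total = 0.0
    for t, frame in enumerate(frames):
        total += math.log(conditional(frames[:t], c)[frame])
    return total


def sample(T, c, rng):
    """Draw frames one at a time, each conditioned on the frames so far."""
    frames = []
    for _ in range(T):
        probs = conditional(frames, c)
        frames.append(rng.choices(range(NUM_STATES), weights=probs)[0])
    return frames


def total_probability(T, c):
    """Sum p(x_{1:T} | c) over every possible length-T sequence."""
    return sum(
        math.exp(log_prob(list(seq), c))
        for seq in itertools.product(range(NUM_STATES), repeat=T)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    video = sample(T=5, c=0.8, rng=rng)
    print("sampled frames:", video)
    print("log-likelihood:", log_prob(video, c=0.8))
    print("total probability over all sequences:", total_probability(4, 0.8))
```

Because each conditional is a normalized distribution, the product over frames is one as well, so `total_probability` should come out at 1 up to floating-point error. Sampling uses the same conditionals in the same order, which is why autoregressive generation can be read as rolling the learned dynamics forward one frame at a time.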

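Diffusion over a latent $z$, where $t$ now indexes the noise level rather than the frame, $z_t$ is the noised latent, and $\epsilon_\theta$ predicts the added noise $\epsilon$: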

\min_\theta \mathbb{E}_{t,\epsilon}[\|\epsilon - \epsilon_\theta(z_t, t, c)\|^2]
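The objective above can be estimated by Monte Carlo, as in the following minimal sketch. Several pieces are assumptions that the text does not specify: a DDPM-style forward process $z_t = \sqrt{\bar\alpha_t}\, z_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon$, a simple linear schedule for $\bar\alpha_t$, and two hypothetical stand-ins for $\epsilon_\theta$ (an oracle that cheats by knowing $z_0$, and a predictor that always outputs zero). None of the toy predictors uses the conditioning `c`; it is passed through only to match the signature in the equation.

```python
import math
import random


def alpha_bar(t, num_steps):
    """Assumed linear schedule; stays strictly inside (0, 1) for t in 1..num_steps."""
    return 1.0 - t / (num_steps + 1)


def noise_latent(z0, t, eps, num_steps):
    """Assumed DDPM-style forward process: mix the clean latent with noise."""
    a = alpha_bar(t, num_steps)
    return [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]


def denoising_loss(eps_model, z0, c, num_steps, num_samples, rng):
    """Monte Carlo estimate of E_{t, eps} ||eps - eps_model(z_t, t, c)||^2."""
    total = 0.0
    for _ in range(num_samples):
        t = rng.randint(1, num_steps)                # random noise level
        eps = [rng.gauss(0.0, 1.0) for _ in z0]      # Gaussian noise
        z_t = noise_latent(z0, t, eps, num_steps)
        pred = eps_model(z_t, t, c)
        total += sum((e - p) ** 2 for e, p in zip(eps, pred))
    return total / num_samples


def make_oracle(z0, num_steps):
    """Hypothetical predictor that recovers eps exactly by knowing z0."""
    def eps_model(z_t, t, c):
        a = alpha_bar(t, num_steps)
        return [(zt - math.sqrt(a) * z) / math.sqrt(1.0 - a)
                for zt, z in zip(z_t, z0)]
    return eps_model


def zero_model(z_t, t, c):
    """Hypothetical baseline that always predicts zero noise."""
    return [0.0] * len(z_t)


if __name__ == "__main__":
    z0 = [0.5, -1.0, 0.25, 2.0]  # a toy 4-dimensional latent
    oracle = make_oracle(z0, num_steps=10)
    print("oracle loss:", denoising_loss(oracle, z0, None, 10, 200, random.Random(0)))
    print("zero-model loss:", denoising_loss(zero_model, z0, None, 10, 200, random.Random(0)))
```

The oracle drives the loss to roughly zero, while the zero predictor's loss sits near the latent dimension, since each noise component has unit variance. A trained $\epsilon_\theta$ would land somewhere between the two.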

On this view, video generators are learned simulators of the physical world.

OpenAI (2024)
