Legacy Concept Lab

Infinite Context Architectures

Turns entire repos/books into "single prompt" territory

Concept 100 of 100, Efficiency, Phase 13

Phase 13: Cutting-edge 2024-2025 research

Why It Matters for Modern Models

What is still poorly explained in textbooks and papers:

If you want intuition first, start with the key equation and the visualization. Come back here for the full walkthrough.

Key Equation

M_{t+1} = \text{Update}(M_t, K_t, V_t)

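Compressive memory for bounded cost. Standard softmax attention over a sequence of length $n$ builds an $n \times n$ score matrix, so its cost grows as $O(n^2)$:

\mathrm{Attn}(Q,K,V) = \mathrm{softmax}\left(\frac{QK^\top}{\sqrt{d}}\right)V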
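
To make the quadratic term concrete, here is a minimal NumPy sketch of the attention formula above for a single head, with no masking or batching. The function name `softmax_attention` and the toy sizes are illustrative assumptions, not taken from any particular library.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Scaled dot-product attention for one head (no masking, no batching)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                          # shape (n, n): the quadratic term
    scores = scores - scores.max(axis=-1, keepdims=True)   # stabilise the exponentials
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
n, d = 8, 4                                                # toy sizes, chosen arbitrarily
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out, weights = softmax_attention(Q, K, V)
print(weights.shape)  # (8, 8)
```

Doubling `n` quadruples the number of entries in `weights`, which is the growth that the bounded-memory approaches below try to avoid.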

Infini-attention: maintain a fixed-size memory $M$ that is updated online as the sequence streams past, so the memory cost stays bounded with respect to the sequence length $n$.
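
One way to read $\text{Update}(M_t, K_t, V_t)$ is as an additive, linear-attention-style write into a fixed-size matrix. The sketch below is a simplified NumPy illustration under that assumption: it uses an ELU+1 feature map and a running normaliser `z`, and it leaves out the local attention, gating, and other details of the full method. The helper names (`elu_plus_one`, `update_memory`, `retrieve`) and the toy sizes are hypothetical.

```python
import numpy as np

def elu_plus_one(x):
    # sigma(x) = ELU(x) + 1: equals x + 1 for x > 0 and exp(x) otherwise,
    # so every feature is strictly positive.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def update_memory(M, z, K, V):
    """Fold one segment's keys and values into the memory (assumed additive rule)."""
    sK = elu_plus_one(K)            # (seg_len, d_k)
    M_new = M + sK.T @ V            # (d_k, d_v): size does not depend on seg_len
    z_new = z + sK.sum(axis=0)      # (d_k,) running normaliser
    return M_new, z_new

def retrieve(M, z, Q, eps=1e-6):
    """Read from the memory with one segment's queries."""
    sQ = elu_plus_one(Q)            # (seg_len, d_k)
    return (sQ @ M) / (sQ @ z + eps)[:, None]

rng = np.random.default_rng(0)
d_k, d_v, seg_len, num_segments = 4, 4, 8, 5   # toy sizes, chosen arbitrarily
M = np.zeros((d_k, d_v))
z = np.zeros(d_k)
outputs = []
for _ in range(num_segments):
    Q = rng.standard_normal((seg_len, d_k))
    K = rng.standard_normal((seg_len, d_k))
    V = rng.standard_normal((seg_len, d_v))
    outputs.append(retrieve(M, z, Q))   # read what earlier segments stored
    M, z = update_memory(M, z, K, V)    # then fold this segment in
print(M.shape, z.shape)  # (4, 4) (4,)
```

However many segments are processed, `M` and `z` keep the same shape, which is the sense in which the memory cost is bounded with respect to $n$.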

Ring Attention: Distribute long sequences across devices via blockwise ring communication.
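
The blockwise ring pattern can be illustrated on a single machine. The sketch below is a simulation only, under several assumptions: "devices" are list indices, the ring send/receive is a list rotation, attention is non-causal, and partial results are merged with a running-max (online) softmax. A real implementation would overlap communication with compute across actual accelerators. The function names are hypothetical.

```python
import numpy as np

def full_attention(Q, K, V):
    """Reference: ordinary softmax attention over the whole sequence."""
    s = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    return (w / w.sum(axis=-1, keepdims=True)) @ V

def ring_attention_sim(Q, K, V, num_devices):
    """Simulate blockwise ring attention; sequence length must divide evenly."""
    d = Q.shape[-1]
    q_blocks = np.split(Q, num_devices)              # each device keeps its query block
    kv_blocks = list(zip(np.split(K, num_devices), np.split(V, num_devices)))
    # Per-device running softmax state: row max, row sum, unnormalised output.
    m = [np.full(q.shape[0], -np.inf) for q in q_blocks]
    l = [np.zeros(q.shape[0]) for q in q_blocks]
    acc = [np.zeros((q.shape[0], V.shape[-1])) for q in q_blocks]
    for _ in range(num_devices):
        for dev in range(num_devices):
            k_blk, v_blk = kv_blocks[dev]            # the K/V block currently held
            s = q_blocks[dev] @ k_blk.T / np.sqrt(d)
            m_new = np.maximum(m[dev], s.max(axis=-1))
            scale = np.exp(m[dev] - m_new)           # rescale earlier partial sums
            p = np.exp(s - m_new[:, None])
            l[dev] = l[dev] * scale + p.sum(axis=-1)
            acc[dev] = acc[dev] * scale[:, None] + p @ v_blk
            m[dev] = m_new
        # Ring step: every device passes its K/V block to the next device.
        kv_blocks = kv_blocks[-1:] + kv_blocks[:-1]
    return np.concatenate([a / li[:, None] for a, li in zip(acc, l)])

rng = np.random.default_rng(0)
n, d, num_devices = 16, 4, 4                         # toy sizes, chosen arbitrarily
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
ring_out = ring_attention_sim(Q, K, V, num_devices)
print(np.allclose(ring_out, full_attention(Q, K, V)))
```

Each simulated device only ever holds one query block and one key/value block at a time, so the largest score matrix it forms is $(n/p) \times (n/p)$ for $p$ devices, while the final result is intended to match full attention.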

Munkhdalai et al., 2024 (Google)

Explore this concept from different angles — like a mathematician would.