Knowledge Graph

Ask the map what to learn next.

Use the graph as a research-learning instrument: map papers to concepts, inspect typed edges, find prerequisite repairs, and attach equations, labs, claims, and discussion to the exact idea.

23 route nodes · 24 typed edges · domain notebook paths

[Figure: a learner question moving along one highlighted path through a larger concept graph.]

Route: Keep the learner's question visible while the graph chooses the next concept.

First Route Readiness

First route: attention to serving

A new learner should see one concrete path: map or search a claim, inspect the route graph, open Efficient Attention, predict the KV-cache lever, review the claim boundary, and leave a reproduction or repair note attached to the exact object.

Question: How does attention become a serving bottleneck?

Next repair: Efficient Attention

Lab state: KV prediction lab live

Knowledge Graph: The graph names the next concept and route objects.

Save or inspect the route, then open search or Efficient Attention.

Next: Search, Open Efficient Attention, KV prediction checkpoint

Scope labels required: Preview/local where applicable; claim review before comparison; reproduction notes stay object-attached.

No benchmark result, hosted compute, automatic expert review, or live runtime-performance claim is made by this v0.

Graph Reading

A graph is useful only when it explains the next move.

The point is not to admire complexity; it is to find the shortest honest path from confusion to a concept you can test.

Learner: Find the missing prerequisite.

When a page feels too hard, use incoming edges to locate the idea that should be repaired first.

Researcher: Spot reusable mechanisms.

Follow shared neighborhoods to see when optimization, probability, or representation ideas are doing the same job.

Professor: Turn edges into a lecture route.

Use the graph to choose the minimum sequence that makes a derivation or demo feel earned.
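
The learner move described above (follow incoming edges back to the idea that needs repair) can be sketched in a few lines. This is a minimal illustration, not the site's implementation: the `EDGES` list, the tuple layout, and the function names are assumptions, and the four edges are simply copied from the typed edges named elsewhere on this page.

```python
from collections import defaultdict

# Hypothetical typed-edge list: (source, edge_type, target).
# The concept names come from this page; the data layout is assumed.
EDGES = [
    ("Attention", "prerequisite", "Efficient Attention"),
    ("Efficient Attention", "implementation dependency", "LLM Serving"),
    ("RoPE", "breaks when", "Long Context"),
    ("FlashAttention", "systems tradeoff", "LLM Serving"),
]


def incoming_index(edges):
    """Group edges by target so each concept lists what points into it."""
    index = defaultdict(list)
    for source, edge_type, target in edges:
        index[target].append((source, edge_type))
    return index


def missing_prerequisites(concept, known, edges=EDGES):
    """Return incoming neighbors of `concept` the learner has not marked known."""
    return [
        (source, edge_type)
        for source, edge_type in incoming_index(edges)[concept]
        if source not in known
    ]


# A learner who knows FlashAttention but finds LLM Serving too hard:
print(missing_prerequisites("LLM Serving", known={"FlashAttention"}))
```

The sketch treats every incoming edge type as a repair candidate; a stricter version might filter to `prerequisite` edges only.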

Learning Route

Ask the graph what to learn next.

Pick a question and the graph turns it into prerequisites, equations, labs, and honest links between ideas. When you arrive from the mapper, your current paper route stays visible.

Answer

A paper claims it compresses the KV cache. What should I inspect?

Follow the runtime path from attention math to serving bottlenecks, then test which memory term is being reduced.

Preview only; your saved route is unchanged.

01 Attention: core equation
02 Efficient Attention: cache mechanics
03 RoPE: position behavior
04 FlashAttention: memory movement
05 Long Context: stress regime
06 LLM Serving: prefill/decode split
07 Decoding: runtime loop
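
One way to "test which memory term is being reduced" is to compare the factors of the KV memory formula listed under Learning Objects on this page (Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes) for a baseline and a compressed configuration. The sketch below is illustrative only: the configuration values are invented, and the helper names are assumptions rather than anything the page defines.

```python
# Factor names follow the page's formula:
# Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes
FACTORS = ("B", "N_layers", "T", "H_kv", "d_head", "bytes")


def kv_bytes(cfg):
    """Evaluate the KV memory formula for one configuration dict."""
    total = 2  # one tensor for keys, one for values
    for name in FACTORS:
        total *= cfg[name]
    return total


def reduced_terms(baseline, compressed):
    """Map each factor that changed to its ratio compressed / baseline."""
    return {
        name: compressed[name] / baseline[name]
        for name in FACTORS
        if compressed[name] != baseline[name]
    }


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
baseline = {"B": 1, "N_layers": 32, "T": 4096, "H_kv": 32, "d_head": 128, "bytes": 2}
compressed = dict(baseline, H_kv=8, bytes=1)  # fewer KV heads and 1-byte elements

print(reduced_terms(baseline, compressed))             # {'H_kv': 0.25, 'bytes': 0.5}
print(kv_bytes(compressed) / kv_bytes(baseline))       # 0.125
```

Reading a paper this way separates the levers: sharing KV heads shrinks H_kv, quantization shrinks bytes, and token eviction or windowing shrinks the effective T. Each lever carries a different quality risk, which is why the route asks which term moved before comparing results.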

Route Engine

Compute the honest next path.

Choose what the learner already knows and a frontier target. The engine runs a weighted shortest-path query over typed concept edges, then names the first missing repair instead of hand-waving across the gap.

Known concepts

Research target

6 nodes. Repair Efficient Attention next.

weighted cost: 5.50

Preview only; changing chips does not overwrite your saved route.

01 Attention: weighted copying
02 Efficient Attention: memory pressure
03 Long Context: stress regime
04 SSM Hybrids: fixed-state sequence models
05 Parallel Scan: trainable recurrence (planned)
06 State-Space Duality: Mamba-style bridge (planned)

prerequisite: Attention -> Efficient Attention

You need Q/K/V weighted copying before cache and memory optimizations are meaningful.

invented to fix: Efficient Attention -> Long Context

Long context exposes the cost of storing and repeatedly reading all prior keys and values.

same pressure: Long Context -> SSM Hybrids

Fixed-state sequence models become attractive when KV memory grows with sequence length.

implementation dependency: SSM Hybrids -> Parallel Scan

Recurrent-looking models need parallel sequence computation to train at scale.

paper-specific bridge: Parallel Scan -> State-Space Duality

Modern Mamba-style papers often connect recurrence, convolution, and attention-like views.
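
The route query in this section (a weighted shortest path over typed edges, followed by naming the first concept the learner is missing) can be sketched with Dijkstra's algorithm. The edge list mirrors the five edges above, but the per-type costs in `TYPE_COST` are invented for illustration: the page does not publish its weights, so the total here is not meant to reproduce the 5.50 shown above. Function names are likewise assumptions.

```python
import heapq

# Hypothetical cost per edge type (NOT the page's real weights).
TYPE_COST = {
    "prerequisite": 1.0,
    "invented to fix": 1.0,
    "same pressure": 1.5,
    "implementation dependency": 1.0,
    "paper-specific bridge": 2.0,
}

EDGES = [
    ("Attention", "prerequisite", "Efficient Attention"),
    ("Efficient Attention", "invented to fix", "Long Context"),
    ("Long Context", "same pressure", "SSM Hybrids"),
    ("SSM Hybrids", "implementation dependency", "Parallel Scan"),
    ("Parallel Scan", "paper-specific bridge", "State-Space Duality"),
]


def shortest_route(edges, start, target):
    """Dijkstra over directed typed edges; returns (path, total_cost)."""
    graph = {}
    for src, edge_type, dst in edges:
        graph.setdefault(src, []).append((dst, TYPE_COST[edge_type]))

    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))

    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[target]


def first_missing_repair(path, known):
    """First concept on the route that the learner has not marked known."""
    for concept in path:
        if concept not in known:
            return concept
    return None


path, cost = shortest_route(EDGES, "Attention", "State-Space Duality")
print(len(path), cost)                                  # 6 6.5
print(first_missing_repair(path, known={"Attention"}))  # Efficient Attention
```

With only one chain of edges the shortest path is trivial; the same function picks the cheapest alternative once the graph contains competing routes to the target.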

Typed Edges

Why the route is honest

prerequisite: Attention -> Efficient Attention

Cache tricks only matter after Q/K/V shape and softmax copying are clear.

implementation dependency: Efficient Attention -> LLM Serving

Decode is often bandwidth-bound because it repeatedly reads cached keys and values.

breaks when: RoPE -> Long Context

Position extrapolation changes the geometry of attention scores.

systems tradeoff: FlashAttention -> LLM Serving

FlashAttention changes memory traffic, not the attention function itself.
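
The last edge claims that the computation can be reorganized without changing its result. A small pure-Python check makes that concrete for a single query: a blockwise pass with a running maximum and running sum (the online-softmax idea that FlashAttention builds on) returns the same output as the standard formula. This is a toy sketch under simplifying assumptions (one query, plain lists, no GPU tiling), not the actual FlashAttention kernel.

```python
import math


def attention_reference(q, keys, values):
    """Standard softmax attention for one query; holds all scores at once."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]


def attention_blockwise(q, keys, values, block=2):
    """Same function, computed one block of keys/values at a time."""
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block):
        k_blk = keys[start:start + block]
        v_blk = values[start:start + block]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old maximum
        # (exp(-inf) is 0.0, so the first block starts from zero).
        correction = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * correction + sum(weights)
        acc = [
            a * correction + sum(w * v[j] for w, v in zip(weights, v_blk))
            for j, a in enumerate(acc)
        ]
        running_max = new_max
    return [a / running_sum for a in acc]


q = [0.5, -1.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0], [0.3, 0.7]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
ref = attention_reference(q, keys, values)
blk = attention_blockwise(q, keys, values, block=2)
print(all(abs(a - b) < 1e-9 for a, b in zip(ref, blk)))  # True
```

The blockwise version never holds more than `block` scores at a time, which is the sense in which memory traffic changes while the attention function stays the same.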

Learning Objects

What anchors this route

paper: KV cache compression result

carried from mapper when available

equation: Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes

live in serving module

toy-experiment: MHA vs MQA vs GQA calculator

live: KV memory calculator

claim: capacity vs bandwidth vs quality

object question
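
The equation and the MHA vs MQA vs GQA comparison above can be tried offline with a few lines of Python. In the formula, B is batch size, N_layers the number of layers, T the number of cached tokens, H_kv the number of key/value heads, d_head the per-head dimension, the factor 2 covers keys plus values, and bytes is the size of one stored element. The model shape below is hypothetical, and this sketch is not the site's live calculator.

```python
def kv_cache_bytes(batch, n_layers, seq_len, n_kv_heads, d_head, bytes_per_elem):
    """Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes."""
    return batch * n_layers * seq_len * n_kv_heads * d_head * 2 * bytes_per_elem


# Hypothetical shape: 32 layers, head size 128, 4096 cached tokens, 2-byte elements.
shape = dict(batch=1, n_layers=32, seq_len=4096, d_head=128, bytes_per_elem=2)

# The variants differ only in how many KV heads are stored.
# MHA: one KV head per query head (32 assumed here); GQA: shared groups; MQA: one.
kv_heads = {"MHA": 32, "GQA": 8, "MQA": 1}

GIB = 1024 ** 3
for name, heads in kv_heads.items():
    size = kv_cache_bytes(n_kv_heads=heads, **shape)
    print(f"{name}: {size / GIB:.4f} GiB")
# MHA: 2.0000 GiB
# GQA: 0.5000 GiB
# MQA: 0.0625 GiB
```

Only H_kv changes between the three rows, so the memory ratio equals the head ratio. That is the capacity side of the claim; bandwidth and quality effects need separate evidence.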

Research Room

Resolve the route object before moving on

Pick a paper, equation, lab, claim, or misconception. The selected object becomes the saved route focus for the next companion prompt.

paper: carried from mapper when available

KV cache compression result

Anchored question

What should we inspect about KV cache compression result before treating this route object as understood?

Local action draft: unavailable (needs a canonical object key)

This object needs a content object key before local action drafts can attach to it.

No local draft saved.

Evidence to inspect

- Paper metadata, abstract claim, and any pasted source spans
- Mapped concepts, equations, and prerequisite repairs
- Which claims are source-checked and which remain local-preview only

What would resolve this

- The paper contribution is mapped to a concrete mechanism
- Unverified author, date, benchmark, and novelty claims are separated from learning claims
- The next concept or lab action is specific enough to resume later

Grounded AI handoff

I am working in Continuous Function's research reading room.
Object: paper - KV cache compression result
Context: carried from mapper when available
Anchor id: paper/graph/kv-cache/kv-cache-compression-result
Open question: What should we inspect about KV cache compression result before treating this route object as understood?
Evidence to inspect:
- Paper metadata, abstract claim, and any pasted source spans
- Mapped concepts, equations, and prerequisite repairs
- Which claims are source-checked and which remain local-preview only
What would resolve this:
- The paper contribution is mapped to a concrete mechanism
- Unverified author, date, benchmark, and novelty claims are separated from learning claims
- The next concept or lab action is specific enough to resume later
Answer as a careful research tutor: stay source-grounded, separate verified evidence from assumptions, name the relevant math objects, and end with one next action.