Paper Mapper

Map claims into learning paths

Map a claim, equation, arXiv link, or short excerpt into concepts; inspect which claims are grounded in the pasted source and Continuous Function pages; then turn the route into equations, toy witnesses, and discussion objects.

source check, confidence labels, saved in this browser, not live yet actions


Figure: A paper clue becoming one equation, one prerequisite repair, and a lab target.

Map: One paper clue, one equation, one next experiment.

Input

Paste a claim, equation, or paper clue.

Start with a title, abstract, arXiv link, specific claim, equation, or short excerpt you are allowed to use. The mapper builds a grounded source check, then suggests concepts, equations, labs, and discussion prompts.

Paper title, arXiv link, abstract, or model-report excerpt

Optional user-provided PDF for page-bounded source lookup (no PDF selected)

Use only files you have rights to access; extracted spans are for citation and verification, not full-text republication.

Grounding Status

detected: arXiv-like clue

matched terms: kv, cache, serving, memory

Static preview grounds against the pasted clue and Continuous Function pages. When live source lookup is connected, arXiv metadata and page-bounded spans can verify author, date, title, abstract, and local source claims server-side.
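As a rough illustration of how a static preview might produce the "detected" and "matched terms" fields above, here is a minimal sketch. The vocabulary `SERVING_TERMS`, the function name `grounding_status`, and the "plain text clue" label are assumptions made for this example; only the four matched terms and the "arXiv-like clue" label come from the page.

```python
import re

# Hypothetical vocabulary: the page only shows kv, cache, serving, memory as matches.
SERVING_TERMS = {"kv", "cache", "serving", "memory", "inference", "gqa", "mqa"}

# New-style arXiv identifiers look like YYMM.NNNN or YYMM.NNNNN,
# optionally followed by a version suffix such as v2.
ARXIV_ID = re.compile(r"\b(\d{4}\.\d{4,5})(v\d+)?\b")


def grounding_status(clue: str) -> dict:
    """Return a small status record for a pasted clue."""
    match = ARXIV_ID.search(clue)
    words = re.findall(r"[a-z0-9]+", clue.lower())
    # dict.fromkeys removes duplicates while keeping first-seen order.
    matched = [w for w in dict.fromkeys(words) if w in SERVING_TERMS]
    return {
        "detected": "arXiv-like clue" if match else "plain text clue",
        "arxiv_id": match.group(1) if match else None,
        "matched_terms": matched,
    }


status = grounding_status(
    "arXiv:2405.12345 We reduce KV cache memory for long-context LLM serving."
)
print(status)
```

Nothing here contacts arXiv; the identifier is only recognized by shape, which is why metadata stays unverified until live lookup is connected.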

Source Check

source-check version: continuous-function.paper-mapper.v1

arXiv id: 2405.12345

ready: Classify source

Detected arXiv input.

needs live lookup: Resolve paper metadata

Live source lookup can fetch arXiv metadata for 2405.12345.

ready: Parse PDF bytes

No PDF parser needed for pasted text.

ready: Extract equations

2 equation-like snippets found with line/page spans.

needs live lookup: Run source-grounded mapper

Server-side AI should map only from retrieved metadata, extracted equations, and internal concept snippets.
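The "Extract equations" step in the checklist above could be approximated with a simple line scan. This is a sketch under stated assumptions, not the mapper's actual extractor: the "equation-like" heuristic, the function name `extract_equation_spans`, and the sample page text are all hypothetical.

```python
import re

# Assumption: a line is "equation-like" if it has an "=" with text on both sides.
EQUATION_LIKE = re.compile(r"\S\s*=\s*\S")


def extract_equation_spans(pages: list[str]) -> list[dict]:
    """Return equation-like lines with 1-based page and line numbers."""
    spans = []
    for page_no, page_text in enumerate(pages, start=1):
        for line_no, line in enumerate(page_text.splitlines(), start=1):
            if EQUATION_LIKE.search(line):
                spans.append({"page": page_no, "line": line_no, "text": line.strip()})
    return spans


# Hypothetical two-page input, one string per page.
pages = [
    "KV cache cost\nMem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes",
    "Attention\n\nsoftmax(QK^T / sqrt(d_k))V = weighted value copy",
]
spans = extract_equation_spans(pages)
for span in spans:
    print(f"Page {span['page']}, line {span['line']}: {span['text']}")
```

A heuristic this loose will also flag comparisons and assignments in prose, so each span should still be checked against the source before it is carried forward.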

local equation spans

Page 3, line 2: Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes

Page 4, line 3: softmax(QK^T / sqrt(d_k))V = weighted value copy
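To make the "weighted value copy" reading of the second span concrete, here is a minimal pure-Python sketch of scaled dot-product attention for a single query. The function names and the toy keys and values are assumptions chosen for illustration.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: list[float], keys: list[list[float]], values: list[list[float]]) -> list[float]:
    """Compute softmax(q K^T / sqrt(d_k)) V for one query vector."""
    d_k = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k) for key in keys]
    weights = softmax(scores)
    # Weighted copy: each output coordinate is a weighted average of the value rows.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]


# Toy example: two cached tokens with orthogonal keys.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out = attend([4.0, 0.0], keys, values)
print(out)
```

The query aligns with the first key, so the output is pulled mostly toward the first value row. Every cached key and value takes part in this sum, which is why the cache must hold both for every past token.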

local source spans

Page 3, line 4

We reduce KV cache memory for long-context LLM serving by sharing or compressing value states.

Mapped Result

KV Cache Compression / Long-Context Serving

Confidence: high for serving papers that mention KV cache, long context, GQA, MQA, memory, or inference.

The paper is probably about reducing decode-time memory while preserving enough token retrieval quality.

The central question is what information can be dropped, shared, quantized, or recomputed without breaking downstream attention.

Next Best Move

Continue this paper as one study route.

Repair Efficient Attention first, carry one equation, then test the claim in the smallest available lab.

Read first: Efficient Attention

Inspect: KV memory, Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes

Try: KV memory lab

Compare MHA, MQA, and GQA memory while sweeping context length from 4k to 128k tokens.

Then ask: Which KV-cache memory term is this method reducing?

Do not treat author, date, benchmark, or exact novelty claims as verified until source lookup is connected.


Source Check

What the answer is allowed to use

user input: User supplied paper clue

KV cache, long-context, serving, memory, inference terms in the pasted text.

concept page: Continuous Function: Attention

Defines the Q/K/V weighted-copy mechanism.

concept page: Continuous Function: Efficient Attention

Connects attention to cache and memory movement.

concept page: Continuous Function: LLM Serving

Frames prefill/decode and runtime bottlenecks.

planned ingestion: External paper metadata

arXiv/OpenAlex/Semantic Scholar lookup should verify title, authors, date, and equations.

Claim Check

Supported, not overclaimed

high confidence: The map should inspect decode-time KV memory before architectural novelty.

Sources: User supplied paper clue, Continuous Function: Efficient Attention, Continuous Function: LLM Serving

high confidence: The core equation to ground is attention over available keys and values.

Sources: Continuous Function: Attention, Continuous Function: Efficient Attention

medium confidence: External author/date/venue claims are intentionally withheld until live ingestion is connected.

Sources: External paper metadata
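One way to picture the claim check above is as a small record per claim plus an allowlist of sources that are already available. This is a hedged sketch: the `Claim` class, the `AVAILABLE_SOURCES` set, and the `is_grounded` rule are assumptions for illustration, and only the source names and confidence labels are taken from the page.

```python
from dataclasses import dataclass

# Source names copied from the page. Treating these four as "available" and
# external paper metadata as "not yet available" is an assumption of this sketch.
AVAILABLE_SOURCES = {
    "User supplied paper clue",
    "Continuous Function: Attention",
    "Continuous Function: Efficient Attention",
    "Continuous Function: LLM Serving",
}


@dataclass
class Claim:
    text: str
    confidence: str
    sources: list[str]

    def is_grounded(self) -> bool:
        """Grounded only if the claim cites sources and all of them are available."""
        return bool(self.sources) and all(s in AVAILABLE_SOURCES for s in self.sources)


claims = [
    Claim(
        "The map should inspect decode-time KV memory before architectural novelty.",
        "high",
        ["User supplied paper clue", "Continuous Function: Efficient Attention",
         "Continuous Function: LLM Serving"],
    ),
    Claim(
        "External author/date/venue claims are withheld until live ingestion is connected.",
        "medium",
        ["External paper metadata"],
    ),
]
grounded = [c.text for c in claims if c.is_grounded()]
print(grounded)
```

Under this rule the external-metadata claim stays out of the grounded set until the planned ingestion source is added to the allowlist.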

01 Attention: core equation
02 Efficient Attention: memory mechanism
03 RoPE: position behavior
04 Long Context: failure pressure
05 LLM Serving: systems consequence
06 Decoding: runtime loop

Carried Equations

Source boxes ready for explanation

local preview

pasted-text: Page 3, line 2

bbox pending

Attention -> Efficient Attention -> RoPE -> Long Context -> LLM Serving -> Decoding

explainer prompt

Explain this equation from the paper step by step. Define every symbol, name the tensor or scalar shapes when possible, say what assumption the equation depends on, and point to the smallest prerequisite repair. Connect it to this concept route: Attention -> Efficient Attention -> RoPE -> Long Context -> LLM Serving -> Decoding. Equation: Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes

Read First

Prerequisite repairs

Scaled dot-product attention
KV cache memory growth
GQA/MQA head sharing
Prefill vs. decode latency

Equation Explainer

KV memory

Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes

high confidence: Grounded in Continuous Function: Efficient Attention, Continuous Function: LLM Serving

Memory grows linearly with batch size, layer count, cached tokens, KV heads, head dimension, and bytes per element (numerical precision); the factor of 2 counts keys and values separately.
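A worked instance of the KV memory equation may help. The sketch below evaluates it for a hypothetical configuration (32 layers, 8 KV heads, head dimension 128, 16-bit values); those numbers are assumptions chosen for round arithmetic, not figures from any paper.

```python
def kv_cache_bytes(batch, n_layers, tokens, kv_heads, head_dim, bytes_per_elem):
    """Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes."""
    # The factor of 2 stores one key vector and one value vector per token.
    return batch * n_layers * tokens * kv_heads * head_dim * 2 * bytes_per_elem


# Hypothetical configuration: one sequence of 4096 cached tokens in 16-bit precision.
mem = kv_cache_bytes(
    batch=1, n_layers=32, tokens=4096, kv_heads=8, head_dim=128, bytes_per_elem=2
)
print(mem, "bytes =", mem / 1024**2, "MiB")
```

With these assumed values the cache takes 536,870,912 bytes, or 512 MiB, and doubling any single factor doubles the total.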

Toy Lab

Safe visualization spec

{
  "type": "kv-cache",
  "status": "live",
  "goal": "Compare MHA, MQA, and GQA memory while sweeping context length from 4k to 128k tokens.",
  "outputs": [
    "curve",
    "failure_case",
    "reading_note"
  ]
}
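The lab goal in the spec above can be previewed offline with a short sweep. This is a sketch, not the lab's implementation: the 32-layer, 32-query-head model, the head dimension of 128, the 16-bit precision, and the choice of 8 GQA groups are all assumptions made for illustration.

```python
def kv_cache_gib(tokens, kv_heads, n_layers=32, head_dim=128, batch=1, bytes_per_elem=2):
    """KV cache size in GiB from Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes."""
    total = batch * n_layers * tokens * kv_heads * head_dim * 2 * bytes_per_elem
    return total / 1024**3


# Hypothetical 32-query-head model: MHA keeps 32 KV heads,
# GQA shares them across 8 groups, MQA keeps a single KV head.
VARIANTS = {"MHA": 32, "GQA": 8, "MQA": 1}
CONTEXTS = [4096 * 2**i for i in range(6)]  # 4k, 8k, 16k, 32k, 64k, 128k tokens

curve = {name: [kv_cache_gib(t, heads) for t in CONTEXTS] for name, heads in VARIANTS.items()}
for name, points in curve.items():
    print(name, [round(p, 2) for p in points])
```

Under these assumptions the MHA cache grows from 2 GiB at 4k tokens to 64 GiB at 128k, while MQA ends at 2 GiB. Only the H_kv term changes between variants, so the three curves are scaled copies of one another.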

Research Room

Resolve the exact paper object

Pick a paper, claim, equation, or lab object. The room keeps evidence, assumptions, and the AI handoff attached to that object.

paper: Local paper map

arXiv 2405.12345

Anchored question

Which tokens or heads can be compressed without hurting retrieval?

Local action draft: draft unavailable (needs a canonical object key)

This object needs a content object key before local action drafts can attach to it.

Draft note / Next action

No local draft saved.

Evidence to inspect

Source ids to inspect: input
Paper metadata, abstract claim, and any pasted source spans
Mapped concepts, equations, and prerequisite repairs
Which claims are source-checked and which remain local-preview only

What would resolve this

The paper contribution is mapped to a concrete mechanism
Unverified author, date, benchmark, and novelty claims are separated from learning claims
The next concept or lab action is specific enough to resume later

Grounded AI handoff

I am working in Continuous Function's research reading room.

Object: paper - arXiv 2405.12345
Context: Local paper map
Anchor id: paper/paper-map/kv-cache/arxiv-2405-12345-h-6c7m48
Open question: Which tokens or heads can be compressed without hurting retrieval?

Evidence to inspect:
- Source ids to inspect: input
- Paper metadata, abstract claim, and any pasted source spans
- Mapped concepts, equations, and prerequisite repairs
- Which claims are source-checked and which remain local-preview only

What would resolve this:
- The paper contribution is mapped to a concrete mechanism
- Unverified author, date, benchmark, and novelty claims are separated from learning claims
- The next concept or lab action is specific enough to resume later

Answer as a careful research tutor: stay source-grounded, separate verified evidence from assumptions, name the relevant math objects, and end with one next action.
