Start with motion, shape, and analogy.
Each concept opens with a mental model you can carry before the notation starts.
Geometry, routing, density, and flow before formalism.
Study Dot Product
Continuous Function
Learn one frontier-AI route first: how attention math becomes KV-cache memory, long-context pressure, and serving tradeoffs through concepts, code witnesses, and a prediction-first lab.

Start Here
Follow one source-scoped path from attention math to KV-cache pressure, then repair the next concept before browsing the full atlas.
Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes
Domain Atlas
Vectors, matrices, and linear maps: the language of representations, optimization, and modern deep learning.
Rates of change and accumulation. Calculus is the language behind gradients, optimization, continuous-time dynamics, and why backprop works as efficiently as it does.
How we train models: gradients, learning rates, curvature, and the practical tricks that make deep nets converge.
The classical supervised-learning spine: models, losses, generalization, evaluation, and the experiment habits that make modern AI results trustworthy.
Uncertainty made precise: events, random variables, expectations, and the distributions that models learn.
How we measure information and mismatch between distributions: entropy, cross-entropy, KL divergence, mutual information, and why they appear everywhere in ML.
The sequence model backbone: tokenization, self-attention, positional encodings, and the transformer block that powers modern LLMs.
Embeddings and the geometry of meaning: similarity, normalization, contrastive objectives, and why vector spaces become usable interfaces for models.
How models generate: likelihood, latent variables, diffusion/score models, flows, and the training tricks that make sampling work.
How loss and capability change with parameters, data, and compute; how to allocate a training budget; and why some abilities appear suddenly at scale.
How we shape model behavior: preference learning, reward modeling, KL-regularized fine-tuning, and the failure modes that appear when you optimize the wrong thing.
How we make models cheaper to train and serve: quantization, distillation, low-rank adapters, sparsity, and the memory/latency tradeoffs that dominate real deployments.
How models run in production: prefill vs decode, KV cache memory, batching and scheduling, and the techniques that make latency and throughput practical.
The engineering discipline around trustworthy model use: evaluation pipelines, dataset and model versioning, monitoring, drift, reproducibility, and operational tradeoffs.
Editorial Method
The site is not an archive of notes. It is a repeatable notebook format for turning abstract ML ideas into something you can reason through from first contact to implementation.
Each concept opens with a mental model you can carry before the notation starts.
Geometry, routing, density, and flow before formalism.
Study Dot Product
Definitions, symbols, and derivations stay close to the intuition instead of replacing it.
Derivatives, KL, vector spaces, and chain rule done step by step.
Study Chain Rule
The symbols on the page become short NumPy or PyTorch fragments you can actually run.
Code mirrors the math instead of hiding it behind frameworks.
Study Adam Optimizer
Interactive diagrams turn abstract machinery into something you can stress-test, poke, and break.
Attention, diffusion, routing, and serving concepts become explorable.
Study Scaled Dot-Product Attention & Transformer Layers
Curated Entries
You can browse the full atlas, but the fastest way in is to follow one editorial thread from prerequisites to modern applications.
Build the minimum mathematical language needed to understand optimization and modern model training.
Open track domain
Move from attention mechanics to the engineering decisions that make long-context inference work.
Use the foundations to step into generation, alignment, and inspectable model representations.
First Route Readiness
A new learner should see one concrete path: map or search a claim, inspect the route graph, open Efficient Attention, predict the KV-cache lever, review the claim boundary, and leave a reproduction or repair note attached to the exact object.
Start the local trail or open the route graph.
No benchmark result, hosted compute, automatic expert review, or live runtime-performance claim is made by this v0.
Paper Mapper
Clickable equation
Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes
The mapper turns the bottleneck into a calculator, then links the symbols back to attention and serving.
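A minimal sketch of that calculator in plain Python, written directly from the equation above. The function name, the argument names, and the example configuration are illustrative assumptions, not part of the site; the factor of 2 is read here as one key tensor plus one value tensor per layer, and `bytes` as bytes per stored element.

```python
def kv_cache_bytes(batch, n_layers, seq_len, n_kv_heads, d_head, bytes_per_elem=2):
    """Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes."""
    # The literal 2 counts the key tensor and the value tensor cached per layer.
    return batch * n_layers * seq_len * n_kv_heads * d_head * 2 * bytes_per_elem


# Hypothetical configuration, chosen only to make the arithmetic easy to check.
mem = kv_cache_bytes(
    batch=1, n_layers=32, seq_len=4096, n_kv_heads=8, d_head=128, bytes_per_elem=2
)
print(f"{mem / 2**20:.0f} MiB")  # prints "512 MiB"
```

Every factor enters linearly, so doubling the context length T or the batch size B doubles the cache, while halving H_kv or the bytes per element halves it.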
Example route note
The paper trades exact key-value retention for a bounded decode-time memory budget. Learn attention first, test the memory curve, then inspect where retrieval quality can break.
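One way to "test the memory curve" is sketched below, under loud assumptions: the bounded budget is modeled as a simple cap on the number of cached tokens (`budget_tokens`, a hypothetical parameter), batch size is fixed at 1, and the model configuration is invented for illustration. This is a toy stand-in, not the method of any particular paper.

```python
def kv_bytes_per_token(n_layers, n_kv_heads, d_head, bytes_per_elem=2):
    # Per-token cost from Mem_KV with B = 1 and T = 1; the 2 counts keys and values.
    return n_layers * n_kv_heads * d_head * 2 * bytes_per_elem


def memory_curve(context_lengths, budget_tokens=None, **model):
    """Cache size in bytes at each context length.

    budget_tokens=None keeps every token (exact retention); otherwise the
    cache is assumed to hold at most budget_tokens tokens (toy model).
    """
    per_token = kv_bytes_per_token(**model)
    curve = []
    for t in context_lengths:
        kept = t if budget_tokens is None else min(t, budget_tokens)
        curve.append(kept * per_token)
    return curve


# Hypothetical model configuration and context lengths.
model = dict(n_layers=32, n_kv_heads=8, d_head=128, bytes_per_elem=2)
lengths = [1024, 4096, 16384]

exact = memory_curve(lengths, **model)
bounded = memory_curve(lengths, budget_tokens=4096, **model)

for t, e, b in zip(lengths, exact, bounded):
    print(f"T={t:>6}  exact={e / 2**20:>5.0f} MiB  bounded={b / 2**20:>5.0f} MiB")
```

The exact curve grows linearly with context length while the bounded curve flattens at the budget. The sketch says nothing about retrieval quality, which is the part the route note asks the reader to inspect separately.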
Reader Lenses
Start from a path, predict the demo, then ask the companion to repair the exact gap.
Start foundations
Use concept pages as small, testable models before jumping back to papers or experiments.
Inspect a bridge
Use the same page as a lecture spine: intuition first, then math, code, and manipulation.
Open a lesson
Live Proof Loop
Continuous Function helps a serious learner move from a claim or equation to prerequisite repair, a runnable toy witness, a scoped caveat, and one next question without losing the thread.
Paste a specific equation, architecture name, arXiv ID, or short excerpt to find the prerequisite path in the atlas.
The graph places the paper inside concepts, equations, prior work, and the ideas the reader should repair first.
Small authored examples turn the central mechanism into sliders, curves, tensors, and scoped failure cases.
Discussion prompts attach to the exact concept, equation, lab, or paper claim so the next conversation has a clear anchor.
First Launch Route
The first complete module should help students, engineers, and researchers understand transformer inference from the attention equation to serving tradeoffs.
Open Attention To Serving
Ask Beside The Notebook
Breaks down notation, prerequisites, equations, demos, and the current learner question.
Maps a pasted claim, equation, arXiv ID, or model-report excerpt to concepts, sources, and caveats.
Points to authored browser demos and small local examples that test one mechanism at a time.
Separates what the toy witness shows from what depends on sources, scale, or expert review.
Ask About This Object
Continuous Function should feel like a mathematical playground, a serious course, a code lab, and a patient tutor in the same place. The companion helps learners ask sharper questions, recover missing prerequisites, turn notation into code, and test understanding against the exact equation or live demo in view.
Object Companion
Choose a path, study a notebook, manipulate the demo, then use the selected object to explain, quiz, connect, or debug the idea.