Local Learning Trail

Start with a path this browser can remember.

Begin the public Attention to Serving route. The route, current question, and next checkpoint stay local to this browser, so the home page can help you continue before any account system exists.

1. Pick a route

Start from a curated path instead of a cold catalog.

2. Read the invariant

Move from concept to equation to runnable witness.

3. Reveal a lab

Commit a prediction, then carry one observation home.

Suggested route: Attention -> Efficient Attention -> RoPE -> FlashAttention -> Long Context -> LLM Serving -> Decoding

Public, local-only, and ready without login.

Interactive AI Learning

Map AI research to intuition, math, and runnable code

Turn a paper claim, equation, architecture, or system tradeoff into prerequisite concepts, visual explanations, toy witnesses, and source-scoped questions without giving up rigor.

14 domains, 83 live concepts, 83 code examples, 70 interactive demos
A research paper flowing into concept routes, equations, and an experiment surface.

Paper Mapper

Turn a paper clue into a route, one equation, and one experiment.

Preview

Clickable equation

Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes

The mapper turns the bottleneck into a calculator, then links the symbols back to attention and serving.
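As a minimal sketch of that calculator, the equation above can be written as a single Python function. The argument names and the example configuration are hypothetical and chosen only for illustration. Reading the factor 2 as "one key tensor plus one value tensor per layer" is the usual interpretation, stated here as an assumption.

```python
def kv_cache_bytes(batch, n_layers, seq_len, n_kv_heads, d_head, bytes_per_elem=2):
    """Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes."""
    # Assumption: the factor 2 counts keys and values separately.
    return batch * n_layers * seq_len * n_kv_heads * d_head * 2 * bytes_per_elem


# Hypothetical configuration, not taken from any specific model report.
mem = kv_cache_bytes(batch=1, n_layers=32, seq_len=4096,
                     n_kv_heads=8, d_head=128, bytes_per_elem=2)
print(f"{mem / 2**20:.0f} MiB")  # prints "512 MiB"
```

Every symbol enters as a plain product, so doubling the context length T or the batch size B doubles the cache. That linear growth is the bottleneck the serving concepts return to.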

AI synthesis

What changed from old work?

The paper trades exact key-value retention for a bounded decode-time memory budget. Learn attention first, test the memory curve, then inspect where retrieval quality can break.
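One way to picture the memory curve is to compare an exact cache, which keeps every token, with a cache capped at a fixed token budget. The sketch below assumes the simplest possible policy, keeping at most `budget_tokens` entries. The preview does not say how the paper picks which entries to keep, so the policy, the function names, and the default configuration are all hypothetical.

```python
def kv_bytes(tokens, n_layers=32, n_kv_heads=8, d_head=128, bytes_per_elem=2, batch=1):
    # Mem_KV = B * N_layers * T * H_kv * d_head * 2 * bytes
    return batch * n_layers * tokens * n_kv_heads * d_head * 2 * bytes_per_elem


def bounded_kv_bytes(seq_len, budget_tokens, **cfg):
    # Hypothetical policy: never retain more than budget_tokens entries.
    return kv_bytes(min(seq_len, budget_tokens), **cfg)


# Memory in MiB for the exact cache versus a 4096-token budget.
for t in (1024, 4096, 16384, 65536):
    exact_mib = kv_bytes(t) // 2**20
    bounded_mib = bounded_kv_bytes(t, budget_tokens=4096) // 2**20
    print(t, exact_mib, bounded_mib)
```

The exact column keeps growing linearly with context length while the bounded column flattens once the budget is reached. The tokens beyond the budget are the ones that are no longer retained, which is where retrieval quality can break.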

Extract equations
Find prerequisites
Open a lab
Ask one question

Reader Lenses

One atlas, different depths of use.

Learner

A guided route through prerequisites, notation, code, and demos.

Start from a path, predict the demo, then ask the companion to repair the exact gap.

Start foundations

Researcher

A quick way to inspect the mechanism, assumptions, and executable witness.

Use concept pages as small, testable models before jumping back to papers or experiments.

Inspect a bridge

Professor

A teachable sequence with visual hooks, derivation checkpoints, and failure regimes.

Use the same page as a lecture spine: intuition first, then math, code, and manipulation.

Open a lesson

Live Proof Loop

Map AI research to a route you can test.

Continuous Function helps a serious learner move from a claim or equation to prerequisite repair, a runnable toy witness, a scoped caveat, and one next question without losing the thread.

01 Claim

Map the claim

Paste a specific equation, architecture name, arXiv ID, or short excerpt to find the prerequisite path in the atlas.

02 Map

Reveal prerequisites

The graph places the paper inside concepts, equations, prior work, and the ideas the reader should repair first.

03 Lab

Run toy witnesses

Small authored examples turn the central mechanism into sliders, curves, tensors, and scoped failure cases.

04 Discuss

Ask one sharper question

Discussion prompts attach to the exact concept, equation, lab, or paper claim so the next conversation has a clear anchor.

First Study Module

Attention to serving, end to end.

The first complete module should help students, engineers, and researchers understand transformer inference from the attention equation to production tradeoffs.

Study module

Ask Beside The Notebook

Specific help attached to the route.

Concept Coach

Breaks down notation, prerequisites, equations, demos, and the current learner question.

Paper Mapper

Maps a pasted claim, equation, arXiv ID, or model-report excerpt to concepts, sources, and caveats.

Toy Witness Guide

Points to authored browser demos and small local examples that test one mechanism at a time.

Claim Scope Review

Separates what the toy witness shows from what depends on sources, scale, or expert review.

Status

What is live, next, and deliberately later.

The product stays trustworthy by separating today's source-grounded atlas from future contribution and compute stages.

Stage | Availability | For | Includes
Live now | Public preview | Readers today | Public notebooks, paper-map preview, concept routes, graph routes, Attention To Serving path, and reviewed claim scope where available.
Next | Invited loop | Careful testers | Object-attached feedback, reproduction-note templates, issue/task handoff, and local/BYO lab instructions.
Later | Not live | Maintained projects | Governed partner-compute experiments only after the lab loop, review path, budgets, and artifact policy work.

Object-Attached Learning

Guided lessons, rigorous notes, coding practice, and a companion attached to the object you are studying.

Continuous Function should feel like a mathematical playground, a serious course, a code lab, and a patient tutor in the same place. The companion helps learners ask sharper questions, recover missing prerequisites, turn notation into code, and test understanding against the exact equation or live demo in view.

Object Companion

Plan beside the atlas

Choose a path, study a notebook, manipulate the demo, then use the selected object to explain, quiz, connect, or debug the idea.

Context prompt

You are my AI learning companion for Continuous Function. Current context: homepage learning atlas. Learning surface: Continuous Function atlas. What this page says: Choose a path, study a notebook, manipulate the demo, then use the selected object to explain, quiz, connect, or debug the idea. Suggested next step: Pick a track and ask the companion to turn it into a short plan. Learner goal: Understand the idea. Learner comfort level: New to this. Preferred explanation style: Visual first. Task: Ask me 5 quick questions, then recommend the best Continuous Function learning path for my current level. Answer in a way that helps me learn: ask one clarifying question only if needed, use intuition before notation, and end with one thing I should try on the page.

Editorial Method

Every page follows the same teaching contract

The site is not an archive of notes. It is a repeatable notebook format for turning abstract ML ideas into something you can reason through from first contact to implementation.

01 Intuition

Start with motion, shape, and analogy.

Each concept opens with a mental model you can carry before the notation starts.

Geometry, routing, density, and flow before formalism.

Study Dot Product

02 Math

Write the objects down precisely.

Definitions, symbols, and derivations stay close to the intuition instead of replacing it.

Derivatives, KL, vector spaces, and chain rule done step by step.

Study Chain Rule

03 Code

Match notation to runnable Python.

The symbols on the page become short NumPy or PyTorch fragments you can actually run.

Code mirrors the math instead of hiding it behind frameworks.

Study Adam Optimizer

Domain Atlas

Navigate by mathematical territory

Browse the full atlas

Linear Algebra

Vectors, matrices, and linear maps: the language of representations, optimization, and modern deep learning.

9 concepts, 5 demos
Dot Product
Matrix Decompositions: Eigendecomposition, SVD, and Spectral Structure
Orthogonality, Projections, and Least-Squares Geometry

Calculus

Rates of change and accumulation. Calculus is the language behind gradients, optimization, continuous-time dynamics, and why backprop works as efficiently as it does.

6 concepts, 4 demos
Backpropagation
Computation Graphs
Derivatives

Optimization

How we train models: gradients, learning rates, curvature, and the practical tricks that make deep nets converge.

10 concepts, 4 demos
Adam Optimizer
Gradient Descent
Learning Rate Schedules: Warmup, Decay & Cycling

Machine Learning

The classical supervised-learning spine: models, losses, generalization, evaluation, and the experiment habits that make modern AI results trustworthy.

9 concepts, 9 demos
Bias-Variance Decomposition
Classification Metrics, Thresholds, and Calibration
Linear Regression & Least Squares

Probability

Uncertainty made precise: events, random variables, expectations, and the distributions that models learn.

6 concepts, 6 demos
Cross-Entropy
Distributions
Maximum Likelihood

Information Theory

How we measure information and mismatch between distributions: entropy, cross-entropy, KL divergence, mutual information, and why they appear everywhere in ML.

1 concept, 1 demo
KL Divergence (Relative Entropy)

Attention & Transformers

The sequence model backbone: tokenization, self-attention, positional encodings, and the transformer block that powers modern LLMs.

12 concepts, 11 demos
Scaled Dot-Product Attention & Transformer Layers
Efficient Attention at Scale: KV Cache, GQA & FlashAttention
FlashAttention: IO-Aware Attention

Representation Learning

Embeddings and the geometry of meaning: similarity, normalization, contrastive objectives, and why vector spaces become usable interfaces for models.

2 concepts, 2 demos
Representation Learning & Embedding Geometry
Sparse Autoencoders: Feature Dictionaries for Mechanistic Interpretability

Generative Models

How models generate: likelihood, latent variables, diffusion/score models, flows, and the training tricks that make sampling work.

5 concepts, 5 demos
Diffusion, Score-Based Models & Flow Matching
Flow Matching & Rectified Flows
Normalizing Flows: Tractable Density via Invertible Transforms

Scaling

How loss and capability change with parameters, data, and compute; how to allocate a training budget; and why some abilities appear suddenly at scale.

6 concepts, 6 demos
Overparameterization & Generalization (Double Descent)
Scaling Laws & Emergent Abilities

Alignment

How we shape model behavior: preference learning, reward modeling, KL-regularized fine-tuning, and the failure modes that appear when you optimize the wrong thing.

5 concepts, 5 demos
Direct Preference Optimization
Kahneman-Tversky Optimization
Process Reward Models: Step-Level Verifiers for Reasoning

Efficiency

How we make models cheaper to train and serve: quantization, distillation, low-rank adapters, sparsity, and the memory/latency tradeoffs that dominate real deployments.

5 concepts, 5 demos
Efficiency: Quantization, Distillation, LoRA & Sparse MoE
Knowledge Distillation: Learning from Teachers
Pruning: Removing Unnecessary Weights

LLM Systems

How models run in production: prefill vs decode, KV cache memory, batching and scheduling, and the techniques that make latency and throughput practical.

6 concepts, 6 demos
Decoding & Sampling: Temperature, Top-p & Inference-Time Control
LLM Serving at Scale: Prefill, Decode & Continuous Batching
MoE Serving & Scheduling: Token Dispatch, All-to-All, Disaggregated Parallelism

Production ML

The engineering discipline around trustworthy model use: evaluation pipelines, dataset and model versioning, monitoring, drift, reproducibility, and operational tradeoffs.

1 concept, 1 demo
Evaluation Pipelines

Curated Entries

Start with a thread, not a random page

You can browse the full atlas, but the fastest way in is to follow one editorial thread from prerequisites to modern applications.