AI Glossary
By Lee Robinson
I keep meeting the same words: pretraining, attention, KV cache, reward hacking. This is my running list of them, each explained the way I wish someone had explained it to me first. One idea per page, in plain English, with the technical term right after.
The order builds on itself. Foundations come first, then how text turns into numbers, then training, behavior, and serving. You can read it start to finish like a short book, or land on a single page and get what you need. It started as the notes behind my longer post, Understanding AI.
New here? Start with Machine Learning and follow the arrows.
Foundations
The mental model: what a model is and how it learns at all.
- Machine Learning — Software that learns patterns from data instead of being told every rule.
- Neural Networks — Layers of simple units that turn input numbers into a prediction.
- Deep Learning — Neural networks with many layers, which is where the power comes from.
- Parameters — The weights and biases a model tunes during training. The model is its parameters.
Tokens & Embeddings
How text becomes the numbers a model can work with.
Transformers
The architecture behind nearly every modern language model.
Training
How a model learns from data, one tiny adjustment at a time.
- Pretraining — Teaching a model to predict the next token across the open internet.
- Loss Functions — A single number that says how wrong a prediction was.
- Backpropagation — Walking the error backward to find each weight that caused it.
- Gradient Descent — Nudging weights downhill to shrink the loss, millions of times.
- Scaling Laws — The finding that more data, compute, and parameters predictably help.
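To make the loss and gradient descent entries concrete, here is a minimal sketch in plain Python. It fits a single weight with mean squared error. The toy data, learning rate, and step count are assumptions chosen for illustration, and the gradient is written out by hand, which is the job backpropagation does automatically in a real network.

```python
# Toy data generated by the "true" weight w = 2 (an assumption for this sketch).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def loss(w):
    # Mean squared error: one number that says how wrong the predictions are.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w):
    # Derivative of the loss with respect to w, worked out by hand.
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

def train(w=0.0, lr=0.01, steps=500):
    # Gradient descent: nudge w a small step downhill, many times.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

w = train()
print(f"learned w = {w:.4f}, loss = {loss(w):.6f}")
```

A real model does the same thing with billions of weights at once, using a framework to compute the gradients.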
Fine-Tuning & RL
Turning a raw base model into a useful, aligned assistant.
- Fine-Tuning — Adapting a pretrained model to a task with a smaller, curated dataset.
- Synthetic Data — Training data a model writes for itself, including the games it practices on.
- Reinforcement Learning — Learning from rewards on the model’s own attempts, not fixed answers.
- Rewards — The single number that tells RL which attempts to do more of.
- RLHF — Using human preferences to teach a model what a good answer looks like.
- Reward Models — A model trained to score outputs the way people would.
- Distillation — A small student model learning to copy a larger teacher.
Model Behavior
How tone, persistence, and personality get trained into a model.
- Behavior Rewards — Rewarding how a model works, not only whether the answer was right.
- Credit Assignment — Deciding which of many actions earned a reward that arrives at the end.
- On-Policy vs Off-Policy — Learning from your own attempts versus learning from someone else’s.
- Reward Hacking — A model gaming the metric instead of doing what you meant.
- Elongation — The slow drift toward longer outputs as RL keeps running.
- Alignment Tax — The capability you trade away when you train for safety and style.
- Model Specs — Writing down the intended behavior so training has a target.
- Judges — Code or an LLM that grades outputs to produce training signal.
Inference
Running a trained model in production, and why it gets fast.
- Inference — Using a trained model to generate output from new input.
- Autoregressive Generation — Producing text one token at a time, each conditioned on all the tokens before it.
- Prefill & Decode — The fast parallel read of your prompt, then the slow word-by-word write.
- KV Cache — Saving past attention work so each token isn’t recomputed from scratch.
- Quantization — Storing weights in fewer bits to fit more and run faster.
- Batching — Serving many requests on one weight read to share the cost.
- Speculative Decoding — A small model guesses ahead, the big model verifies in one pass.
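As one example from this list, quantization can be sketched in a few lines. This is a simplified symmetric int8 scheme with one scale for the whole list, assumed here for illustration. Production systems typically use per-channel or per-group scales and more careful rounding, and the function names are hypothetical.

```python
def quantize_int8(weights):
    # One shared scale maps the largest absolute weight to 127.
    # Fall back to 1.0 when every weight is zero to avoid dividing by zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats from the stored integers.
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, restored)
```

Each weight now fits in 8 bits instead of 16 or 32, at the cost of a rounding error of at most half the scale per weight.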
Using Models
The knobs and ideas you meet when you actually build on models.
- Temperature — The dial between focused, repeatable output and creative, random output.
- Chain of Thought — Letting a model think in steps before it answers.
- Prompt Engineering — Writing inputs that reliably get the output you want.
- Multimodal — Models that handle images, audio, and video alongside text.
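Temperature is the easiest of these to show in code. The sketch below assumes the common formulation: divide the model's raw scores (logits) by the temperature before turning them into probabilities with softmax. The logits are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print("T=0.5:", [round(p, 3) for p in cold])
print("T=2.0:", [round(p, 3) for p in hot])
```

At low temperature the top token takes most of the probability, so output is focused and repeatable. At high temperature the probabilities flatten, so sampling gets more random.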
Evals & Measurement
Turning fuzzy progress into numbers you can trust, and reading the charts honestly.
- Evaluation — Measuring whether a model is good with tests that match real use.
- pass@k — The chance a model solves a task at least once in k tries instead of one.
- Eval Separation — A good eval spreads models apart and ranks them the way you’d expect.
- Ablations — Removing one piece at a time to learn what actually helped.
- Pareto Frontier — The set of models where every gain on one axis costs you on another.
- Zig-Zag Charts — When a curve that should climb smoothly wobbles, and what that tells you.
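pass@k is a small formula, so a sketch may help. This uses the widely used unbiased estimator: sample n attempts, count the c that pass, and compute the chance that a random draw of k of them contains at least one pass. The sample counts below are made up.

```python
from math import comb

def pass_at_k(n, c, k):
    # n: attempts sampled, c: attempts that passed, k: tries allowed.
    # 1 minus the probability that all k drawn attempts are failures.
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical run: 10 attempts on one task, 3 of them passed.
for k in (1, 5, 10):
    print(f"pass@{k} = {pass_at_k(10, 3, k):.3f}")
```

With k = 1 this reduces to the plain pass rate c / n, and it climbs toward 1 as k grows, which is why pass@1 and pass@10 for the same model can look so different.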