Ablations

When a model gets better, the honest question is which change did it. A single training run usually moves several things at once: the data mix, a new reward term, a bigger batch, some trick someone read about last week. An ablation answers the question by removing one piece at a time and measuring what happens. The name comes from experimental science, where you take a part away to find out what it was doing.

The need is sharpest in reinforcement learning, where pipelines are tangled and results are noisy. Say you add reward shaping and the score jumps. Did the shaping cause it, or the larger batch you changed the same week, or luck? You can't know until you switch the shaping off, hold everything else fixed, and run again. If the gain disappears, the shaping was doing the work. If it survives, the cause is something else, so look elsewhere.
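The "switch one thing off, hold everything else fixed" step is easy to get wrong by hand, so it can help to let the config enforce it. What follows is a minimal sketch, not a real training setup: `RunConfig`, its fields, and the `ablate` helper are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class RunConfig:
    # Hypothetical knobs; the names and defaults are illustrative only.
    reward_shaping: bool = True
    batch_size: int = 512
    data_mix: str = "v2"
    seed: int = 0


def ablate(baseline: RunConfig, **changes) -> RunConfig:
    """Return a copy of the baseline with exactly one field changed."""
    if len(changes) != 1:
        raise ValueError("an ablation changes exactly one thing")
    return replace(baseline, **changes)


def changed_fields(a: RunConfig, b: RunConfig) -> list:
    """List the field names whose values differ between two configs."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


baseline = RunConfig()
no_shaping = ablate(baseline, reward_shaping=False)

# Everything except the ablated knob is carried over from the baseline.
print(changed_fields(baseline, no_shaping))  # ['reward_shaping']
```

Freezing the dataclass and refusing multi-field changes means the "larger batch you changed the same week" cannot sneak into the comparison unnoticed.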

This only works on top of evals that genuinely separate models. If your test can't tell a small improvement from random variation, an ablation only measures noise, and you'll "learn" that components matter when they don't. Careful ablations run each setting a few times, with different random seeds, and compare the spread of scores, never single numbers.
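One rough way to compare the spread rather than single numbers is to set the gap between the two means against the standard error of that gap. The sketch below assumes each setting was run a few times and produced one eval score per run. The function names, the threshold `k = 2.0`, and the scores themselves are made up for illustration, and the rule is a crude screen, not a substitute for a proper statistical test.

```python
import math
import statistics


def compare_runs(full, ablated):
    """Return (difference of means, standard error of that difference).

    Each argument is a list of eval scores, one per repeated run.
    Needs at least two scores per setting to estimate a variance.
    """
    diff = statistics.mean(full) - statistics.mean(ablated)
    se = math.sqrt(
        statistics.variance(full) / len(full)
        + statistics.variance(ablated) / len(ablated)
    )
    return diff, se


def looks_real(full, ablated, k=2.0):
    """Crude screen: is the gap larger than k standard errors? (k is an assumption.)"""
    diff, se = compare_runs(full, ablated)
    return abs(diff) > k * se


# Made-up scores: the gap is small next to the run-to-run spread.
noisy_full = [0.71, 0.74, 0.69, 0.73]
noisy_ablated = [0.70, 0.66, 0.75, 0.72]
print(looks_real(noisy_full, noisy_ablated))  # False

# Made-up scores: the gap is large next to the run-to-run spread.
clear_full = [0.80, 0.82, 0.81, 0.79]
clear_ablated = [0.70, 0.72, 0.71, 0.69]
print(looks_real(clear_full, clear_ablated))  # True
```

In the first pair, the best full run beats the worst ablated run by a wide margin, which is exactly the single-number comparison that misleads.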

The cost is real. Every ablation is another full training run, and you can't afford to test everything, so much of the craft is deciding which knobs are worth the compute and which you can reason about for free. The payoff is knowing which of your ideas did the work, instead of shipping a folk theory dressed up as a result.
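A quick count shows why you can't test everything. The sketch below compares removing one piece at a time against trying every on/off combination. The figures of six components and three repeats per setting are arbitrary, chosen only to make the arithmetic concrete.

```python
def one_at_a_time_runs(n_components: int, n_repeats: int) -> int:
    """One baseline with everything on, plus one setting per removed component."""
    return (n_components + 1) * n_repeats


def every_combination_runs(n_components: int, n_repeats: int) -> int:
    """Every on/off combination of the components."""
    return (2 ** n_components) * n_repeats


# Illustrative numbers only: 6 components, 3 repeated runs per setting.
print(one_at_a_time_runs(6, 3))      # 21
print(every_combination_runs(6, 3))  # 192
```

Even the cheaper count is 21 full training runs, which is why reasoning a knob away for free is worth so much.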