Loss Functions

To improve, a model needs to know how wrong it is. A loss function turns that into one number: small when the prediction is good, large when it's bad.

For a language model, the prediction is a probability for every possible next token. The loss checks how much probability the model put on the token that actually came next. Confident and correct gives a low loss. Confident and wrong gives a high one. The standard choice for this is cross-entropy loss: the negative log of the probability the model gave to the correct token.
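To make the confident-and-correct versus confident-and-wrong contrast concrete, here is a minimal sketch of cross-entropy for a single next-token prediction. The three-token vocabulary, the probability values, and the function name cross_entropy are made up for illustration. Real models score tens of thousands of tokens and average the loss over many positions.

```python
import math

def cross_entropy(probs, target):
    """Negative log of the probability assigned to the token that actually came next."""
    return -math.log(probs[target])

# Hypothetical predictions over a toy vocabulary; the true next token is "mat".
confident_correct = {"mat": 0.90, "dog": 0.05, "sky": 0.05}
unsure = {"mat": 0.34, "dog": 0.33, "sky": 0.33}
confident_wrong = {"mat": 0.02, "dog": 0.96, "sky": 0.02}

for name, probs in [
    ("confident and correct", confident_correct),
    ("unsure", unsure),
    ("confident and wrong", confident_wrong),
]:
    print(f"{name}: loss = {cross_entropy(probs, 'mat'):.3f}")
```

With these numbers the loss rises from about 0.1 to about 1.1 to about 3.9. Only the probability on the correct token enters the formula, so putting 96% on the wrong answer is punished far more than hedging evenly.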

That single number is the objective for the whole training process. Everything else, including backpropagation and gradient descent, exists to make this number go down.
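A toy sketch of "making the number go down", under a strong simplifying assumption: the three logits themselves are the only parameters, so the gradient can be written by hand instead of being computed by backpropagation through a network. The starting values, the learning rate, and the helper names softmax and loss_and_grad are arbitrary choices for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss_and_grad(logits, target):
    """Cross-entropy loss and its gradient with respect to the logits."""
    probs = softmax(logits)
    loss = -math.log(probs[target])
    # For softmax followed by cross-entropy, the gradient is probs minus one-hot(target).
    grad = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return loss, grad

logits = [0.0, 2.0, 0.0]  # starts out favoring the wrong token (index 1)
target = 0                # index of the token that actually came next
learning_rate = 0.5       # arbitrary step size for this sketch

losses = []
for step in range(20):
    loss, grad = loss_and_grad(logits, target)
    losses.append(loss)
    # Gradient descent: nudge each parameter against its gradient.
    logits = [x - learning_rate * g for x, g in zip(logits, grad)]

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

Each step moves probability toward the correct token, so the recorded loss shrinks from one step to the next. In a real model the same loop runs over millions or billions of weights, with backpropagation supplying the gradient.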

The loss also defines what "good" means, which gives it real power. Change the loss and you change what the model optimizes for. Capabilities like math have a clean signal: the answer was right or wrong. Behavior is harder, because "did this reply feel helpful" isn't a number you can compute directly. Inventing good signals for fuzzy goals is much of the craft of model behavior.