Loss Function
Last updated June 11, 2026
What is a Loss Function in simple terms?
In simple terms, a loss function is a wrongness score for an AI. It measures how far off each guess is from the right answer, giving training a single number to push down — the lower, the better.
What is a Loss Function?
A loss function is a formula that measures how far an AI model's prediction is from the correct answer. It produces a single number that training works to make as small as possible, and in doing so it defines what counts as a good or bad result.
A loss function is how a learning system measures its own mistakes. Every time a model makes a prediction during training, the loss function compares that prediction to the known correct answer and outputs a single number — the loss — that captures how wrong the model was. A big number means a bad miss; a small number means it was close; zero would mean perfect. This score is the signal the whole training process is built around: the model's goal, in effect, is to make this number as small as it can across all its examples.
What makes the loss function so important is that it defines what 'good' even means for a model — and that's a real design choice, not an afterthought. By choosing how to score mistakes, you tell the model what to care about. A loss function that punishes large errors far more harshly than small ones pushes the model to avoid big misses above all; one that treats every error equally produces different behavior. In tasks where one kind of mistake is more costly than another, the loss function is where that priority gets written in. Get it wrong and the model will dutifully optimize for the wrong thing.
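To make the contrast concrete, here is a minimal sketch comparing two common scoring rules: mean squared error, which punishes large misses disproportionately, and mean absolute error, which weights every unit of error the same. The function names and the sample numbers are illustrative assumptions, not taken from any particular library or dataset.

```python
def mean_squared_error(predictions, targets):
    """Average of squared gaps: one large miss dominates the score."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def mean_absolute_error(predictions, targets):
    """Average of absolute gaps: every unit of error counts the same."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Illustrative numbers only (not from a real dataset).
targets = [10.0, 10.0, 10.0, 10.0]
many_small_misses = [11.0, 11.0, 11.0, 11.0]  # off by 1, four times
one_big_miss = [10.0, 10.0, 10.0, 14.0]       # off by 4, once

# Both sets of predictions have the same total error (4 units).
print(mean_absolute_error(many_small_misses, targets))  # 1.0
print(mean_absolute_error(one_big_miss, targets))       # 1.0
print(mean_squared_error(many_small_misses, targets))   # 1.0
print(mean_squared_error(one_big_miss, targets))        # 4.0
```

Absolute error rates the two sets of predictions as equally bad, while squared error rates the single big miss four times worse. A model trained on squared error is therefore pushed harder to avoid large misses.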
The loss function is the target that learning aims at, and it works hand in hand with the methods that do the aiming. Backpropagation works out how a small change to each of the model's internal weights would change the loss (the gradient), and gradient descent uses that information to nudge the weights so the score drops a little with each step. So the loss function provides the 'how wrong,' its gradient provides the 'in which direction,' and those methods carry out the corrections. Nearly every trained AI model, from a simple predictor to a giant language model, has a loss function at the heart of its training, quietly defining what it was trying to get right.
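The loop of "compute the loss, then adjust the weights" can be sketched with a deliberately tiny model: one weight w, a prediction of w * x, and mean squared error as the loss. Everything here (the function names, the learning rate, the three data points) is an assumption chosen for illustration. The gradient is written out by hand; in a multi-layer network, backpropagation is what computes it automatically.

```python
def mse_loss(w, xs, ys):
    """Mean squared error of the one-weight model: prediction = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w, xs, ys):
    """Slope of the loss with respect to w: mean of 2 * (w*x - y) * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, w=0.0, learning_rate=0.05, steps=100):
    """Plain gradient descent: step against the gradient to reduce the loss."""
    history = []
    for _ in range(steps):
        history.append(mse_loss(w, xs, ys))            # how wrong are we now?
        w -= learning_rate * mse_gradient(w, xs, ys)   # nudge w downhill
    return w, history


# Illustrative data generated by the rule y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w, history = train(xs, ys)
print(round(w, 3))               # close to 2.0
print(history[0] > history[-1])  # True: the loss went down
```

The loss value says how wrong the current weight is, the gradient says which way to move it, and the update line takes the step.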
Real-world example of Loss Function
A power company is training a model to predict each day's electricity demand so it can plan how much to generate. Its loss function measures the gap between the model's predicted demand and the actual demand that turned out to be needed — a guess that's miles off scores a high loss, a near-perfect guess scores near zero, and training works to shrink that score across thousands of past days. But there's a twist that shows why the loss function is a design choice: under-predicting demand risks a blackout, while over-predicting just wastes some fuel. So the engineers deliberately pick a loss function that penalizes shortfalls more harshly than overshoots. The model, trying to minimize that score, learns to err on the side of predicting a little high — exactly the cautious behavior the company wants, baked in through how mistakes are scored.
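One way such a lopsided penalty could be written is as a weighted absolute error, sketched below. The 5-to-1 weighting, the function name, and the demand figures are hypothetical, chosen only to illustrate the idea; the example does not say which loss the engineers actually use. This form is closely related to the pinball (quantile) loss.

```python
def asymmetric_loss(predicted, actual, shortfall_weight=5.0, overshoot_weight=1.0):
    """Weighted absolute error that charges more for predicting too low.

    The weights are hypothetical: a shortfall costs 5x an equal overshoot.
    """
    total = 0.0
    for p, a in zip(predicted, actual):
        gap = a - p  # positive when the model under-predicted demand
        if gap > 0:
            total += shortfall_weight * gap
        else:
            total += overshoot_weight * (-gap)
    return total / len(actual)


# Made-up demand figures, in arbitrary units.
actual_demand = [100.0, 120.0, 90.0]
under_by_10 = [90.0, 110.0, 80.0]
over_by_10 = [110.0, 130.0, 100.0]

print(asymmetric_loss(under_by_10, actual_demand))  # 50.0
print(asymmetric_loss(over_by_10, actual_demand))   # 10.0
```

Both forecasts miss by the same 10 units each day, but the shortfall scores five times worse, so a model minimizing this loss is pushed toward predicting slightly high.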
Frequently asked questions about Loss Function
What is the difference between a loss function and gradient descent?
The loss function measures how wrong the model is: it produces the score. Gradient descent is the method that uses that score to improve the model, adjusting its internal weights to make the loss smaller. One defines the target; the other does the work of moving toward it. Put simply, the loss function says how far off you are, its gradient (slope) says which direction would reduce the error, and gradient descent takes the actual steps. They work together in every round of training: compute the loss, then use it to update the model.
How does a loss function work?
During training, the model makes a prediction and the loss function compares it to the known correct answer, calculating a single number that represents how far off it was — larger for bigger mistakes, near zero for good ones. This number is then used to adjust the model's internal weights so that next time the error is a little smaller. Repeated across enormous numbers of examples, this steadily lowers the loss and improves the model. Different loss functions score mistakes differently, which shapes what the model learns to prioritize.
What is a loss function used for?
It's used to train AI models by giving them a precise, measurable definition of success to optimize. Every trained model has one at the core of its training, from simple predictors to large language models. Beyond just enabling learning, the choice of loss function lets engineers shape a model's behavior — emphasizing certain kinds of accuracy, penalizing costly mistakes more than minor ones, or balancing competing goals. It's both the engine of learning and the place where what the model is really trying to achieve gets defined.