Precision and Recall

Intermediate · Machine Learning

Last updated June 14, 2026

What is Precision and Recall in simple terms?

In simple terms, precision is "when it says yes, how often is it right?" and recall is "of all the real yeses out there, how many did it catch?" Precision drops when there are false alarms; recall drops when there are misses.

What is Precision and Recall?

Precision and recall are two metrics for judging a classification model. Precision asks how many of the items the model flagged were genuinely positive, while recall asks how many of the genuinely positive items it managed to find. Improving one often costs the other.

Precision and recall are two questions you ask of a model that flags things — spam, fraud, a disease, a relevant search result. They sound similar but pull in opposite directions, and that tension is the whole point. **Precision** looks only at the items the model flagged and asks: how many of those were correct? High precision means few false alarms — when it raises its hand, you can trust it. **Recall** looks at all the items that *should* have been flagged and asks: how many did the model actually catch? High recall means few misses — it rarely lets a real one slip past. The trap is treating either alone as "how good is it." A model can score brilliantly on one while quietly failing the other.

The reason you need both is that it's easy to win one by sacrificing the other. Want perfect recall on a disease test? Flag *everyone* as sick — you'll catch every real case (no misses), but precision collapses, because almost all your flags are false alarms. Want perfect precision? Flag only the one case you're absolutely certain of — that flag is correct, but you've missed everyone else, so recall craters. Real models live on the slope between these extremes, and you tune where they sit based on which mistake hurts more. The classic image is a fishing net: a wide net catches every fish you want (high recall) but hauls in boots and seaweed too (low precision); a tiny, ultra-selective net brings up only fish (high precision) but lets most of the shoal escape (low recall).
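The two extremes described above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the population of 100 people with 5 real cases is made up, and `precision_recall` is a hypothetical helper written for this example, not a library function.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 means positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # correct flags
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # misses
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Assumed data: 100 people screened, the first 5 are genuinely sick.
y_true = [1] * 5 + [0] * 95

# Extreme 1: flag everyone as sick.
flag_everyone = [1] * 100
everyone_p, everyone_r = precision_recall(y_true, flag_everyone)

# Extreme 2: flag only the single case the model is certain about.
flag_one_sure_case = [1] + [0] * 99
one_p, one_r = precision_recall(y_true, flag_one_sure_case)

print(f"flag everyone: precision={everyone_p:.2f}, recall={everyone_r:.2f}")
print(f"flag one case: precision={one_p:.2f}, recall={one_r:.2f}")
```

With these assumed numbers, flagging everyone gives recall 1.00 but precision 0.05, while flagging one certain case gives precision 1.00 but recall 0.20.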

So which matters more depends entirely on the cost of each error. For a spam filter you lean toward precision — wrongly binning a real email is worse than letting some junk through. For cancer screening or fraud detection you lean toward recall — missing a real case is far more dangerous than a false alarm that triggers a harmless follow-up. Because you usually can't max out both, people often summarize the trade-off in a single combined figure, the F1 score, which rewards a model only when precision *and* recall are both reasonably high. But the F1 number hides the balance; precision and recall reported separately are what actually tell you how a model fails.
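The F1 score is the harmonic mean of precision and recall. The sketch below uses made-up precision and recall values, and a hypothetical helper named `f1`, to show both points from the paragraph above: F1 stays low when either number is low, and two models that fail in opposite ways can share the same F1.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumed values: two models that make opposite kinds of mistakes.
cautious = f1(0.8, 0.6)   # fewer false alarms, more misses
eager = f1(0.6, 0.8)      # fewer misses, more false alarms

# Assumed values: a lopsided model that is almost never wrong but misses nearly everything.
lopsided = f1(1.0, 0.05)
simple_average = (1.0 + 0.05) / 2

print(f"cautious F1 = {cautious:.3f}, eager F1 = {eager:.3f}")
print(f"lopsided F1 = {lopsided:.3f} vs plain average = {simple_average:.3f}")
```

The first two models end up with an identical F1 even though one errs toward misses and the other toward false alarms, which is why the separate figures are still worth reporting.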

Real-world example of Precision and Recall

Imagine a "lost property" search at a huge train station: you ask staff to bring you every black umbrella handed in today, and 40 black umbrellas truly exist in the lost-property room. The staff bring you a pile of 30 umbrellas. You check them: 24 are genuinely black, the other 6 are dark blue or gray — mistakes. Precision is 24 out of the 30 they brought = 80%; that's how trustworthy their pile was. Recall is 24 out of the 40 black umbrellas that actually existed = 60%; that's how much of the real total they found. Notice they could have boosted recall by grabbing every umbrella in the room — they'd have all 40 black ones, but the pile would be mostly wrong colors, tanking precision. The two numbers together tell you something neither tells alone: the pile was fairly clean, but they still missed four in ten.
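The umbrella arithmetic can be written out step by step. The counts come from the example above; the variable names are just labels chosen for this sketch.

```python
# Counts taken from the lost-property example.
black_in_room = 40   # black umbrellas that truly exist
brought = 30         # umbrellas the staff brought
black_in_pile = 24   # umbrellas in the pile that really are black

true_positives = black_in_pile
false_positives = brought - black_in_pile          # wrong colors in the pile
false_negatives = black_in_room - black_in_pile    # black umbrellas left behind

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)

print(f"false alarms: {false_positives}, misses: {false_negatives}")
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
```

The 6 wrong umbrellas are what pull precision down to 80%, and the 16 black umbrellas left in the room are what pull recall down to 60%.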

Frequently asked questions about Precision and Recall

What is the difference between precision and recall?

Precision is measured over the items the model *flagged*: of everything it said "yes" to, how many were actually right? It penalizes false alarms. Recall is measured over the items that were *truly* positive: of all the real cases out there, how many did the model catch? It penalizes misses. Precision is about trusting the flags you got; recall is about not missing the ones you should have gotten. They usually trade off (pushing one up tends to pull the other down), which is why both are reported together.

How do precision and recall work?

Both come straight from the four counts in a confusion matrix. Precision is the correctly flagged positives divided by *everything* the model flagged (correct flags plus false alarms). Recall is the correctly flagged positives divided by *every* case that was truly positive (correct flags plus misses). Each is a simple fraction between 0 and 1, often shown as a percentage. Because most models let you adjust how readily they flag something, you can slide their behavior to favor precision or recall, trading one against the other to suit the task.
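One common way a model's readiness to flag is adjusted is a score threshold: the model outputs a score per item, and anything at or above the threshold is flagged. The sketch below assumes that setup. The labels, scores, and thresholds are invented for illustration, and `confusion_counts` and `precision_recall_at` are hypothetical helpers.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn): the four cells of a binary confusion matrix."""
    tp = fp = fn = tn = 0
    for truth, flagged in zip(y_true, y_pred):
        if flagged and truth:
            tp += 1      # correct flag
        elif flagged:
            fp += 1      # false alarm
        elif truth:
            fn += 1      # miss
        else:
            tn += 1      # correctly left alone
    return tp, fp, fn, tn


def precision_recall_at(threshold, y_true, scores):
    """Flag every item whose score is at or above the threshold, then score the flags."""
    y_pred = [s >= threshold for s in scores]
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Assumed data: 10 items, 5 truly positive, with made-up model scores.
y_true = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.65, 0.60, 0.40, 0.35, 0.30, 0.20, 0.10]

results = {t: precision_recall_at(t, y_true, scores) for t in (0.3, 0.5, 0.7)}
for threshold, (p, r) in results.items():
    print(f"threshold {threshold}: precision={p:.3f}, recall={r:.3f}")
```

On this toy data, raising the threshold from 0.3 to 0.7 moves precision from 0.625 up to 1.0 while recall falls from 1.0 to 0.6, which is the trade-off in miniature.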

What are precision and recall used for?

They're used to judge any model that flags or retrieves things, especially when the categories are imbalanced and a single accuracy figure would mislead. Search and recommendation systems care about precision (the results shown should be relevant) and recall (the good results shouldn't be missed). Medical screening and fraud detection lean on recall, because a missed case is costly. Spam and content filters lean on precision, to avoid wrongly blocking legitimate items. Reporting both lets teams tune a model toward whichever error is more acceptable for their situation.
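To see why accuracy alone can mislead on imbalanced data, consider a hypothetical fraud model that never flags anything. The figures below (1,000 transactions, 10 of them fraudulent) are invented for illustration, and returning `None` for precision when nothing was flagged is a convention chosen for this sketch.

```python
# Assumed data: 1,000 transactions, of which 10 are fraud (label 1).
y_true = [1] * 10 + [0] * 990

# A "model" that never flags anything.
y_pred = [0] * 1000

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

recall = tp / (tp + fn)
# With no flags at all, precision has a zero denominator; report it as undefined.
precision = tp / (tp + fp) if (tp + fp) else None

print(f"accuracy = {accuracy:.1%}")
print(f"recall = {recall:.1%}, precision = {precision}")
```

Accuracy comes out at 99% even though the model catches none of the fraud, and recall of 0% exposes that immediately.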