Regression

Intermediate · Machine Learning

Last updated June 14, 2026

What is Regression in simple terms?

In simple terms, regression is AI predicting a number, like a price or temperature. It studies past examples to learn the trend, then estimates the figure for a new case — like reading off a line through scattered dots.

What is Regression?

Regression is a supervised machine learning task that predicts a continuous numerical value — a price, a temperature, an amount — by learning the relationship between input variables and that number from past examples where the answer is known.

Supervised machine learning tasks split into two big families based on the kind of answer you want. If you want to sort things into categories (spam or not, dog or cat, approve or decline), that's classification. If you want to predict a *number* on a sliding scale (how much, how many, how long, how hot), that's regression. The word sounds technical, but the job is one you do informally all the time: given some facts about a thing, estimate a quantity. How much should this used car sell for? How many units will we ship next month? How long until this delivery arrives? Each asks for a number, and regression is the family of methods for predicting it from data.

The way it learns is the same supervised-learning idea behind much of machine learning: you show it many past examples where you already know the answer. To predict house prices, you feed it thousands of past sales, each with details (size, location, age) *and* the price it actually fetched. The model studies how those details relate to price and captures the pattern — in the simplest case, something like a trend line you could draw through a scatter of dots, then extend to estimate where a new point falls. Real regression is usually richer than a single straight line, juggling many input factors at once, but the intuition holds: it learns the shape of the relationship between the inputs and the number, so it can estimate that number for a new case it hasn't seen.
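The trend-line intuition can be made concrete with a minimal Python sketch. It fits a single straight line by ordinary least squares, which is one common way to fit such a line, to five invented house sales. The figures, the units (square metres, thousands of dollars), and the `fit_line` and `predict` helpers are illustrative assumptions, not data or code from a real system.

```python
# Illustrative only: five made-up house sales (size in m^2, price in $1000s).
sizes = [50, 70, 90, 110, 130]
prices = [155, 190, 240, 265, 310]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(size, slope, intercept):
    """Estimate a price for a size the model has not seen."""
    return intercept + slope * size


slope, intercept = fit_line(sizes, prices)
estimate = predict(100, slope, intercept)
print(f"price = {intercept:.2f} + {slope:.3f} * size")
print(f"estimate for a 100 m^2 home: {estimate:.2f}")
```

With more than one input, the same idea extends from a line to a plane or a more flexible curve, but the goal is unchanged: choose the shape that sits closest to the known examples.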

A few things keep this honest. First, regression predicts a *value*, not a certainty: its output is a best estimate that comes with error, and a good practitioner cares as much about how far off it tends to be as about the prediction itself. Second, it captures only the patterns in the data it was trained on, so it can mislead if you push it far outside that range (a model that learned house prices from modest homes will guess badly about a mansion) or if the relationship it "found" is really coincidence rather than cause. Third, it is easily swayed by extremes: a handful of freak data points can tug the trend line toward them and distort otherwise ordinary predictions. Fourth, it leans on the future resembling the past, so a sharp change in conditions can throw it off without any obvious warning. Used within those limits, though, regression is one of the most widely used tools in machine learning, and it sits behind a large share of the numeric forecasts that businesses and scientists rely on.
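Three of these cautions can be checked directly in code. The sketch below is a hedged illustration on invented house-sale figures: it measures typical error with mean absolute error (one of several possible error measures), refits after adding a single freak sale to show how far the estimate moves, and flags a request that falls outside the range of sizes the model saw. The helper names and all numbers are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of a straight line: (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(xs, ys, slope, intercept):
    """Average size of the miss, in the same units as the target."""
    misses = [abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)]
    return sum(misses) / len(misses)


# Made-up data: size in m^2, price in $1000s.
sizes = [50, 70, 90, 110, 130]
prices = [155, 190, 240, 265, 310]

# 1) How far off does the model tend to be on the examples it learned from?
slope, intercept = fit_line(sizes, prices)
mae = mean_absolute_error(sizes, prices, slope, intercept)

# 2) One freak record (a 100 m^2 home logged at 900) drags the line upward.
slope_skewed, intercept_skewed = fit_line(sizes + [100], prices + [900])
clean_estimate = intercept + slope * 90
skewed_estimate = intercept_skewed + slope_skewed * 90

# 3) The line returns a number for any input, so range checks are up to us.
mansion_size = 1000
is_extrapolating = not (min(sizes) <= mansion_size <= max(sizes))

print(f"typical miss: {mae:.1f}")
print(f"90 m^2 estimate, clean vs skewed: {clean_estimate:.1f} vs {skewed_estimate:.1f}")
print(f"1000 m^2 is outside the training range: {is_extrapolating}")
```

The error here is measured on the same examples used for fitting, which flatters the model; in practice the check is usually run on held-out examples the model never saw.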

Real-world example of Regression

A coffee roaster wants to know how many bags to roast for the weekend: roast too many and the surplus goes stale, roast too few and they turn customers away. They have two years of records, with each weekend's sales alongside the things that seemed to matter: the weather, whether there was a local event, the time of year, and whether a promotion ran. A regression model learns how those factors have historically pushed sales up or down. Then, given this coming weekend's conditions (warm, a street festival on, no promo), it predicts a number: about 240 bags. It won't be exactly right (maybe 225 sell), but it's a far better starting point than a gut guess, and it tends to improve as more weekends are added. Predicting the right quantity from the patterns in past data is exactly the job regression does.
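A rough sketch of how such a model could be fitted is shown below. It is not the roaster's actual system: the eight weekends are synthetic, invented so that the forecast lands near the figure in the example, and only three yes/no factors are used (time of year is left out for brevity). The `solve` and `fit` helpers are hypothetical names, and the method shown is plain least squares with several inputs.

```python
# Synthetic weekends, invented for illustration: (warm, event, promo) -> bags sold.
weekends = [
    ((0, 0, 0), 153),
    ((1, 0, 0), 188),
    ((0, 1, 0), 201),
    ((0, 0, 1), 176),
    ((1, 1, 0), 242),
    ((1, 0, 1), 220),
    ((0, 1, 1), 227),
    ((1, 1, 1), 273),
]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def fit(rows):
    """Least squares fit; returns [baseline, warm, event, promo] weights."""
    X = [[1.0, *features] for features, _ in rows]  # leading 1.0 is the baseline
    y = [float(target) for _, target in rows]
    k = len(X[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


weights = fit(weekends)
coming_weekend = (1, 1, 0)  # warm, street festival on, no promo
forecast = weights[0] + sum(w * f for w, f in zip(weights[1:], coming_weekend))

for name, w in zip(["baseline", "warm", "event", "promo"], weights):
    print(f"{name:>8}: {w:+.1f} bags")
print(f"forecast: about {forecast:.0f} bags")
```

Each fitted weight reads as "bags added when this factor is present, other factors held fixed", which is what lets the roaster see how much each factor has pushed sales up or down.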

Frequently asked questions about Regression

What is the difference between regression and classification?

Both are supervised learning, but they predict different kinds of answers. Regression predicts a continuous number on a sliding scale — a price, a temperature, a sales figure — where answers like 240 or 241 are both valid and meaningfully close. Classification predicts a category from a fixed set — spam or not, which disease, approve or decline — where the answer is one label, not a number. A quick test: if the question is "how much / how many / how long," it's regression; if it's "which one / which type," it's classification.

How does regression work?

You give the model many past examples, each with input details and the known numerical answer. The model learns how the inputs relate to that number — in the simplest case, fitting something like a trend line through the data, and in richer cases capturing how many factors combine to push the number up or down. Once trained, you feed it the details of a new case and it estimates the number by applying the relationship it learned. The training process tunes the model to make those estimates as close as possible to the real answers it was shown.
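The "tuning" step can be sketched as a small training loop. The version below uses gradient descent on mean squared error, which is one common approach rather than the only one (a straight line can also be fitted in a single closed-form step). The toy data, the learning rate, and the number of steps are assumptions chosen so the example converges; they are not recommended settings.

```python
# Invented toy data: one input x and the known answer y for each past example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]


def mean_squared_error(slope, intercept):
    """Average squared gap between the estimates and the known answers."""
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


slope, intercept = 0.0, 0.0   # start from a deliberately poor guess
learning_rate = 0.02          # step size; an assumption tuned for this toy data
initial_loss = mean_squared_error(slope, intercept)

for step in range(5000):
    # How wrong is the current line on each known example?
    errors = [intercept + slope * x - y for x, y in zip(xs, ys)]
    # Gradients of the mean squared error with respect to each parameter.
    grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_intercept = 2 * sum(errors) / len(xs)
    # Nudge both parameters in the direction that reduces the error.
    slope -= learning_rate * grad_slope
    intercept -= learning_rate * grad_intercept

final_loss = mean_squared_error(slope, intercept)

# Once trained, estimate the number for a new case by applying the learned line.
new_case = 6.0
estimate = intercept + slope * new_case

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
print(f"learned line: y = {intercept:.2f} + {slope:.2f} * x")
print(f"estimate for x = {new_case}: {estimate:.2f}")
```

The loop never sees the new case during training; it only adjusts the two parameters until the estimates for the known examples are as close as the straight-line shape allows.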

What is regression used for?

It's used anywhere the goal is to predict a quantity: forecasting sales and demand, estimating prices and property values, projecting costs, predicting how long something will take, and modeling scientific measurements like temperature or growth. It's one of the most widely used tools in machine learning and statistics precisely because so many real questions are really "what number should I expect?" — and regression is the direct way to answer them from historical data.