Hyperparameter

Intermediate · Deep Learning

Last updated June 14, 2026

What is a hyperparameter in simple terms?

In simple terms, a hyperparameter is a setting you choose before training an AI, like the dials you set on an oven before baking. You pick the temperature and timer up front; the cake itself forms during the bake.

What is a hyperparameter?

A hyperparameter is a setting chosen *before* a machine learning model starts training that controls how the training process runs — such as how fast the model adjusts or how long it trains — as opposed to the internal values the model learns by itself from the data.

When you train a machine learning model, two very different kinds of numbers are in play, and hyperparameters are one of them. The model has internal values it figures out for itself by studying the data. Those are usually called its parameters or weights, and you don't set them by hand. Hyperparameters are the settings *you* choose beforehand that govern *how* that learning happens: how big a step the model takes each time it adjusts (the learning rate), how many examples it looks at before each adjustment (the batch size), how many full passes it makes over the data (the number of epochs), and many more. The tell-tale difference is timing and control: hyperparameters are set up front by the person training the model and are not learned from the data (they stay fixed during a training run, or follow a schedule chosen in advance), while parameters are discovered by the model as it learns.
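To make the split concrete, here is a minimal sketch in plain Python. The toy data, the specific values, and the `train` function are all invented for illustration; the point is only which numbers are fixed by hand (learning rate, batch size, epochs) and which ones the loop discovers (`w` and `b`).

```python
import random

# Hyperparameters: chosen by hand before training (illustrative values only).
LEARNING_RATE = 0.05   # size of each adjustment step
BATCH_SIZE = 4         # examples looked at before each adjustment
NUM_EPOCHS = 200       # full passes over the data

def train(xs, ys, learning_rate, batch_size, num_epochs, seed=0):
    """Fit y = w * x + b (approximately) with mini-batch gradient descent."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # parameters: learned from the data, never set by hand
    indices = list(range(len(xs)))
    for _ in range(num_epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error over this batch.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 3x + 1 (made up for this example).
xs = [i / 10 for i in range(20)]
ys = [3 * x + 1 for x in xs]

w, b = train(xs, ys, LEARNING_RATE, BATCH_SIZE, NUM_EPOCHS)
print(f"learned parameters: w={w:.2f}, b={b:.2f}")
```

With these settings the learned values should land close to the 3 and 1 used to generate the data. Nothing in the code ever assigns `w` or `b` a target value directly; only the three hyperparameters are typed in by a person.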

The oven analogy holds up well. Before you bake, you choose the temperature and the timer: those are your hyperparameters, decided in advance. What happens to the cake inside the oven as it rises and browns is the equivalent of the parameters the model learns; you don't shape each crumb by hand. And crucially, getting the settings wrong ruins the result even if everything else is fine: too hot and you burn it, too cool and it stays raw. Hyperparameters work the same way. Set the learning rate too high and the model never settles on a good answer; set it too low and training crawls or gets stuck. The right values aren't obvious in advance and depend on the specific task and data.
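The "too hot, too cool" effect is easy to see on a toy problem. The sketch below is an assumption-laden illustration, not a real training run: it uses gradient descent to minimize f(x) = x², and the function name and the three learning rates are picked purely to show the three regimes.

```python
def minimize_quadratic(learning_rate, steps=20, start=5.0):
    """Run gradient descent on f(x) = x**2, whose minimum is at x = 0."""
    x = start
    for _ in range(steps):
        gradient = 2 * x              # derivative of x**2
        x -= learning_rate * gradient
    return x

# Too high, too low, and a reasonable value (all chosen for illustration).
for lr in (1.1, 0.001, 0.3):
    print(f"learning rate {lr}: x after 20 steps = {minimize_quadratic(lr):.6f}")
```

At 1.1 every step overshoots the minimum by more than it corrects, so x grows instead of shrinking. At 0.001 the steps are so small that x has barely moved from its starting point of 5 after 20 steps. At 0.3 it ends up very close to 0.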

Because of that, a large part of the day-to-day craft of machine learning is *hyperparameter tuning* — systematically trying different settings to find a combination that trains a good model. This can be done by hand from experience, by methodically sweeping through ranges of values, or with automated search that tests many combinations and keeps the best. It's often time-consuming and computationally expensive, since each trial may mean training a model more or less from scratch. The honest reality is that there's rarely one perfect set of hyperparameters; tuning is about finding settings that work well enough for the job, and good defaults plus a bit of careful adjustment usually get you most of the way there.
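A methodical sweep of this kind might look like the following sketch. Everything in it is a stand-in: the one-weight model, the tiny dataset, and the candidate values in `grid` are invented, and real projects usually rely on a library for this rather than a hand-written loop.

```python
import itertools

def train_slope(xs, ys, learning_rate, num_epochs):
    """Fit y = w * x (approximately) by full-batch gradient descent; returns w."""
    w = 0.0
    for _ in range(num_epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w

def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data from y = 2x (invented), split into training and validation sets.
train_x, train_y = [0.5, 1.0, 1.5, 2.0], [1.0, 2.0, 3.0, 4.0]
val_x, val_y = [0.75, 1.25, 1.75], [1.5, 2.5, 3.5]

# Candidate values to sweep (arbitrary illustrative choices).
grid = {"learning_rate": [0.001, 0.01, 0.1], "num_epochs": [10, 100]}

results = []
for lr, epochs in itertools.product(grid["learning_rate"], grid["num_epochs"]):
    w = train_slope(train_x, train_y, lr, epochs)     # one full training run per trial
    results.append((mse(w, val_x, val_y), lr, epochs))  # score on held-out data

best_loss, best_lr, best_epochs = min(results)
print(f"best: learning_rate={best_lr}, num_epochs={best_epochs}, val loss={best_loss:.6f}")
```

Even this tiny grid costs six separate training runs, which is why sweeps get expensive quickly once each run takes hours instead of microseconds. Scoring each trial on validation data rather than the training data is the usual way to avoid picking settings that only look good on examples the model has already seen.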

Real-world example of a hyperparameter

Picture someone learning to brew espresso on a new machine. The coffee itself — how the grounds dissolve and the flavors pull through — is the part they can't directly control; it just happens in the cup. What they *can* set beforehand are the dials: how finely to grind, how tightly to pack, how long to run the shot, what temperature to use. Those settings are their hyperparameters, and getting them wrong produces something bitter or sour no matter how good the beans are. So they tweak one dial, taste, adjust, and try again — a finer grind, a few seconds longer — until a run finally tastes right. That patient cycle of adjust-and-retry is exactly hyperparameter tuning: not controlling the brew directly, but dialing in the settings that decide how well it turns out.

Frequently asked questions about hyperparameters

What is the difference between a hyperparameter and a parameter?

It comes down to who sets the value and when. A hyperparameter (the learning rate, the batch size, the number of epochs) is chosen by the person training the model before training begins; it controls how training runs and stays fixed during the run. A parameter (often called a weight) is a value the model figures out for itself *during* training as it learns from the data; nobody sets those by hand. In short, hyperparameters are the dials you set up front to steer the learning; parameters are the result of that learning. Getting the hyperparameters right is what lets the model find good parameters.

How does a hyperparameter work?

A hyperparameter works by shaping the training process rather than the final answer directly. Before training starts, you fix values like the learning rate and the number of epochs; these then govern how the model moves through the data and adjusts its internal parameters. Because the same model and data can train into a great result or a poor one depending purely on these settings, finding good values matters — a process called hyperparameter tuning. That's typically done by trying different combinations, evaluating how well each trained model performs, and keeping the best, sometimes by hand and sometimes with automated search.
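One simple form of automated search is random search: sample settings at random, score each, and keep the best. The sketch below assumes a hypothetical `final_loss` function that stands in for "train a model, then evaluate it" (here it is just gradient descent on f(x) = x²), and it samples the learning rate on a log scale, a common but not universal choice.

```python
import random

def final_loss(learning_rate, num_epochs=50, start=5.0):
    """Hypothetical stand-in for 'train, then evaluate': gradient descent on x**2."""
    x = start
    for _ in range(num_epochs):
        x -= learning_rate * 2 * x
    return x ** 2

def random_search(num_trials, seed=0):
    """Try random learning rates and return (best_loss, best_learning_rate)."""
    rng = random.Random(seed)
    best = None
    for _ in range(num_trials):
        # Sample log-uniformly between 1e-4 and 1 so small values get a fair share.
        lr = 10 ** rng.uniform(-4, 0)
        loss = final_loss(lr)
        if best is None or loss < best[0]:
            best = (loss, lr)
    return best

best_loss, best_lr = random_search(num_trials=20)
print(f"best learning rate found: {best_lr:.4f} (final loss {best_loss:.3g})")
```

The search never touches the model's internal values. It only varies the setting, reruns the training, and compares the results, which is the same loop a person follows when tuning by hand.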

What is a hyperparameter used for?

Hyperparameters are used to control and improve how a model is trained. By adjusting them, practitioners steer the trade-offs that decide whether training succeeds: how fast the model learns, how long it trains, how strongly it's discouraged from memorizing the data, and more. Tuning them well is one of the core practical jobs in building a machine learning model, because the same architecture and data can produce a strong or a weak model depending largely on the settings. That is why most real projects spend meaningful effort searching for hyperparameters that work for the task at hand.