Foundation Model

Intermediate · Machine Learning

A foundation model is a large, general-purpose AI model trained on a very large and varied body of data, built to serve as a reusable base that can be adapted to many different tasks rather than being made for just one.

What is a foundation model?

For a long time, building an AI system meant building it for one job. If you wanted to detect spam, translate French, and answer legal questions, you trained three separate models on three separate datasets, each starting from scratch. Foundation models flipped that approach. Instead of one model per task, you train a single very large model on an enormous, broad range of data — so it absorbs a wide, general competence — and then adapt that same model to all sorts of specific jobs afterward. The big, expensive, general training happens once; the specialization happens many times, cheaply, on top of it. The "foundation" in the name is the point: it is the base layer that countless different applications can be built upon.
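The "train once, specialize many times" pattern can be sketched in a few lines. The example below is a toy under stated assumptions, not how any real foundation model works: `base_embed` is a hypothetical stand-in for the expensive shared base (here just a bag of words), and `adapt` builds a small task-specific layer on top of it without ever changing the base. All names and data are invented for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def base_embed(text: str) -> Counter:
    """Stand-in for a frozen foundation model (assumption: a toy bag of words).

    A real base would be a large trained network; the point here is only
    that every task below reuses this one function unchanged.
    """
    return Counter(text.lower().split())


def similarity(a: Counter, b: Counter) -> int:
    """Count the word occurrences two feature vectors have in common."""
    return sum((a & b).values())


def adapt(examples: List[Tuple[str, str]]) -> Callable[[str], str]:
    """Build a small task-specific head on top of the shared base.

    Only the per-label profiles are learned; base_embed is never modified.
    """
    profiles: Dict[str, Counter] = {}
    for text, label in examples:
        profiles.setdefault(label, Counter()).update(base_embed(text))

    def predict(text: str) -> str:
        features = base_embed(text)
        # Pick the label whose profile overlaps most with the input.
        return max(profiles, key=lambda label: similarity(features, profiles[label]))

    return predict


# Two unrelated tasks, one shared base, a handful of examples each.
spam_filter = adapt([
    ("win a free prize now", "spam"),
    ("claim your free money", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch at noon tomorrow", "ham"),
])
topic_tagger = adapt([
    ("the court ruled on the contract", "legal"),
    ("the patient needs a new prescription", "medical"),
])

print(spam_filter("free prize inside"))
print(topic_tagger("is this contract valid"))
```

Each call to `adapt` needs only a few labeled examples because the shared base already turns text into features; in a real system that base is where nearly all of the training cost sits.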

Most of the AI systems making headlines are foundation models or products built on them. The large language models behind today's chatbots are foundation models for text. There are foundation models trained on images, on audio, on programming code, and on mixtures of these at once. What unites them is breadth and reusability: they are not narrow tools but broad starting points. Once a lab has trained one, others can adapt it to a particular purpose by fine-tuning it on focused data, steering it with carefully written prompts, or connecting it to outside information — without ever paying the enormous cost of training a giant model themselves. That reuse is what makes the current pace of AI products possible; thousands of tools are being built on a relatively small number of underlying foundation models.
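Two of the adaptation routes mentioned above, steering with a written prompt and connecting the model to outside information, can be combined without touching the model at all. The sketch below only assembles the text that would be sent to a foundation model; it does not call one. The documents, the keyword-overlap `retrieve` function, and the prompt layout are all assumptions chosen for illustration, and real systems typically use far better retrieval.

```python
from typing import List

# Hypothetical policy snippets standing in for an organization's documents.
DOCUMENTS = [
    "Visitors must sign in at the front desk before entering a ward.",
    "Medication errors must be reported to the charge nurse within one hour.",
    "Staff parking permits are renewed every January.",
]


def retrieve(question: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the question."""
    q_words = set(question.lower().replace("?", "").split())

    def overlap(doc: str) -> int:
        return len(q_words & set(doc.lower().replace(".", "").split()))

    return sorted(documents, key=overlap, reverse=True)[:top_k]


def build_prompt(instruction: str, question: str, documents: List[str]) -> str:
    """Combine an instruction, retrieved context, and the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return f"{instruction}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"


prompt = build_prompt(
    "Answer using only the context. Say so if the answer is not there.",
    "When must medication errors be reported?",
    DOCUMENTS,
)
print(prompt)
```

The resulting string would be passed to whichever foundation model the product is built on. The specialization lives entirely in the instruction and the retrieved context, which is why this kind of adaptation is cheap compared with training.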

This concentration is also why foundation models attract scrutiny that goes beyond the technical. Because so much gets built on top of them, any weakness, bias, or blind spot baked into a foundation model can quietly propagate into every product downstream of it — a problem in the base layer is inherited by everything standing on it. That is the double edge of the foundation-model era: a handful of powerful, widely reused base models now sit underneath a large and growing share of the AI people interact with every day, which makes the whole ecosystem remarkably efficient while also concentrating risk in a very few places.

Real-world example

Imagine an AI lab releases a new foundation model, and within the same month three unrelated startups build on it. One fine-tunes it into a tool that drafts replies to customer emails in a company's house voice. Another wraps it in a friendly interface and sells it as a study tutor for high-schoolers. A third connects it to a hospital's document system to help nurses quickly find policy answers. Three completely different products, three different industries — and not one of those startups trained a large model from scratch. They all stood on the same foundation and adapted it to their needs, which is exactly what a foundation model is for.

Frequently asked questions

What is the difference between a foundation model and a large language model?

A large language model is one type of foundation model — the kind specialized for text. "Foundation model" is the broader umbrella: it covers any large, general-purpose model built as a reusable base, including ones trained on images, audio, code, or several kinds of data at once. Nearly every modern large language model you'd actually interact with is a foundation model, but not every foundation model works with language. The terms get used interchangeably mainly because text models are the most visible examples right now.

Why are they called "foundation" models?

Because they're designed to be the base that other things are built on, rather than finished products in themselves. You train the big general model once, then many specific applications are constructed on top of it through fine-tuning, prompting, or other adaptation. The name captures both the usefulness — a strong base supports a lot — and the responsibility, since whatever is built inherits the strengths and weaknesses of the foundation underneath it.

Who makes foundation models, and can anyone build one?

In practice, training a large foundation model from scratch is extremely expensive, needing vast amounts of data, specialized computing hardware, and expertise — so the biggest ones come from a relatively small set of well-funded AI labs and large technology companies. Most other organizations don't build their own; they adapt an existing foundation model to their needs, which is far cheaper and faster. There is also a growing supply of openly available foundation models that smaller teams can download and build on, which spreads access more widely.