Large Language Model (LLM)
A large language model is a type of AI system trained on massive amounts of text that can interpret, summarize, translate, and generate human language with remarkable fluency.
What is a Large Language Model (LLM)?
The name gives you the shape of the thing. Large refers to scale — both the volume of text these models are trained on and the number of internal settings, called parameters, the model uses to make sense of it all. Language refers to what it works with — words, sentences, paragraphs, conversations, code. Model refers to the mathematical system that has learned the patterns and structure of human language by processing billions of words drawn from books, websites, code repositories, and more. At the heart of every modern large language model is an architecture called the transformer, which allows the model to weigh relationships between words across an entire passage of text rather than processing them one at a time.
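To make the idea of "weighing relationships between words" a little more concrete, here is a minimal sketch of the attention calculation that transformers are built around. It is a simplification under stated assumptions: the two-dimensional word vectors are made up for illustration, and the same vector is used as both query and key, whereas a real model learns separate, much larger projections for each.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Hypothetical vectors for three words; real models learn these during training.
words = ["bank", "river", "money"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

# How strongly does "bank" attend to each word in the passage (itself included)?
weights = attention_weights(vectors[0], vectors)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

Because the made-up vector for "river" points in nearly the same direction as the one for "bank", it receives more weight than "money" does. In a real transformer this comparison happens for every word against every other word in the passage, across many layers.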
Training a large language model works by having the system process enormous quantities of text and repeatedly practice predicting the next piece of text — known as a token — in a sequence. Over billions of these predictions, and billions of corrections when it gets them wrong, the model builds up a deep internal representation of how language works. By the time training is complete, it has developed something that functions like a broad, flexible understanding of the world as expressed through text. It does not think or reason the way a human does, but it has seen so many examples of human reasoning written down that it can produce outputs that look remarkably similar.
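The "predict the next token" objective can be illustrated with a toy model that simply counts which word follows which. This is only a sketch of the objective, not of how large language models are actually trained: real models use neural networks adjusted by gradient-based corrections and split text into subword tokens, while the whitespace tokenizer, the tiny corpus, and the function names below are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each token, which tokens follow it in the training text."""
    tokens = text.lower().split()  # crude whitespace tokenizer (an assumption)
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequently observed next token, or None if unseen."""
    followers = counts.get(token.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up training text; a real model sees vastly more.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))    # "cat" follows "the" most often here
print(predict_next(model, "zebra"))  # never seen in training, so None
```

A counting model like this only looks one word back and cannot handle anything it has not seen. The transformer architecture and the scale of training are what let a large language model use the whole preceding passage and generalize beyond its training text.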
What makes large language models significant is their versatility. The same model that can turn a rough idea into a polished essay can also write Python code, translate between languages, answer questions, draft emails, and hold a coherent conversation — without being built specifically for any of those tasks. This general-purpose capability is what separates them from earlier, narrower AI systems that were built to do one thing well. ChatGPT, Claude, and Gemini are all built on large language models, which is why they feel so different from the chatbots and voice assistants that came before them.
Real-world example
In one sitting, a startup founder asks the same large language model to sharpen a clunky product description, translate it into German for a new market, write the spreadsheet formula that totals last quarter's sales, and draft a careful reply to an investor's email. It moves from polishing prose to translation to code to diplomacy without being switched out or retrained between tasks. That range from a single system is exactly what sets a large language model apart from the narrow, single-purpose tools that came before it.
Frequently asked questions
What is the difference between a large language model and ChatGPT?
A large language model is the underlying technology — the AI system trained on text that can understand and generate language. ChatGPT is a product built on top of one, specifically OpenAI's GPT series of large language models. Think of the large language model as the electricity and ChatGPT as a specific appliance that runs on it. Claude, Gemini, and other AI assistants are also products built on their own large language models.
Do large language models actually understand what they are saying?
This is genuinely debated. Large language models do not understand language the way humans do — they have no lived experience, no awareness, and no intentions. What they do have is an extraordinarily detailed statistical picture of how language works, built from processing more text than any person could read in a thousand lifetimes. Whether that constitutes understanding depends on how you define the word, and reasonable people — including AI researchers — disagree.
How is a large language model different from a search engine?
A search engine finds and retrieves existing content — it returns links to pages where the answer might live. A large language model generates a response based on patterns it learned during training. By default, it is not fetching a page from the internet — it is constructing an answer from what it has internalized. That said, many modern AI tools connect the underlying model to live web search to pull in fresh information. Even then, it is the language model that reads those results and composes the response, rather than simply returning a list of links.