Token

Beginner | Natural Language Processing

A token is a small chunk of text (often a whole word, a piece of a word, or a punctuation mark) that an AI language model reads and generates. These models work in tokens rather than in letters or whole sentences.

What is a token?

When you send text to an AI language model, it doesn't read it the way you do, letter by letter or word by word. First the text goes through a step called tokenization, where it is chopped into tokens: short pieces drawn from a fixed inventory (the vocabulary) the model was built with. Some tokens are whole common words like "the" or "dog." Others are fragments of longer or rarer words, so "unbelievable" might be split into something like "un," "believ," and "able." Punctuation marks usually get their own tokens, and a space is often folded into the token for the word that follows it rather than standing alone. The model then does all of its reading, reasoning, and writing in these units: it takes in a sequence of tokens and produces its response one token at a time, and those tokens are joined into the text you finally see.
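To make the chopping step concrete, here is a minimal Python sketch of a toy tokenizer. It is an illustration only: the vocabulary is hypothetical, and the greedy longest-match rule is a simplification. Production tokenizers learn their vocabularies from large amounts of text and use more involved algorithms, and many attach a space to the word that follows it instead of treating it as a separate piece.

```python
# Hypothetical toy vocabulary, chosen only for this illustration.
VOCAB = {"the", "dog", "un", "believ", "able", " ", ".", ","}


def tokenize(text, vocab=VOCAB):
    """Split text into vocabulary pieces using greedy longest-match.

    Any character the vocabulary does not cover becomes a token of its
    own, so the function never fails on unseen input.
    """
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # No vocabulary piece matched: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
print(tokenize("the dog."))      # ['the', ' ', 'dog', '.']
```

Joining the returned pieces always reproduces the input exactly, which is the property that lets a model turn its output tokens back into readable text.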

Why break text up like this? It's a practical compromise. Treating every possible whole word as its own unit would need an impossibly huge vocabulary and still trip over new or invented words. Going all the way down to individual letters would make sequences painfully long and strip out useful structure. Tokens sit in the sweet spot — a manageable set of reusable pieces that can spell out anything, including words the model has never seen, by combining fragments. This is invisible to you as a user, but it's happening behind every single interaction with a chatbot.

Tokens matter in everyday ways even if you never think about them directly. They are the unit by which these systems are usually measured and priced: an AI service typically bills per token and limits how many tokens it can handle at once, so "how long is this text" really means "how many tokens is it." As a rough rule of thumb, a token averages around four characters of English, which works out to roughly three-quarters of a word, so a 1,000-word document is very approximately 1,300 tokens. The exact count varies with the words, with the tokenizer, and, importantly, with the language. Because tokenizer vocabularies are typically built from mostly English text, text in other languages often gets chopped into smaller fragments, so the very same sentence can use noticeably more tokens outside English. The gap tends to be smaller for languages written in the Latin alphabet, such as Spanish, and larger for languages such as Japanese or Arabic, where it can reach two to three times as many tokens, sometimes more, depending on the tokenizer. In practice that means non-English users can face higher costs and shorter effective length limits for what is, conceptually, the identical request. Tokens also quietly explain some of the odd gaps in what these models can do, which is easiest to see with a concrete example.
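The rule-of-thumb arithmetic can be written out as a short Python sketch. The two ratios come from the rule of thumb for English; the price, the length limit, and the function names are hypothetical placeholders rather than figures from any real service, and an exact count would require running the specific model's own tokenizer.

```python
import math

# Rule-of-thumb ratios for English text; real counts vary by tokenizer.
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_words(word_count):
    """Rough token estimate from a word count (English only)."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def estimate_tokens_from_chars(char_count):
    """Rough token estimate from a character count (English only)."""
    return math.ceil(char_count / CHARS_PER_TOKEN)


def estimate_cost(tokens, price_per_million_tokens):
    """Cost of a request, given a price quoted per million tokens."""
    return tokens * price_per_million_tokens / 1_000_000


def fits_in_limit(tokens, token_limit):
    """True if the text fits within a model's token limit."""
    return tokens <= token_limit


tokens = estimate_tokens_from_words(1000)
print(tokens)  # 1334, i.e. "very approximately 1,300"

# Hypothetical price and limit, for illustration only.
print(estimate_cost(tokens, price_per_million_tokens=2.0))
print(fits_in_limit(tokens, token_limit=8000))  # True
```

For non-English text the same word count can map to noticeably more tokens, so an estimate built on these English ratios should be treated as a lower bound.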

Real-world example

Ask a chatbot a seemingly trivial question, "How many times does the letter s appear in Mississippi?", and you may be surprised to see a confident, wrong answer (the correct answer is four). The reason traces straight back to tokens. The model never saw "Mississippi" as eleven separate letters; it saw it as a couple of tokens, perhaps something like "Miss" and "issippi." Counting individual letters means peering inside those chunks, at a level the model doesn't naturally work at, so letter-by-letter tasks such as counting characters, spelling a word backward, or other wordplay are exactly where an otherwise brilliant system can fumble. It isn't that the model is bad at the alphabet; it's that it reads in tokens, not letters.
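The contrast can be sketched in a few lines of Python. With direct access to the characters, counting and reversing are trivial. The second half mimics the model's view under stated assumptions: the split into "Miss" and "issippi" and the numeric IDs are hypothetical, since the real split depends on the tokenizer in use.

```python
word = "Mississippi"

# Character-level view: what a person (or ordinary code) works with.
print(len(word))        # 11
print(word.count("s"))  # 4
print(word[::-1])       # ippississiM

# Token-level view: a hypothetical split with made-up ID numbers.
tokens = ["Miss", "issippi"]
token_ids = {"Miss": 1001, "issippi": 1002}

# The model receives only the IDs, which carry no visible spelling.
model_input = [token_ids[t] for t in tokens]
print(model_input)      # [1001, 1002]

# Counting letters requires knowing how each token is spelled,
# which is information the IDs themselves do not expose.
s_per_token = {t: t.count("s") for t in tokens}
print(s_per_token)                # {'Miss': 2, 'issippi': 2}
print(sum(s_per_token.values()))  # 4

# Reversing the token order is not the same as reversing the letters.
print("".join(reversed(tokens)))  # issippiMiss
```

The last line shows why reversal goes wrong so easily: rearranging whole tokens produces "issippiMiss", while the correct letter-by-letter reversal is "ippississiM".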

Frequently asked questions

What is a token in ChatGPT and other AI models?

It's the basic unit of text the model works with — usually a word, a part of a word, or a punctuation mark. Before a model can process your message, the text is broken into these tokens, and the model both reads and writes in them, producing its reply one token at a time. You never see this directly, but everything you type and everything the model says back is, under the hood, a sequence of tokens.

How many tokens are in a word?

On average, a little more than one. A common rule of thumb for English is that a token is about four characters, or roughly three-quarters of a word — so 100 words land somewhere around 130 tokens. Short, common words may be a single token, while long or unusual ones get split into several. It also varies a lot by language — text in languages other than English is often split into more tokens, so the same meaning can cost more and fill a model's limit faster outside English. The reason anyone cares is practical: because AI services usually price and limit usage by the token, the token count is what determines cost and whether a long text fits within a model's limit.

Why do AI models struggle to count letters or spell words backward?

Because they read in tokens, not individual letters. A word is often a single token or a few tokens, not a tidy row of separate characters, so anything that requires looking inside a token to work with its letters goes against the grain of how the model processes text. That's why a system that can write you an essay might miscount the letters in a word or mangle a request to reverse it. It's a quirk of tokenization, not a sign that the model is generally unreliable.