Optical Character Recognition (OCR)

Beginner · Computer Vision

Last updated June 11, 2026

What is Optical Character Recognition in simple terms?

In simple terms, optical character recognition turns a picture of words into real text you can copy and search. It is like teaching a camera to read: it takes a photo of a page and gives back editable words.

What is Optical Character Recognition?

Optical character recognition is technology that converts images of text — such as scanned documents, photos, or printed pages — into machine-readable text that a computer can search, edit, and store.

Optical character recognition (OCR) is technology that reads text from images. To a computer, a photo of a page is just colored dots — it has no idea those shapes are letters and words. OCR bridges that gap, looking at the image and working out which characters the marks represent, then handing back actual text that can be searched, copied, edited, and stored. It's the difference between a picture of a document, which a computer can only display, and the document's words, which a computer can actually use.

The task sounds simple but has real pitfalls. Fonts vary enormously; text can be skewed, faded, or photographed at an angle; paper can be stained or creased; and handwriting is harder to read than print. OCR has to find where the text is in the image, separate it from backgrounds and pictures, recognize each character despite all this variation, and often use language knowledge to fix likely errors, knowing, for instance, that a word is more likely 'invoice' than a similar-looking jumble such as 'inv0ice'. Modern OCR uses deep learning, which has made it far more accurate on messy real-world images, to the point of pulling text out of street photos and off curved surfaces.
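
The "language knowledge" step can be illustrated with a small sketch. It assumes a tiny, made-up vocabulary and uses Python's standard difflib module to snap a misread word to the closest known word. Real OCR engines typically rely on much larger dictionaries or statistical language models, so treat this as an illustration of the idea rather than how any particular engine works.

```python
import difflib

# Hypothetical mini-vocabulary for illustration only; a real system would
# use a far larger dictionary or a statistical language model.
VOCABULARY = ["invoice", "total", "amount", "date", "payment"]


def correct_word(raw_word, vocabulary=VOCABULARY, cutoff=0.8):
    """Snap a raw OCR token to the closest known word, if one is close enough."""
    matches = difflib.get_close_matches(
        raw_word.lower(), vocabulary, n=1, cutoff=cutoff
    )
    # Leave the token untouched when nothing in the vocabulary is similar.
    return matches[0] if matches else raw_word


print(correct_word("inv0ice"))  # zero misread in place of the letter 'o'
print(correct_word("am0unt"))
print(correct_word("xyzzy"))    # no close match, so it is returned as-is
```

The cutoff controls how aggressive the correction is: set it too low and genuine but unusual words (names, part numbers) get "corrected" into something wrong.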

Optical character recognition is a long-standing and immensely practical branch of computer vision, and it's often the first link in a longer chain — once a document's text has been extracted, other tools can translate it, search it, or pull key facts from it. It's what lets organizations digitize mountains of paper records, turns photographed receipts and forms into usable data, reads text aloud for accessibility, and powers the live translation of signs and menus through a phone camera. Any time text trapped inside an image needs to become text a computer can work with, OCR is doing the reading.

Real-world example of Optical Character Recognition

A small law firm has filing cabinets full of decades of old typed contracts and case files — thousands of paper pages, completely unsearchable. When a lawyer needs to find every document mentioning a particular clause, the only option has been to dig through folders by hand for hours. So the firm scans the lot and runs optical character recognition over the page images. OCR reads each scanned page and converts the typed words into real, searchable text, while leaving the layout intact. Suddenly the entire archive is keyword-searchable: a lawyer types a phrase and instantly pulls up every contract that contains it, across decades of files. The paper hasn't changed, but the words on it are now text a computer can hunt through — which is the whole power of OCR.
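
Once the pages have been converted to text, the search itself is ordinary string matching. The sketch below assumes the OCR output is already available as a dictionary mapping file names to recognized text; the file names, contents, and the search_archive helper are all invented for illustration.

```python
def search_archive(pages, phrase):
    """Return the names of pages whose OCR text contains the phrase."""
    # Lowercase and collapse whitespace so capitalization and line breaks
    # in the OCR output do not hide a match.
    needle = " ".join(phrase.lower().split())
    hits = []
    for name, text in pages.items():
        normalized = " ".join(text.lower().split())
        if needle in normalized:
            hits.append(name)
    return sorted(hits)


# Hypothetical OCR output, keyed by scanned file name.
ocr_pages = {
    "contract_1987.png": "This agreement includes a NON-COMPETE\nclause binding both parties.",
    "lease_1994.png": "The tenant shall pay rent on the first day of each month.",
    "contract_2003.png": "No non-compete clause applies to this engagement.",
}

print(search_archive(ocr_pages, "non-compete clause"))
```

A production archive would more likely use a full-text search index than a linear scan, but the principle is the same: the search runs over the extracted text, not the images.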

Frequently asked questions about Optical Character Recognition

What is the difference between optical character recognition and object detection?

Both are computer vision tasks, but they look for different things in an image. Object detection finds physical objects (cars, people, animals) and draws a bounding box around each one. Optical character recognition finds and reads text, converting images of letters and words into machine-readable characters. One identifies things; the other identifies writing. They can even work together: a system might first detect that a sign or label is present, then use OCR to read the words on it. The defining job of OCR is turning pictured text into actual, usable text.

How does optical character recognition work?

It locates the text within an image, separates it from backgrounds and graphics, and then recognizes each character despite differences in font, size, angle, and image quality. It often applies language knowledge to correct likely mistakes, favoring real words over similar-looking nonsense. Modern OCR uses deep learning trained on huge numbers of text images, which has made it far more robust at handling messy, real-world inputs — skewed scans, photos, faded print, and even text on curved or cluttered surfaces — than the rigid pattern-matching systems of the past.
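
To make the locate, separate, and recognize steps concrete, here is a deliberately tiny sketch of the older, rigid pattern-matching style of OCR. It is not how modern deep-learning systems work. The three 3x5 letter templates, the fake "scan", and every function name are assumptions made up for this example.

```python
# Toy glyph templates (an assumption for illustration): 3x5 bitmaps, '#' = ink.
TEMPLATES = {
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}


def binarize(gray, threshold=128):
    """Turn grayscale pixels (0 = black, 255 = white) into 1 = ink, 0 = paper."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary):
    """Split the image into glyphs wherever a column contains no ink."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last span
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    return [[row[a:b] for row in binary] for a, b in spans]


def recognize(glyph):
    """Pick the template with the fewest mismatched pixels."""
    best_char, best_dist = "?", None
    for char, rows in TEMPLATES.items():
        if len(rows) != len(glyph) or len(rows[0]) != len(glyph[0]):
            continue  # this toy matcher only compares same-sized bitmaps
        dist = sum(
            (cell == "#") != bool(pixel)
            for t_row, g_row in zip(rows, glyph)
            for cell, pixel in zip(t_row, g_row)
        )
        if best_dist is None or dist < best_dist:
            best_char, best_dist = char, dist
    return best_char


def render(text, ink=30, paper=220):
    """Build a fake grayscale scan of `text` from the templates (demo input only)."""
    rows = [[paper] for _ in range(5)]  # left margin column
    for char in text:
        for y in range(5):
            rows[y] += [ink if c == "#" else paper for c in TEMPLATES[char][y]]
            rows[y].append(paper)  # one blank column after each glyph
    return rows


scan = render("LOT")
scan[2][6] = 40  # a dark smudge inside the 'O'
text = "".join(recognize(g) for g in segment_columns(binarize(scan)))
print(text)
```

The smudge flips one pixel, yet the 'O' template is still the closest match, so the word survives. The same design breaks down quickly with a new font, a tilted page, or touching letters, which is the kind of brittleness that learned models handle far better.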

What is optical character recognition used for?

It's used to turn text trapped in images into usable digital text. Organizations use it to digitize paper records, books, and archives so they become searchable; businesses use it to extract data from receipts, invoices, and forms automatically; banks use it to read cheques; and accessibility tools use it to read printed material aloud for people who are blind or have low vision. It also powers live translation apps that read signs and menus through a phone camera, and it's frequently the first step before text is translated, searched, or analyzed further.
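
As a sketch of what "extracting data" can look like after OCR has run, the example below pulls a date and a total out of recognized receipt text with regular expressions. The receipt contents, the field layout, and the extract_receipt_fields helper are assumptions for illustration; real receipts vary widely and usually need more forgiving rules.

```python
import re

# Hypothetical OCR output from a photographed receipt.
receipt_text = """CORNER CAFE
Date: 2024-03-15
Coffee      3.50
Sandwich    7.25
TOTAL      10.75
"""


def extract_receipt_fields(text):
    """Pull the date and total out of OCR'd receipt text (None if absent)."""
    # Assumes an ISO-style date and a line that starts with the word TOTAL.
    date = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)
    total = re.search(
        r"^TOTAL\s+(\d+\.\d{2})\s*$", text, flags=re.IGNORECASE | re.MULTILINE
    )
    return {
        "date": date.group(1) if date else None,
        "total": float(total.group(1)) if total else None,
    }


fields = extract_receipt_fields(receipt_text)
print(fields)
```

Returning None for a missing field, rather than guessing, makes it easy to route unclear receipts to a person for review.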