Explainer: What is OCR, and how does it work?
Optical character recognition is the process of converting an image of text into a machine-readable text format.
The technology was developed because text editors cannot edit, search or count the words in image files.
OCR is particularly relevant as increasing digitalization requires businesses to receive information from print media, which is traditionally harder to store and manage. This includes scans of identity documents, such as passports or driver’s licenses, which also include photos that can be used for biometric identity binding.
Processing scanned images with OCR reduces manual data entry and converts text images into text data that other business software can later analyze.
Companies can use the data to conduct analytics, streamline operations, automate processes and enhance productivity.
How does OCR work?
OCR systems comprise both hardware and software components. The hardware is used to physically scan the document, while the software takes care of the analysis of the characters and their translation into machine-readable text.
From a technical standpoint, OCR software first converts the scanned image, or bitmap, into a two-color (usually black-and-white) version. It then analyzes the bitmap for light and dark areas. The dark areas are identified as characters to be recognized, while the light areas are classified as background and excluded from further processing.
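The two-color conversion can be illustrated with a minimal sketch. It assumes the scan is available as a grid of grayscale values from 0 (black) to 255 (white) and uses a fixed cutoff of 128. Both the function name `binarize` and the cutoff are illustrative choices, and production systems often pick the threshold adaptively instead.

```python
def binarize(gray, threshold=128):
    """Return a bitmap where 1 marks dark (ink) pixels and 0 marks light background.

    `gray` is a list of rows of grayscale values (0 = black, 255 = white).
    The fixed threshold of 128 is an assumption made for this sketch.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny 3x4 grayscale "scan": low values are ink, high values are paper.
gray = [
    [250, 240, 30, 245],
    [248, 20, 25, 250],
    [252, 247, 35, 249],
]

bitmap = binarize(gray)
for row in bitmap:
    print(row)
```

Only the pixels marked 1 would be passed on to character recognition, while the pixels marked 0 are treated as background.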
The dark areas are analyzed to find either alphabetic letters or numeric digits. This part of the process typically targets characters individually and identifies them using one of two types of algorithms: pattern matching or feature extraction.
Pattern matching isolates a character image (called a glyph) and compares it with stored glyphs. Notably, pattern matching works only when the stored glyph has a similar font and scale to the input glyph. Because of this, the method works best with scanned images of documents that rely on standard fonts.
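As a rough illustration of pattern matching, the sketch below compares an input glyph pixel by pixel against a small set of stored glyphs and picks the closest one. The 3x3 templates, the names `TEMPLATES`, `similarity` and `match_glyph`, and the scoring rule (fraction of matching pixels) are all assumptions made for this example. Real systems use far larger glyphs and more robust scoring.

```python
# Hypothetical stored glyphs: "1" is ink, "0" is background.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def similarity(glyph, template):
    """Fraction of pixel positions where the two same-sized bitmaps agree."""
    total = sum(len(row) for row in glyph)
    same = sum(
        a == b
        for glyph_row, template_row in zip(glyph, template)
        for a, b in zip(glyph_row, template_row)
    )
    return same / total


def match_glyph(glyph, templates=TEMPLATES):
    """Return the name of the stored glyph most similar to the input."""
    return max(templates, key=lambda name: similarity(glyph, templates[name]))


# An "L" with one missing pixel, as might result from a noisy scan.
noisy_l = ["100", "100", "110"]
print(match_glyph(noisy_l))
```

Because the comparison is position by position, the input must have the same size and roughly the same shape as the stored glyph, which is why this approach depends on similar fonts and scale.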
The second type of algorithm uses feature extraction, a method that breaks down the glyphs into features such as lines, closed loops, line direction, and line intersections. These features are then used to find the best match among the stored glyphs.
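Feature extraction can be sketched in a similar way. The example below assumes glyphs drawn with one-pixel-wide strokes and reduces each glyph to three counts: closed loops, line endpoints and line intersections. The function names, the reference glyphs and the distance measure are hypothetical choices for illustration, not a description of any particular OCR engine.

```python
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected neighbors


def neighbors(glyph, r, c, symbol):
    """Yield in-bounds neighbor positions of (r, c) that hold `symbol`."""
    for dr, dc in STEPS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(glyph) and 0 <= nc < len(glyph[0]) and glyph[nr][nc] == symbol:
            yield nr, nc


def count_loops(glyph):
    """Count background regions that never reach the glyph border (closed loops)."""
    rows, cols = len(glyph), len(glyph[0])
    seen, loops = set(), 0
    for r in range(rows):
        for c in range(cols):
            if glyph[r][c] != "." or (r, c) in seen:
                continue
            stack, touches_border = [(r, c)], False
            seen.add((r, c))
            while stack:  # flood fill one background region
                cr, cc = stack.pop()
                if cr in (0, rows - 1) or cc in (0, cols - 1):
                    touches_border = True
                for pos in neighbors(glyph, cr, cc, "."):
                    if pos not in seen:
                        seen.add(pos)
                        stack.append(pos)
            if not touches_border:
                loops += 1
    return loops


def extract_features(glyph):
    """Return (closed loops, line endpoints, line intersections)."""
    endpoints = junctions = 0
    for r, row in enumerate(glyph):
        for c, pixel in enumerate(row):
            if pixel == "#":
                degree = sum(1 for _ in neighbors(glyph, r, c, "#"))
                endpoints += degree == 1
                junctions += degree >= 3
    return count_loops(glyph), endpoints, junctions


# Hypothetical reference glyphs; their features act as the stored set.
REFERENCE = {
    "O": ["###", "#.#", "###"],
    "L": ["#..", "#..", "###"],
    "T": ["###", ".#.", ".#."],
}
STORED = {name: extract_features(g) for name, g in REFERENCE.items()}


def classify(glyph):
    """Pick the stored glyph whose feature counts are closest to the input's."""
    features = extract_features(glyph)
    return min(STORED, key=lambda n: sum(abs(a - b) for a, b in zip(features, STORED[n])))


# A "T" drawn larger than the reference glyph.
big_t = ["#####", "..#..", "..#..", "..#.."]
print(extract_features(big_t), classify(big_t))
```

In this sketch the larger "T" yields the same feature counts as the small reference "T", which hints at why feature-based matching is less sensitive to scale than direct pixel comparison.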
After analysis, the system converts the extracted text data into a digital file. The file can also be used to automate the completion of forms.