Explainer: What is OCR, and how does it work?


Optical character recognition is the process of converting an image of text into a machine-readable text format.

The technology was invented because text editors cannot edit, search or count the words in image files.

OCR is particularly relevant as increasing digitalization requires businesses to receive information from print media, which is traditionally harder to store and manage. This includes scans of identity documents, such as passports or driver’s licenses, which also include photos that can be used for biometric identity binding.

Scanning images via OCR eliminates manual intervention and enables the conversion of text images into text data that can later be analyzed by other business software. 

Companies can use the data to conduct analytics, streamline operations, automate processes and enhance productivity.

How does OCR work?

OCR systems comprise both hardware and software components. The hardware is used to physically scan the document, while the software takes care of the analysis of the characters and their translation into machine-readable text.

From a technical standpoint, OCR software transforms the document into a two-color (usually black-and-white) version. The scanned image, or bitmap, is subsequently analyzed for light and dark areas, with the latter identified as characters to be recognized. In contrast, the former areas are classified as background and therefore excluded from further processing. 
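The two-color conversion can be illustrated with a minimal sketch. It assumes a grayscale image stored as a grid of values from 0 (black) to 255 (white) and a fixed cutoff of 128; the function name `binarize` and the cutoff are illustrative choices, and production OCR software may pick its threshold quite differently.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel to 1 (dark, candidate character) or 0 (light, background).

    The fixed threshold of 128 is an assumption made for this sketch only.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny hypothetical scan: mostly light paper with a few dark ink pixels.
gray = [
    [250, 245, 30, 240],
    [248, 20, 25, 243],
    [251, 247, 28, 239],
]

bitmap = binarize(gray)
for row in bitmap:
    print("".join("#" if px else "." for px in row))
```

Only the pixels marked 1 would be passed on to character recognition; the rest are treated as background.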

The dark areas are analyzed to find either alphabetic letters or numeric digits. This part of the process typically targets characters individually and identifies them using one of two types of algorithms: pattern matching or feature extraction.

Pattern matching isolates a character image (called a glyph) and compares it with a stored glyph. Notably, pattern matching works only when the stored glyph has a similar font and scale to the input glyph. Because of this, the method works best with scanned images of documents that rely on standard fonts.
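As a rough sketch of pattern matching, the example below scores an input glyph against a few stored glyphs by counting the pixels that agree. The 3x5 templates, the names `TEMPLATES`, `similarity` and `match_glyph`, and the scoring rule are all assumptions made for illustration. Note that the comparison requires identical dimensions, which mirrors the font and scale limitation described above.

```python
# Hypothetical stored glyphs: rows of "1" (ink) and "0" (background).
TEMPLATES = {
    "I": ["010", "010", "010", "010", "010"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def similarity(a, b):
    """Return the fraction of pixels on which two same-sized glyphs agree."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("glyphs must have the same dimensions")
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def match_glyph(glyph, templates=TEMPLATES):
    """Return the name of the stored glyph that agrees with the input on most pixels."""
    return max(templates, key=lambda name: similarity(glyph, templates[name]))


# A "T" with one missing ink pixel at the bottom, as a noisy scan might produce.
noisy_t = ["111", "010", "010", "010", "000"]
print(match_glyph(noisy_t))
```

A glyph scanned at a different size could not be compared at all without first being rescaled, which is why this approach favors documents set in standard fonts.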

The second type of algorithm uses feature extraction, a method that breaks down the glyphs into features such as lines, closed loops, line direction, and line intersections. These features are then used to find the best match among the stored glyphs.
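One of the features mentioned above, closed loops, can be sketched as follows. The example counts background regions that are completely enclosed by ink and then picks the stored glyph whose loop count is closest. The function names, the `STORED_LOOP_COUNTS` table and the use of a single feature are simplifying assumptions; a real system would combine several features.

```python
def count_closed_loops(glyph):
    """Count background regions fully enclosed by ink (for example, 1 for an 'O')."""
    # Pad with a background border so everything outside the glyph is one region.
    width = len(glyph[0]) + 2
    grid = [[0] * width]
    grid += [[0] + [int(ch) for ch in row] + [0] for row in glyph]
    grid += [[0] * width]
    height = len(grid)
    seen = [[False] * width for _ in range(height)]

    def flood(r0, c0):
        # Mark every background pixel reachable from (r0, c0) via 4-neighbors.
        stack = [(r0, c0)]
        seen[r0][c0] = True
        while stack:
            r, c = stack.pop()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < height and 0 <= nc < width
                        and not seen[nr][nc] and grid[nr][nc] == 0):
                    seen[nr][nc] = True
                    stack.append((nr, nc))

    flood(0, 0)  # the outer background is not a loop
    loops = 0
    for r in range(height):
        for c in range(width):
            if grid[r][c] == 0 and not seen[r][c]:
                loops += 1
                flood(r, c)
    return loops


# Hypothetical stored feature values for three characters.
STORED_LOOP_COUNTS = {"L": 0, "O": 1, "8": 2}


def match_by_loops(glyph, stored=STORED_LOOP_COUNTS):
    """Return the stored character whose loop count is closest to the glyph's."""
    loops = count_closed_loops(glyph)
    return min(stored, key=lambda name: abs(stored[name] - loops))


small_o = ["111", "101", "111"]
large_o = ["11111", "10001", "10001", "10001", "11111"]
print(match_by_loops(small_o), match_by_loops(large_o))
```

Because the loop count does not depend on the glyph's size, the small and large rings are matched to the same character, something the pixel-by-pixel comparison used in pattern matching cannot do directly.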

After analysis, the system converts the extracted text data into a digital file. The file can also be used to automate the completion of forms. 

Companies using the technology in conjunction with biometrics include OCR Labs, Datatang and Smart Engines.


