Explainer: What is OCR, and how does it work?

Optical character recognition is the process of converting an image of text into a machine-readable text format.

The technology was developed because text editors cannot edit, search or count the words in image files.

OCR is particularly relevant as increasing digitalization requires businesses to receive information from print media, which is traditionally harder to store and manage. This includes scans of identity documents, such as passports or driver’s licenses, which also include photos that can be used for biometric identity binding.

Scanning images via OCR reduces the need for manual data entry and converts text images into text data that can later be analyzed by other business software.

Companies can use the data to conduct analytics, streamline operations, automate processes and enhance productivity.

How does OCR work?

OCR systems comprise both hardware and software components. The hardware is used to physically scan the document, while the software takes care of the analysis of the characters and their translation into machine-readable text.

From a technical standpoint, OCR software first converts the scanned image, or bitmap, into a two-color (usually black-and-white) version. It then analyzes the bitmap for light and dark areas. The dark areas are identified as characters to be recognized, while the light areas are classified as background and excluded from further processing.
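The two-color conversion can be illustrated with a minimal sketch. It assumes a grayscale scan stored as rows of 0 to 255 intensity values and a fixed cutoff of 128. The `binarize` function, the cutoff, and the sample pixel values are all hypothetical, and real OCR software may choose its threshold adaptively.

```python
def binarize(gray, threshold=128):
    """Return a bitmap: 1 for dark (ink) pixels, 0 for light (background) pixels.

    `threshold` is an assumed fixed cutoff, not a value taken from any product.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# Hypothetical 4x5 grayscale scan of a short stroke (0 = black, 255 = white).
scan = [
    [250, 245, 30, 240, 255],
    [248, 20, 25, 243, 251],
    [252, 247, 28, 239, 250],
    [249, 244, 22, 241, 253],
]

bitmap = binarize(scan)

# Show dark pixels as '#' and background as '.'.
for row in bitmap:
    print("".join("#" if v else "." for v in row))
```

Only the pixels marked 1 would be passed on to character recognition. The rest are treated as background.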

The dark areas are analyzed to find either alphabetic letters or numeric digits. This part of the process typically targets characters individually and identifies them using one of two types of algorithms: pattern matching or feature extraction.

Pattern matching isolates a character image (called a glyph) and compares it with stored glyphs. Notably, pattern matching works only when a stored glyph has a similar font and scale to the input glyph. Because of this, the method works best with scanned images of documents that rely on standard fonts.
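As a rough illustration of pattern matching, the sketch below compares an input glyph pixel by pixel with a small set of stored glyphs and picks the closest one. The 5x3 templates, the `similarity` score, and the `match_glyph` helper are assumptions made for the example. They are not taken from any particular OCR product.

```python
# Hypothetical stored glyphs: 5 rows x 3 columns, '1' = dark pixel, '0' = background.
TEMPLATES = {
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def similarity(a, b):
    """Fraction of pixels that agree between two glyph bitmaps of the same size."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def match_glyph(glyph, templates=TEMPLATES):
    """Return the character whose stored glyph agrees with the input on most pixels."""
    return max(templates, key=lambda ch: similarity(glyph, templates[ch]))


# A 'T' with one stray dark pixel, as might come from a noisy scan.
noisy_t = ["111", "010", "010", "011", "010"]
print(match_glyph(noisy_t))
```

Because the comparison is pixel by pixel, the input has to be the same size and roughly the same shape as the stored glyph, which is why this approach depends on matching font and scale.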

The second type of algorithm uses feature extraction, a method that breaks down the glyphs into features such as lines, closed loops, line direction, and line intersections. These features are then used to find the best match among the stored glyphs.
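Feature extraction can be sketched in a similar way. The example below reduces each glyph to three simple features and matches on those rather than on raw pixels. The features are the number of closed loops and the number of dark strokes crossed by the middle row and by the middle column. This choice of features, the reference glyphs, and the function names are assumptions for illustration. Production systems use richer feature sets.

```python
def count_loops(glyph):
    """Count closed loops: background regions fully enclosed by dark pixels."""
    h, w = len(glyph), len(glyph[0])
    blank = [0] * (w + 2)
    # Pad with a background border so the outer background is one connected region.
    grid = [blank[:]] + [[0] + [int(c) for c in row] + [0] for row in glyph] + [blank[:]]
    seen = [[False] * (w + 2) for _ in range(h + 2)]

    def fill(r0, c0):
        # Iterative flood fill over 4-connected background pixels.
        stack = [(r0, c0)]
        seen[r0][c0] = True
        while stack:
            r, c = stack.pop()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < h + 2 and 0 <= nc < w + 2
                        and not seen[nr][nc] and grid[nr][nc] == 0):
                    seen[nr][nc] = True
                    stack.append((nr, nc))

    fill(0, 0)  # mark the outer background
    loops = 0
    for r in range(h + 2):
        for c in range(w + 2):
            if grid[r][c] == 0 and not seen[r][c]:
                loops += 1  # unvisited background must be enclosed
                fill(r, c)
    return loops


def dark_runs(pixels):
    """Number of separate dark strokes crossed along one line of pixels."""
    bits = [int(p) for p in pixels]
    return sum(1 for i, b in enumerate(bits) if b and (i == 0 or not bits[i - 1]))


def features(glyph):
    """Feature vector: (closed loops, strokes on middle row, strokes on middle column)."""
    mid_row = glyph[len(glyph) // 2]
    mid_col = [row[len(row) // 2] for row in glyph]
    return (count_loops(glyph), dark_runs(mid_row), dark_runs(mid_col))


# Hypothetical stored glyphs (5x3); only their feature vectors are kept.
REFERENCE = {
    "O": ["111", "101", "101", "101", "111"],
    "8": ["111", "101", "111", "101", "111"],
    "1": ["010", "010", "010", "010", "010"],
    "C": ["111", "100", "100", "100", "111"],
}
STORED = {ch: features(g) for ch, g in REFERENCE.items()}


def classify(glyph):
    """Return the stored character whose feature vector is closest to the input's."""
    f = features(glyph)
    return min(STORED, key=lambda ch: sum(abs(a - b) for a, b in zip(f, STORED[ch])))


# A larger 7x5 'O': a different scale from the stored 5x3 glyph.
big_o = ["11111", "10001", "10001", "10001", "10001", "10001", "11111"]
print(features(big_o), classify(big_o))
```

In this sketch the larger "O" yields the same feature vector as the stored one, even though a pixel-by-pixel comparison between the two sizes would not be possible.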

After analysis, the system converts the extracted text data into a digital file. The file can also be used to automate the completion of forms. 

Companies using the technology in conjunction with biometrics include OCR Labs, Datatang and Smart Engines.
