Researchers find major demographic differences in speech recognition accuracy

Research indicates that speech recognition technology from the world’s leading consumer technology brands performs with different degrees of accuracy for different demographics or, as some would say, is “biased” against black people.

A team of academics from Stanford University tested automated speech recognition (ASR) systems from Amazon, Apple, Google, IBM, and Microsoft for the paper “Racial disparities in automated speech recognition,” published in the Proceedings of the National Academy of Sciences. They found that the systems misidentified roughly 19 percent of words uttered by white people, while the word error rate (WER) for speech of black people was 35 percent. Audio snippets from white speakers were considered incomprehensible 2 percent of the time, while the systems could not make sense of 20 percent of snippets from black speakers.

To analyze WER for different linguistic groups, the researchers drew on the Corpus of Regional African American Language (CORAAL) dataset, compiled in three U.S. communities, and samples from the Voices of California (VOC) dataset. Human experts transcribed interview snippets 5 to 50 seconds long, and their transcriptions were compared with the output of the machine-learning algorithms from the above-mentioned tech giants.
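
For readers unfamiliar with the metric, WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the machine transcript into the human reference, divided by the number of words in the reference. The sketch below illustrates that standard definition only; it is not the researchers' code, the function name `word_error_rate` is hypothetical, and the simple lowercasing and whitespace splitting are assumptions that stand in for whatever text normalization the study applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: transcripts are normalized only by lowercasing and
    splitting on whitespace; a real evaluation would do more.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sat on a mat")` counts one substitution against six reference words. Because insertions are counted too, WER can exceed 1.0 when the machine transcript is much longer than the reference.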

The researchers propose increasing the diversity of training datasets, and including African American Vernacular English, to reduce performance differences.

Apple had the highest error rates for both datasets, with a WER discrepancy of more than 20 percent. Google and Microsoft had the smallest discrepancies, though both still exceeded 10 percent. Amazon’s WER for black speakers equaled Google’s, but its algorithm was slightly more accurate for white speakers. Microsoft’s system was the only one with a WER below 30 percent for black speakers.

The findings also include some insight into geographic distribution, as speech collected from black speakers in rural and heavily urban settings (Princeville, North Carolina and D.C.) had higher error rates than speech collected in Rochester, NY.

The researchers explored two possible explanations for the differences: a gap in the lexicon and grammar of the language models used, such as black people using words not included in the ASR systems, and a performance gap in the systems’ acoustic models.

However, words spoken by white and black people were found in the vocabulary of Google’s ASR 98.6 percent and 98.7 percent of the time, suggesting that a lexicon gap does not account for the disparity. When phrases with identical text were analyzed, the ASR technology made more errors with samples spoken by black speakers, indicating that differences in pronunciation and prosody, such as rhythm, pitch, syllable accenting, vowel duration, and lenition, may be behind the performance differences.
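
The vocabulary check described above amounts to measuring what share of spoken words appear in a system's lexicon. A minimal sketch of that idea follows, assuming a lexicon is available as a plain set of words; the function name `vocabulary_coverage` and the sample data are hypothetical, and this is not how the study reconstructed Google's vocabulary.

```python
from typing import Iterable, Set


def vocabulary_coverage(spoken_words: Iterable[str], lexicon: Set[str]) -> float:
    """Share of spoken word tokens that appear in the lexicon.

    Assumption: matching is case-insensitive and exact, with no
    stemming or spelling normalization.
    """
    words = [word.lower() for word in spoken_words]
    if not words:
        raise ValueError("at least one spoken word is required")
    known = {entry.lower() for entry in lexicon}
    in_vocabulary = sum(1 for word in words if word in known)
    return in_vocabulary / len(words)


# Hypothetical data: three of the four spoken tokens are in the lexicon.
sample_lexicon = {"we", "going", "home"}
sample_speech = ["We", "finna", "going", "home"]
coverage = vocabulary_coverage(sample_speech, sample_lexicon)
```

If coverage is nearly identical for two groups of speakers, as the article reports, missing words are unlikely to explain a large gap in error rates, which shifts attention to the acoustic models.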

Bias has also been a significant issue in facial biometrics, where NIST testing has shown that differences in accuracy vary widely between vendors.

R7 Speech Sciences Co-founder Delip Rao explained in a blog post in 2018 that inherent physiological differences between men and women make it difficult to train AI speech recognition systems to perform as accurately with speech from women.

Voice and speech recognition are expected to make up a $26.8 billion market by 2025.
