Researchers find major demographic differences in speech recognition accuracy

Research indicates that speech recognition technology from the world’s leading consumer technology brands is markedly less accurate for some demographic groups than for others, or, as some would say, is “biased” against black people.

A team of academics from Stanford University tested automated speech recognition (ASR) systems from Amazon, Apple, Google, IBM, and Microsoft for the paper “Racial disparities in automated speech recognition,” published in the Proceedings of the National Academy of Sciences. They found that the systems misidentified roughly 19 percent of words uttered by white people, while the word error rate (WER) for speech from black people was 35 percent. Audio snippets from white speakers were considered incomprehensible 2 percent of the time, compared with 20 percent of snippets from black speakers.

To analyze WER for different linguistic groups, the researchers drew on the Corpus of Regional African American Language (CORAAL) dataset, compiled in three U.S. communities, and samples from the Voices of California (VOC) dataset. Human experts transcribed interview snippets 5 to 50 seconds long, and their transcripts were compared with the output of the machine-learning algorithms from the above-mentioned tech giants.
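For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between the machine transcript and the human reference, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function name and the simple lowercase-and-split normalization are assumptions made for illustration, not the text cleaning the researchers applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; a real evaluation would need more careful normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Toy example (invented sentences, not data from the study):
# one dropped word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the machine transcript is much longer than the reference.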

The researchers propose increasing the diversity of training datasets, and including African American Vernacular English, to reduce performance differences.

Apple had the highest error rates for both datasets, and a WER discrepancy of more than 20 percentage points. Google and Microsoft had the smallest discrepancies, but both were still over 10 percentage points. Amazon’s WER for black speakers was equal to that of Google, but its algorithm was slightly more accurate for white speakers. Microsoft’s system was the only one with a WER for black people below 30 percent.

The findings also include some insight into geographic distribution, as speech collected from black speakers in rural and heavily urban settings (Princeville, North Carolina and Washington, D.C.) had higher error rates than speech collected in Rochester, NY.

The researchers explored two possible explanations for the differences: a gap in the lexicon and grammar of the language models used, such as black people using words not included in the ASR systems, and a performance gap in the systems’ acoustic models.

Words spoken by white and black people were found in the vocabulary of Google’s ASR 98.6 percent and 98.7 percent of the time, however, which argues against a gap in the lexicon as the main cause. When phrases with identical text were analyzed, the ASR technology made more errors with samples spoken by black speakers, indicating that differences in pronunciation and prosody, such as rhythm, pitch, syllable accenting, vowel duration, and lenition, may be behind the performance differences.
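The vocabulary figures above amount to a coverage ratio: the share of spoken words that appear in a system’s known vocabulary. A minimal Python sketch of that kind of check follows. The function name, the lowercase matching, and the toy word lists are hypothetical and are not drawn from the study, which had to infer Google’s vocabulary rather than read it from a list.

```python
def vocabulary_coverage(spoken_words, vocabulary):
    """Fraction of spoken word tokens that appear in the given vocabulary.

    Assumption: matching is case-insensitive and exact; no stemming or
    spelling normalization is applied.
    """
    vocab = {word.lower() for word in vocabulary}
    tokens = [word.lower() for word in spoken_words]
    if not tokens:
        raise ValueError("spoken_words must contain at least one token")
    covered = sum(1 for token in tokens if token in vocab)
    return covered / len(tokens)


# Toy example with invented data: three of four tokens are in the vocabulary.
known_words = {"he", "go", "home"}
print(vocabulary_coverage(["He", "finna", "go", "home"], known_words))
```

Near-identical coverage for two groups of speakers, as reported here, points away from missing words and toward the acoustic model.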

Bias has been a significant issue in facial biometrics, where NIST testing has shown that demographic differences in accuracy vary widely between vendors.

R7 Speech Sciences Co-founder Delip Rao explained in a blog post in 2018 that inherent physiological differences between men and women make it difficult to train AI speech recognition systems to perform as accurately with speech from women.

Voice and speech recognition are expected to make up a $26.8 billion market by 2025.
