February 21, 2017 -
With the growing adoption of voice biometrics for identification and authentication, researchers are investigating the factors that cause users’ voices to change over time.
Pindrop has released new research revealing that voices change significantly with age, even in the short term, which makes voice biometrics-based authentication more difficult.
One of the greatest challenges of authenticating an aging voice is that every individual’s voice ages uniquely and at a different rate, which means there is no universally accepted factor that can be used in a known authentic recording to counterbalance the aging aspect.
“Voice biometrics aren’t accurate enough on their own. You have to add other factors like spoofing detection and phoneprinting,” said Dr. Elie Khoury, a principal research scientist at Pindrop.
Khoury, who completed a long-term research study on voice aging, recently delivered a presentation on his results at the RSA Conference in San Francisco.
Unlike fingerprints and irises, which remain relatively stable over time, a user’s changing voice can directly affect scoring models and result in false acceptances or rejections.
In a two-year study of 122 participants who were native speakers of English, Dutch, French, German, Spanish, and Italian, Khoury concluded that the equal error rate (EER) in identifying speakers increased drastically over time.
The study found that the EER almost doubled over the two years, and showed that more than one trait in a speaker’s voice changes over time.
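For readers unfamiliar with the metric, the EER is conventionally the operating point at which the rate of false acceptances equals the rate of false rejections. The following is a minimal sketch of how it could be estimated from two lists of match scores. The function name and the sample scores are hypothetical and are not taken from Khoury's study.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from match scores (higher score = better match).

    genuine:  scores from true-speaker trials
    impostor: scores from wrong-speaker trials
    Returns (eer, threshold) at the point where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest.
    """
    best = None
    # Try every observed score as a candidate acceptance threshold.
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # true speakers rejected
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.6]
impostor_scores = [0.1, 0.2, 0.3, 0.65]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(eer, threshold)
```

If aging pushes genuine scores down toward the impostor range, the two error rates meet at a higher value, which is what a rising EER reflects.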
“There’s a change in the pitch and the speed of the speech. When you compute the score, it will decrease slowly over time,” Khoury said. “That’s what’s risky for voice biometrics. The score should remain as high as possible for a match. Aging can make false detection or rejection go up over time. And the pitch will change multiple times during a lifetime.”
Aside from age, several other factors can contribute to variation over time. The speaker’s emotional state, stress levels, health, and vocal effort can all affect the accuracy of voice authentication, Khoury said.
As such, researchers are developing ways to compensate for these factors in order to improve the accuracy of voice models.
One method to improve accuracy is to adjust the acceptance threshold based on the amount of time that has passed between tests.
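A time-dependent threshold of this kind might look like the sketch below. The linear form, the per-year relaxation of 0.02, and the floor of 0.5 are illustrative assumptions. The article does not say how the adjustment was computed.

```python
def aging_adjusted_threshold(base, years_elapsed, relax_per_year=0.02, floor=0.5):
    """Lower the acceptance threshold as time since enrollment grows.

    All parameters are hypothetical. The floor keeps the threshold from
    dropping so far that impostors are accepted easily.
    """
    if years_elapsed < 0:
        raise ValueError("years_elapsed must be non-negative")
    return max(floor, base - relax_per_year * years_elapsed)


def accept(score, base, years_elapsed):
    """Accept the speaker if the score clears the adjusted threshold."""
    return score >= aging_adjusted_threshold(base, years_elapsed)


# A score of 0.78 fails a 0.80 threshold on enrollment day,
# but passes once two years have elapsed (threshold relaxed to about 0.76).
print(accept(0.78, base=0.80, years_elapsed=0))
print(accept(0.78, base=0.80, years_elapsed=2))
```

The trade-off is that a looser threshold also admits more impostors, so any such relaxation would need to be bounded.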
Khoury said updating a model frequently can help compensate for the effects of voice aging. By studying more than 400 recordings of Barack Obama’s public speeches from the beginning of his first term through the end of his second, Khoury found that reconfiguring the biometric model significantly lowered the impact that voice aging had on the score.
“You can update the model with each new recording, but that’s risky if someone is able to attack the system and compromise the model,” Khoury said.
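One possible way to limit that risk is to update the model only when a new recording matches it very closely. The sketch below is an assumption for illustration, not a method Khoury describes. It treats the voice model as a plain list of numbers, scores with cosine similarity, and blends in new samples with a hypothetical update threshold of 0.9 and blend rate of 0.1.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def maybe_update(model, sample, update_threshold=0.9, rate=0.1):
    """Blend a new sample into the model only if it matches closely.

    Returns (model, score, updated). A sample scoring below the update
    threshold leaves the model untouched, so a poor or spoofed recording
    cannot drag the model toward an attacker's voice.
    """
    score = cosine(model, sample)
    if score < update_threshold:
        return model, score, False
    blended = [(1 - rate) * m + rate * s for m, s in zip(model, sample)]
    return blended, score, True


# Toy two-dimensional "voice models" for illustration only.
model = [1.0, 0.0]
model, score, updated = maybe_update(model, [1.0, 0.1])  # close match: blended in
print(model, updated)
model, score, updated = maybe_update(model, [0.0, 1.0])  # poor match: ignored
print(model, updated)
```

A small blend rate lets the model follow gradual aging while limiting how far any single recording can shift it.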