New dataset could aid development of behavioral biometric, synthetic voices
A research group has built a dataset of the physical movements that create the sounds of speech, which could one day be used to develop speech recognition systems that synthesize voices for people with speech impairments. It could also lead to a new method of silent speech recognition, and even a new behavioral biometric, according to a paper published in the journal Scientific Data. The database was built using a combination of lip reading and analysis of facial movements.
The group asked 20 volunteers to utter a series of vowel sounds, words, and full sentences while their facial movements and voices were recorded.
The team used channel impulse response data from ultra-wideband (UWB) radio and frequency-modulated continuous-wave (FMCW) radars to capture the movement of the skin of participants’ faces as they spoke, as well as the movements of their tongues and voice boxes.
Researchers used a laser speckle detection system with a high-speed camera to capture vibrations on the surface of participants’ skin. An additional Kinect V2 depth-sensing camera was used to read the changing shapes of their mouths as they formed different sounds.
The research could one day mean that voice-controlled devices like smartphones could silently read users’ lips, improve the quality of calls in noisy environments, and authenticate banking and other sensitive applications by identifying a user’s unique facial movements. In other words, the actions that would for many users be analyzed as voice biometrics would instead be used to authenticate the individual based on the movements of their lips and face.
The database, built by analyzing 400 minutes of speech, will be made available for free to researchers to help with the development of such new technologies.
The research group includes researchers from the University of Dundee and University College London and used technology from the Communications, Sensing, and Imaging Hub at the University of Glasgow.
“Contactless sensing has huge potential for improving speech recognition and creating new applications in communications, healthcare and digital security,” says Professor Muhammad Imran, leader of the Hub, in an announcement.
“We’re keen to explore in our own research group here at the University of Glasgow how we can build on previous breakthroughs in lip-reading using multi-modal sensors and find new uses everywhere from homes to hospitals,” he continued.
Other research groups have worked on databases with voice biometrics to help those with speech impairments.
Article Topics
behavioral biometrics | biometrics research | face biometrics | lip motion | speech recognition | synthetic voice | voice biometrics