September 28, 2015 -
A research team at the University of Alabama at Birmingham has discovered that voice impersonation can be used to trick both automated and human verification in voice authentication systems.
The research was authored by UAB graduate students Dibya Mukhopadhyay and Maliheh Shirvanian, researchers in UAB’s Security and Privacy In Emerging computing and networking Systems (SPIES) Lab, along with Nitesh Saxena, Ph.D., the director of the SPIES Lab and associate professor of computer and information sciences at UAB.
The team recently presented the research — which explores how attackers equipped with audio samples of another person’s voice could compromise their security, safety and privacy — at the European Symposium on Research in Computer Security (ESORICS) in Vienna, Austria.
Using off-the-shelf voice-morphing software, the researchers developed a voice impersonation attack designed to breach automated and human verification systems.
“Because people rely on the use of their voices all the time, it becomes a comfortable practice,” said Saxena. “What they may not realize is that level of comfort lends itself to making the voice a vulnerable commodity. People often leave traces of their voices in many different scenarios. They may talk out loud while socializing in restaurants, giving public presentations or making phone calls, or leave voice samples online.”
A would-be attacker could obtain samples of a person’s voice in several ways: by recording while in close proximity to the speaker, making a spam call, scouring the Internet for audiovisual clips, or hacking into cloud servers that store audio data.
Software that automates speech synthesis, such as voice-morphing tools, allows attackers to create a near-duplicate of an individual’s voice from just a few audio samples.
The technology can then transform the attacker’s voice to state any arbitrary message in the voice of the victim.
“As a result, just a few minutes’ worth of audio in a victim’s voice would lead to the cloning of the victim’s voice itself,” said Saxena. “The consequences of such a clone can be grave. Because voice is a characteristic unique to each person, it forms the basis of the authentication of the person, giving the attacker the keys to that person’s privacy.”
In its research, the UAB team investigated the consequences of stealing voices in two voice authentication-dependent applications and scenarios.
The first application involved a voice biometrics system that uses the so-called unique features of a person’s voice for authentication purposes.
Researchers found that once the study’s participants were able to fool the voice biometrics system by using fake voices, they could gain full access to the device or service.
The second application explored how stealing voices affected human communications. Here the researchers used a voice-morphing tool to imitate Oprah Winfrey and Morgan Freeman in a controlled study environment.
The study’s participants were able to make the voice-morphing system speak nearly any phrase in the victim’s tone and vocal manner, launching an attack that could potentially jeopardize the victim’s reputation and security.
The results showed that automated verification algorithms were largely ineffective at blocking the attacks developed by the research team: the average rate of rejecting fake voices was below 10 to 20 percent for most victims.
In two online studies with about 100 participants, researchers found that participants rejected the morphed voice samples of celebrities as well as somewhat familiar users about half the time.
“Our research showed that voice conversion poses a serious threat, and our attacks can be successful for a majority of cases,” Saxena said. “Worryingly, the attacks against human-based speaker verification may become more effective in the future because voice conversion/synthesis quality will continue to improve, while it can be safely said that human ability will likely not.”
Saxena recommended a few ways people can keep their voices from being stolen, including becoming more aware of these potential attacks and being wary of uploading audio clips of their voices to social media.