Shrinking commands, ageing voices and microphones everywhere: Pindrop’s future for voice biometrics
“Where we want to be is to more or less reproduce banking as it was done 100 years ago when you went into your branch and the banker recognized you and you felt welcome,” said Nikolay Gaubitch, Director of Research at voice authentication specialist Pindrop, in an interview with Biometric Update.
“And you didn’t have to show any documents – they just knew that you are who you are and helped you straight away.”
Pindrop has recently enjoyed a spate of new partnerships with cloud-based call center platforms such as Odigo, Bandwidth and the Google Cloud Partner Advantage Program. While currently in the background of calls routed through customers’ centers, the company hopes for increased brand recognition as it diversifies and grows.
Gaubitch discussed how the company’s tech works now and how and where it might work in the future (“anything with a microphone”), in both the real world and the metaverse (“are you going to have your own voice?”).
Finding a voice
“Ten years ago we were a little start-up in Atlanta creating a market that didn’t exist,” said Gaubitch, who joined the firm seven years ago. It has now analyzed over five billion calls.
Pindrop became aware of the potential of deepfakes long before any headlines, said Gaubitch. The firm developed a large engine to detect them, which it could then roll into its products. Awareness of the kind of tech it offers has grown, along with a better understanding of the need for it.
The new partnerships are a natural progression, according to Gaubitch, as Pindrop has become a brand that giants are having to work with.
Artificial call center agents and chatbots could well come to dominate, but at the moment, Gaubitch believes people are more comfortable speaking to humans. It just needs to be managed securely and efficiently.
Low friction clouds
Call centers becoming cloud-based means they can integrate new services from a menu, rather than needing to install updates or upgrades on the premises. This is allowing companies such as Pindrop to learn from their customers and iterate their products.
Biometric voice authentication can lower caller friction by doing away with passwords and even having to repeat set phrases – which are also easier to hack, says Gaubitch. Passive voice biometrics can also speed up caller authentication by reducing the number of questions asked.
Call center security has evolved from one question, such as the caller’s mother’s maiden name, to three or four questions during the knowledge-based authentication (KBA) phase. “Turns out fraudsters are better at answering those questions,” said Gaubitch. His team found that KBA was particularly flawed when they discovered that 92 percent of fraudsters passed KBA questions, while genuine customers passed only 42 percent of the time.
Dropping the number of questions can reduce friction and call time, meaning efficiencies for call centers: “It might not sound like a lot, but when you have tens of millions of calls annually, it actually adds up.”
When fraudsters call
Voice biometrics are increasingly matched with phone printing – listening to acoustics other than the voice and to device indicators – along with behavioral biometrics such as how customers enter numbers on their handsets during a call, something that would be captured during enrollment.
Their systems have detected 104 million spoofed calls and saved clients more than US$2 billion in fraud costs.
So what happens when a fraudster calls? There is a range of APIs that call centers can deploy, says Gaubitch. A traffic-light system categorizes the risk of a call based on a score of 1 to 100 decided within the first 30 seconds.
Customers can determine the thresholds for risk categories and allow a back office of security staff to listen in to flagged calls, with a slight delay.
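As a rough illustration of how customer-set thresholds could map a 1-to-100 score onto traffic-light categories, here is a minimal Python sketch. It is not Pindrop's API: the names RiskThresholds and classify_call, the default cut-offs of 40 and 75, and the assumption that a higher score means higher risk are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskThresholds:
    # Hypothetical cut-offs chosen by the call center, not vendor defaults.
    amber_from: int = 40
    red_from: int = 75


def classify_call(score: int, thresholds: RiskThresholds = RiskThresholds()) -> str:
    """Map a 1-100 risk score to a traffic-light category.

    Assumes a higher score means a riskier call.
    """
    if not 1 <= score <= 100:
        raise ValueError("score must be between 1 and 100")
    if score >= thresholds.red_from:
        return "red"
    if score >= thresholds.amber_from:
        return "amber"
    return "green"


if __name__ == "__main__":
    # A stricter customer lowers both cut-offs, so more calls get flagged.
    strict = RiskThresholds(amber_from=20, red_from=50)
    for score in (10, 45, 80):
        print(score, classify_call(score), classify_call(score, strict))
```

In this sketch, a call center that wants more calls sent to its security back office would simply lower red_from.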
“There’s valuable intelligence on a fraud call,” says Gaubitch to explain why some organizations may operate a protocol to keep an identified fraudster on the line. Others choose to immediately cut them off.
Age and emotion
Pindrop is continuing to develop and roll out voice age verification tools. The main aim is not to establish that a caller is over a certain age, but the company’s bread and butter: fraud detection.
Running in the background, age estimation of the caller can be compared against the known age of the account holder. This is particularly important in protecting the accounts of the elderly who are heavily targeted by fraudsters.
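The comparison described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name age_mismatch and the 15-year tolerance are hypothetical, and the article does not say how Pindrop estimates age or how large a gap it treats as suspicious.

```python
def age_mismatch(estimated_age: float,
                 account_holder_age: int,
                 tolerance_years: float = 15.0) -> bool:
    """Return True when the caller's estimated age differs from the
    account holder's known age by more than the tolerance.

    The tolerance is a hypothetical value; age estimates from voice are
    approximate, so a real system would tune it on its own data.
    """
    return abs(estimated_age - account_holder_age) > tolerance_years


if __name__ == "__main__":
    # A caller who sounds about 30 on an 80-year-old's account is flagged.
    print(age_mismatch(30, 80))
    # A caller who sounds about 72 on the same account is not.
    print(age_mismatch(72, 80))
```

A mismatch on its own would not prove fraud; in this sketch it is one more signal that could feed into the overall risk score.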
Another use of age estimation is redirecting callers of certain ages to call handlers who specialize in that age range.
Pindrop also deploys software to predict voice ageing of enrolled customers. “Even within a few years you can detect these differences and a voiceprint may deteriorate over time, so what we have is something to correct for that,” says Gaubitch.
As with infant fingerprints, algorithms predict how a voice will age and check the caller’s voice against those forecasts.
“As a technology, it’s pretty immature,” says Gaubitch on emotion analysis, something other providers are offering call centers to detect aggression and dissatisfaction to try to improve handling and protect staff. “It’s just not reliable.”
Pindrop is also not yet prioritizing speech recognition, language detection or automated transcription tools. “We don’t need to know what language is being spoken to authenticate the voice,” says Gaubitch of their language-agnostic voice engine.
Voice biometrics everywhere
“We’re becoming more and more comfortable with having voice as the interface for all the devices around us,” explains Gaubitch. Pindrop is exploring the opportunities as people are more willing to use their voices to interact with technology.
“We have an entire stream of work on human to computer interactions which basically covers the whole IoT space, where you have conversations with voice assistants, for example,” says Gaubitch, “and wherever that expands.”
For secure use, devices will need to know which person is speaking, who they are and whether they are a real person. Developing their VoiceAPI for devices to meet these requirements has been particularly demanding.
“Obviously the conversation you have with personal assistants is very different to what you have with human beings, so you typically would give voice commands. In principle you could do a voice check on every command,” says Gaubitch. But the difference in the way we speak to objects required a lot of research for the team.
“We had to really improve our voice engine to be able to work with really short phrases,” says Gaubitch. Their authentication can now work with less than a second of audio, something the Director of Research still finds “superhuman.”
“The places where I see this being useful are the places where you have a natural conversation. So banks, and many already have a microphone there, so why not add that extra confidence?”
Door locks and intercoms could all use voice biometrics via the company’s APIs. While Pindrop does not plan to make physical devices, they are “getting that imagination going with companies that do.”
In the process, the company hopes for growing recognition of its own brand if not visually then aurally: “If every call started with ‘this call is protected by Pindrop’ that would be great.”
Beyond devices and the physical world, the researcher is thinking ahead to life in the metaverse: “the question is are you going to have your own voice there?” Will we keep our voices or generate new ones to match differing personas?
“I think there’ll be a lot of experimentation with these new technologies before we arrive at something that works for everyone.”