Pindrop introduces voice deepfake detection tool, tracks down Harris spoof source
Pindrop has announced the launch in preview of its new Pulse Inspect deepfake detection product.
A release says Pulse Inspect can detect AI-generated speech in any digital audio file with 99 percent accuracy, providing strong protection against voice cloning, voice deepfakes and other voice-based fraud techniques. The tool expands the Atlanta-based voice authentication firm’s Pulse product line, launched in February, to cover new use cases for media organizations, social media platforms, nonprofits and government agencies – significantly broadening its initial focus on call centers.
Pindrop has already demonstrated its deepfake detection bona fides, having identified the text-to-speech (TTS) engine used to generate an audio deepfake of Joe Biden that circulated in robocalls ahead of the New Hampshire primaries. Now it has also traced the source of a would-be parody deepfake of Vice President Kamala Harris, who is seeking to succeed Biden.
“Our source attribution system identified a popular open-source text-to-speech (TTS) system, TorToise, as the source,” says a blog post by Pindrop Chief Product Officer Rahul Sood. “TorToise exists on GitHub, HuggingFace and in frameworks like Coqui. It’s possible that a commercial vendor could be reusing TorToise in their system. It’s also possible that a user employed the open source version.”
“The critical need for robust deepfake detection mechanisms has come through loud and clear in my conversations with dozens of stakeholders across industries,” says Pindrop CEO Vijay Balasubramaniyan. “With Pulse Inspect, we’re empowering organizations to safeguard the integrity of their content, and bring trust back to the modern digital age.”
Pulse Inspect users can upload audio files via Pindrop, which analyzes them to determine whether they contain synthetic speech. Instant deepfake scores alert the user to issues and flag specific parts of the file that contain synthetic or deepfake artifacts. Sood says the tool’s liveness detection is “designed for continuous assessment, producing a segment-by-segment breakdown and analyzing for synthetic audio every 4 seconds.” This helps to identify audio clips that are only partially made up of falsified or synthetic speech.
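To illustrate how a segment-by-segment breakdown can surface a clip that is only partly synthetic, here is a minimal Python sketch. It is not Pindrop's API or model: the per-segment scores are made-up inputs standing in for a detector's output, and the names `Segment`, `segment_bounds` and `flag_segments`, along with the 0.5 threshold, are assumptions for illustration. Only the 4-second window comes from the description above.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

SEGMENT_SECONDS = 4.0  # window length quoted in the article


@dataclass
class Segment:
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    score: float  # hypothetical synthetic-speech score in [0, 1]


def segment_bounds(duration: float,
                   window: float = SEGMENT_SECONDS) -> List[Tuple[float, float]]:
    """Split [0, duration] into consecutive windows; the last may be shorter."""
    bounds = []
    start = 0.0
    while start < duration:
        end = min(start + window, duration)
        bounds.append((start, end))
        start = end
    return bounds


def flag_segments(duration: float,
                  scores: Sequence[float],
                  threshold: float = 0.5) -> List[Segment]:
    """Return the segments whose score meets the (assumed) threshold.

    `scores` holds one value per window, in order. A real detector would
    produce these; here they are supplied by the caller.
    """
    bounds = segment_bounds(duration)
    if len(bounds) != len(scores):
        raise ValueError("expected one score per segment")
    return [
        Segment(start, end, score)
        for (start, end), score in zip(bounds, scores)
        if score >= threshold
    ]


# A 14-second clip in which the second and fourth windows look synthetic.
flagged = flag_segments(14.0, [0.05, 0.92, 0.10, 0.81])
for seg in flagged:
    print(f"{seg.start:.0f}s to {seg.end:.0f}s: score {seg.score:.2f}")
```

Because each window is scored separately, a clip that splices a few seconds of cloned speech into an otherwise genuine recording still yields flagged time ranges, where a single whole-file score might average the evidence away.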
Pindrop’s deepfake detection AI model is trained on more than 350 deepfake generation tools, 20 million unique utterances and over 40 languages, covering more than 90 percent of languages spoken online.
In July, the company secured $100 million in debt financing from Hercules Capital to scale its customer base for deepfake detection and voice biometrics.