Klick Labs develops deepfake detection method focusing on vocal biomarkers
The rise of deepfake audio technology poses significant threats across domains such as personal privacy, political manipulation, and national security. To address these risks, Toronto-based Klick Health, through its research arm Klick Labs, has developed a biometric method to distinguish between audio clips voiced by humans and those generated by artificial intelligence. The approach involves analyzing vocal biomarkers, features of the voice that can reveal information about the speaker’s health or physiology.
The team claims to have identified 12,000 biomarkers but currently relies on five key features: the length and variation of speech, the rates of micropauses and macropauses, and the overall proportion of time spent speaking versus pausing.
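Klick Labs has not published its exact feature definitions, so the following is only a minimal sketch of how such pause-timing features might be extracted from an audio clip. The silence threshold and the 0.5-second cutoff separating micropauses from macropauses are illustrative assumptions, and the librosa-based implementation is not the study’s own method.

```python
# Illustrative sketch only: the silence threshold (top_db) and the
# micro/macro pause cutoff (0.5 s) are assumptions for demonstration,
# not the feature definitions used by Klick Labs.
import numpy as np
import librosa


def pause_features(path, top_db=30, micro_max_s=0.5):
    """Compute simple speech/pause timing features from an audio file."""
    y, sr = librosa.load(path, sr=None, mono=True)
    total_s = len(y) / sr

    # Non-silent (voiced) intervals as (start, end) sample indices.
    voiced = librosa.effects.split(y, top_db=top_db)
    speech_s = sum(int(end - start) for start, end in voiced) / sr

    # Gaps between consecutive voiced intervals are treated as pauses.
    gaps_s = [(voiced[i + 1][0] - voiced[i][1]) / sr
              for i in range(len(voiced) - 1)]
    micropauses = [g for g in gaps_s if g < micro_max_s]
    macropauses = [g for g in gaps_s if g >= micro_max_s]

    return {
        "speech_length_s": speech_s,
        "micropause_rate_per_min": 60 * len(micropauses) / max(total_s, 1e-9),
        "macropause_rate_per_min": 60 * len(macropauses) / max(total_s, 1e-9),
        "speaking_proportion": speech_s / max(total_s, 1e-9),
    }


if __name__ == "__main__":
    print(pause_features("sample_clip.wav"))  # hypothetical input file
```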
“Our findings highlight the potential to use vocal biomarkers as a novel approach to flagging deepfakes because they lack the telltale signs of life inherent in authentic content,” says Yan Fossat, senior vice president of Klick Labs and principal investigator of the study.
The research team, led by Fossat, conducted a study involving 49 participants with diverse backgrounds and accents. Deepfake models were trained on the collected voice samples to generate synthetic audio, which was then analyzed for its speech pause patterns.
The findings revealed that machine learning models were able to differentiate between authentic and deepfake audio with an accuracy of around 80 percent.
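The article does not describe the models the researchers used. As a rough illustration of this classification step only, the sketch below trains a generic off-the-shelf classifier (a scikit-learn random forest, chosen for demonstration) on placeholder timing features and reports held-out accuracy; the data, model choice, and split are all assumptions.

```python
# Minimal sketch of the classification step, not Klick Labs' actual model:
# assumes one row per clip of timing features (e.g. from pause_features()
# above) and a binary label (1 = human, 0 = synthetic).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Placeholder data standing in for real feature tables.
rng = np.random.default_rng(0)
X = rng.normal(size=(98, 4))      # pause/speech timing features per clip
y = rng.integers(0, 2, size=98)   # 1 = human, 0 = deepfake

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```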
Earlier this year, Pindrop Security partnered with voice cloning firm Respeecher to promote the ethical use of generative AI. Pindrop’s biometric technology analyzes each audio stream to verify if it originates from a real human voice. The company asserts that its software can detect synthetic voices with over 99 percent accuracy.
Article Topics
biometrics | deepfake detection | deepfakes | Klick Health | synthetic voice | voice analysis