Google patents method of matching voices to speakers’ faces in video

Jul 5, 2018, 12:05 pm EDT | Chris Burt

Categories Biometric R&D | Biometrics News

A patent filed by Google for an automated method of matching faces to voices in videos has been published by the World Intellectual Property Organization.

The patent, originally filed in April 2017, describes a computer-implemented method for speaker diarization. A convolutional neural network is used to recognize faces, and a machine learning model is applied to segments of speech to detect different speakers. Wikipedia describes speaker diarization as the process of partitioning an audio input stream into homogeneous segments according to speaker identity.

“The content system detects speech sounds in the audio track of the video, and clusters these speech sounds by individual distinct voice,” inventors Sourish Chaudhuri and Kenneth Hoover write in the application. “The content system further identifies faces in the video, and clusters these faces by individual distinct faces. The content system correlates the identified voices and faces to match each voice to each face. By correlating voices with faces, the content system is able to provide captions that accurately represent on-screen and off-screen speakers.”
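The correlation step the inventors describe can be illustrated with a small sketch. This is not the method claimed in the patent, whose details are not given here. It assumes that voice clustering and face clustering have already produced labeled time intervals, and it pairs each voice with the face that is on screen for the most time while that voice is speaking. The function names, the tuple layout, and the `min_overlap` threshold are all hypothetical.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def match_voices_to_faces(voice_segments, face_segments, min_overlap=0.5):
    """Pair each voice cluster with the face cluster it co-occurs with most.

    Both inputs are lists of (cluster_id, start_sec, end_sec) tuples
    (an assumed format). Returns a dict voice_id -> face_id, with None
    for voices that never overlap a face for at least min_overlap
    seconds in total; those are treated here as off-screen speakers.
    """
    # Accumulate total co-occurrence time for every (voice, face) pair.
    totals = defaultdict(float)
    for v_id, v_start, v_end in voice_segments:
        for f_id, f_start, f_end in face_segments:
            totals[(v_id, f_id)] += overlap(v_start, v_end, f_start, f_end)

    matches = {}
    for v_id in sorted({v for v, _, _ in voice_segments}):
        candidates = {f: t for (v, f), t in totals.items() if v == v_id}
        best = max(candidates, key=candidates.get, default=None)
        if best is not None and candidates[best] >= min_overlap:
            matches[v_id] = best
        else:
            matches[v_id] = None  # no face on screen long enough
    return matches


# Made-up example data: three voices, two faces.
voice_segments = [
    ("voice_A", 0.0, 4.0),
    ("voice_B", 4.0, 7.0),
    ("voice_A", 7.0, 9.0),
    ("voice_C", 9.0, 11.0),  # speaks while no face is visible
]
face_segments = [
    ("face_1", 0.0, 4.5),
    ("face_2", 3.5, 7.0),
    ("face_1", 7.0, 9.0),
]

result = match_voices_to_faces(voice_segments, face_segments)
print(result)
```

In this made-up example, voice_A pairs with face_1, voice_B pairs with face_2, and voice_C is left unmatched, which a captioning system could label as an off-screen speaker. A production system would more likely use visual cues such as lip movement than raw on-screen time, since several faces are often visible at once.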

Google researchers also published a paper earlier this year detailing an audio-visual method for using AI to separate speech from different individuals, mimicking the “cocktail party effect.”

Article Topics

biometrics | facial recognition | Google | patent | voice recognition
