Researchers develop “soft biometrics” from facial expressions to detect deepfakes
Researchers at the University of California, Berkeley and the University of Southern California have developed a way to detect deepfake videos by forensically identifying subtle characteristics of an individual’s speech and examining a video to see whether those characteristics are present.
The research paper, “Protecting World Leaders Against Deep Fakes,” was authored by UC Berkeley computer science graduate student Shruti Agarwal with her thesis advisor Hany Farid and a team from USC and the USC Institute for Creative Technologies, and was published by the Computer Vision Foundation. The technique, which was found to determine whether videos were fake or real with accuracy between 92 and 96 percent, was presented at the Computer Vision and Pattern Recognition conference in Long Beach, CA. It applies to “face swap” and “lip-sync” deepfake methods, which the USC computer scientists use to create videos for research purposes.
The researchers used the OpenFace2 facial behavior analysis toolkit to detect small facial tics such as raised brows, nose wrinkles, jaw movement, and pressed lips, and then created what the team calls “soft biometric” models for facial expressions with the data. Analyzing video of five major U.S. political figures, the researchers found that each has distinct mannerisms when speaking.
“We showed that the correlations between facial expressions and head movements can be used to distinguish a person from other people as well as deep-fake videos of them,” the report authors write. They also tested how the technique holds up under compression, across video clip lengths, and across the contexts in which the speech takes place. They found it more robust against compression than pixel-based detection techniques, but detection success is limited when speakers appear in different speech contexts, such as an informal setting rather than a delivery of prepared remarks. A larger and more diverse set of training videos may mitigate this limitation.
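To make the idea of correlating facial expressions with head movements concrete, here is a minimal Python sketch. It is an illustration only, not the researchers' implementation: the signal names (`brow_raise`, `head_pitch`, `lip_press`), the sample values, and the helper functions are all hypothetical, and a real pipeline would take its per-frame measurements from a tool such as OpenFace2 and feed the resulting correlations into a trained classifier.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_signature(tracks):
    """Map {signal name: per-frame values} to {(name_a, name_b): correlation}."""
    names = sorted(tracks)
    return {(a, b): pearson(tracks[a], tracks[b]) for a, b in combinations(names, 2)}


# Hypothetical per-frame measurements for one short clip (made-up numbers).
clip = {
    "brow_raise": [0.1, 0.4, 0.8, 0.5, 0.2],
    "head_pitch": [0.0, 0.3, 0.9, 0.4, 0.1],
    "lip_press": [0.9, 0.6, 0.1, 0.5, 0.8],
}
signature = correlation_signature(clip)
for pair, value in signature.items():
    print(pair, round(value, 3))
```

In this toy clip the brow and head signals rise and fall together while the lip signal moves in the opposite direction. The premise of the article is that such pairwise patterns differ from speaker to speaker, so a clip whose pattern does not match the speaker's usual one is a candidate fake.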