Everything you need to know about deepfake detection – for now

EAB workshop peels back layers of AI deepfake landscape that pits fraud against defense

Deepfake technology is advancing at a rapid pace. Our laws and social practices are not keeping up. Diverse stakeholders, from regulators to artists, bring a broad spectrum of concerns to the table. Public awareness is still relatively low, and more education is needed. The terminology at play is imprecise and outdated. Our collective weakness at recognizing fake media and applying critical thinking can lead to life-changing consequences for those who are victimized. Meanwhile, potential threats to election results can seem overblown – but as fake content spreads at speed through social networks, it can erode the fabric of social trust and throw truth into a chaos of uncertainty.

What can we do? At a recent deepfake workshop hosted by the European Association for Biometrics (EAB), a group of experts weighed in on how legislation, regulation, education and technology can be combined for effective protection against the emerging deepfake threat. The overarching question is of the utmost gravity: when it comes down to it, what is reality worth to us?

Deepfake generation and detection in cyclical ‘rat race’: Peter Eisert  

For Peter Eisert, head of Vision & Imaging Technologies and department chair of Visual Computing at Humboldt University, the generation and detection of deepfakes are caught in a kind of repeating loop. The proliferation of easy-to-access deepfake tools has led to continual refinement of the fakes themselves. Eisert uses the image of a hamster wheel for the ongoing effort to develop AI that can reliably detect these increasingly sophisticated deepfakes.

“We have seen a tremendous increase in the quality of the deepfake,” Eisert says. In the rat race between fraud-purposed AI and defensive AI, detection strategies must evolve in tandem with attack patterns.

Eisert outlines various types of current face deepfakes. Face swaps put someone’s face on someone else’s body. Face reenactment manipulates facial expressions or pose. Fully synthetic deepfakes can be created with a GAN or diffusion model. Methods continue to evolve: Eisert points to “Gaussian Splatting” – a new technique that yields high-resolution facial deepfakes. Each new iteration exhibits different types of artifacts that appear in different places and in different contexts.

If deepfake detection is to keep pace with generation, it must evolve in step. Many current deepfake detectors work frame by frame. But Eisert says they should also be looking at temporal effects and “inconsistency over time” – semantic and temporal information in the content, such as heart rate or the flow of blood in the face – as potential indicators.
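
As a rough, hypothetical sketch of that idea (not a detector presented at the workshop), a temporal check might look at how smoothly per-frame face features evolve; the source of the embeddings and the threshold below are assumptions made purely for illustration.

```python
# Minimal sketch: score a clip by how erratically per-frame face embeddings
# change over time. Genuine footage tends to vary smoothly; spliced or
# reenacted frames can jump around. The embeddings are assumed to come from
# some face encoder (not specified here), and the threshold is a placeholder.
import numpy as np

def temporal_inconsistency_score(frame_embeddings: np.ndarray) -> float:
    """frame_embeddings: array of shape (num_frames, dim), one row per frame."""
    norms = np.linalg.norm(frame_embeddings, axis=1, keepdims=True)
    unit = frame_embeddings / np.clip(norms, 1e-8, None)
    # Cosine distance between consecutive frames.
    step_dist = 1.0 - np.sum(unit[1:] * unit[:-1], axis=1)
    # Erratic frame-to-frame jumps raise the score.
    return float(np.std(step_dist))

def looks_temporally_inconsistent(frame_embeddings: np.ndarray, threshold: float = 0.05) -> bool:
    # A real detector would learn this threshold from data rather than hard-code it.
    return temporal_inconsistency_score(frame_embeddings) > threshold
```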

A diverse data set is also important. If data sets feature too many easy-to-spot artifacts, AI models will latch onto those artifacts and won’t generalize well. “We need high quality deepfake data to train robust detectors,” Eisert says, to “make it more difficult for new attackers to find the holes in your latent space that are not covered by your detector.”

Training strategies such as AltFreezing or Real Forensics offer different ways into the problem of how to train detectors to pay attention to temporal features. There are further possibilities: facial expression parameters and avatar fingerprinting – a user ID technology developed by NVIDIA – which uses facial dynamics for identity verification.

The key takeaways from Eisert are that temporal consistency is still not fully exploited for detection, and high quality data sets with a broad variety of data are needed for training.

‘It’s not easy for the human eye to detect deepfakes’: Ann-Kathrin Freiberg

BioID’s Ann-Kathrin Freiberg, who recently gave an EAB lunch talk on deepfakes, brings an industry perspective to the workshop. She emphasizes many of the same points as the previous talk: deepfakes are used for criminal activities such as phishing, CEO video call scams and romance scams, and above all for pornography, which accounts for more than 90 percent of deepfake content.

Deepfake tech is also a threat to democracy. Deepfake disinformation campaigns resulting in election interference can spread quickly, with content circulating and garnering thousands of views within hours. Even if a video has been manipulated, says Freiberg, “lots of people have already seen it.”

Unfortunately, people are not innately well equipped to detect deepfake media; the human eye still does a pretty poor job, and it will only become more difficult as the technology to create deepfakes improves. Furthermore, some appear not to mind if certain public figures are not real. Freiberg points to the Instagram sensation Aitana Lopez, a deepfake model (or “virtual soul”) who has 329,000 followers and brings in thousands of euros a month for the Spanish agency that created her. “People don’t really care that she’s AI-generated,” Freiberg says.

To curb both individual threats like identity theft and reputational damage, and societal threats like election meddling, deepfake detection methods must stay up-to-date. It is AI versus AI in the deepfake arena, but the matchup is not merely gladiatorial – and not to be won solely by algorithm. Freiberg believes the key to effective deepfake detection is a holistic approach that combines media literacy (an awareness of what we share and how) with regulation such as the EU AI Act, and technical support solutions such as watermarking and camera source analysis.

‘The circle is closing here’: Mateusz Łabuz

Detecting deepfakes means first defining them. So says Mateusz Łabuz of the Chemnitz University of Technology. Łabuz distinguishes between the technical, typological, effectual and subjective aspects of deepfakes. All of these figure into the holistic approach needed to fight them.

Łabuz’s talk leans away from “technical fixation” to take in the entire security ecosystem. Technology needs the right framing, he says. That means understanding not just the scale of the problem, but also its nature.

For instance, the issue of deepfake porn is an issue of women’s safety, since fully 99 percent of deepfake porn is of women. Women who have been victims of deepfakes suffer long-term physical and psychological consequences.

Combine that with the fact that 98 percent of deepfake video is pornographic, and deepfakes emerge as a major threat specifically to women.

On the flip side, there is still room for legitimate uses of deepfake technology, which calls for a proportionate response. And even minimal interference from deepfake attacks can have consequences that reverberate across the security ecosystem, from fueling social and political tensions to undermining trust in media and information.

Regulations, says Łabuz, often fall short of adequately protecting users. He also discusses protections such as watermarking, biometric hashing and other techniques that can be used to authenticate media. Countermeasures must go beyond the legal plane: investment in human capital and tools is necessary. Technology must be matched with awareness, regulation, resilience and ways to dampen amplification, and effective enforcement must be demonstrable.
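
As a loose illustration of hash-based media authentication of the kind Łabuz alludes to (a simplified sketch, not any specific deployed system), a publisher could register a perceptual hash of an original image and later check circulating copies against it; the aHash scheme, file names and distance threshold below are assumptions.

```python
# Sketch of hash-based media authentication: register a compact perceptual hash
# of the original, then measure how far a circulating copy drifts from it.
import numpy as np
from PIL import Image

def average_hash(path: str, hash_size: int = 8) -> np.ndarray:
    """Downscale to hash_size x hash_size grayscale and threshold at the mean brightness."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    return int(np.count_nonzero(a != b))

# Hypothetical usage: a small distance suggests the copy matches the registered
# original; a large one suggests it was altered or is a different image.
registered = average_hash("original_press_photo.jpg")
suspect = average_hash("circulating_copy.jpg")
print("suspicious" if hamming_distance(registered, suspect) > 5 else "matches registered image")
```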

Łabuz says “the circle is closing here.” Society must move faster to make sure AI-generated deepfake content is always clearly disclosed, adjust legal systems to reflect new threats and needs (such as helping victims of deepfake porn), and generally strengthen social resilience and public awareness. “Do victims of deepfakes actually know what they can do in that situation? I see huge weaknesses in how to react at critical moments.”

In the end, Łabuz rests his holistic approach on three pillars, which echo many of the statements by Eisert and Freiberg: regulation and legal measures, technology and detection systems, and increasing social resilience. These are the ways out of our deepfake dilemma, and it will take dialogue, flexibility and breadth to find solutions.

“It is not about hindering innovation,” he says. “It’s about restoring the basic trust in technology in our society.”

‘Design, detection and then explanation’: Gian Luca Marcialis

Gian Luca Marcialis runs the sAIfer Lab’s Biometric Unit. The lab is working on approaches to passive deepfake detection, and on a deepfake detection taxonomy that classifies a number of detection approaches: general network/undirected detection, visual-artifact-based, temporal-consistency-based, biological signals and camera/GAN fingerprints. (In other words, extra fingers, trouble blinking, heart rate, and digital noise.) Deep learning and convolutional neural networks offer another way to strengthen detection.
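
To make the camera/GAN fingerprints category concrete, here is a heavily simplified, hypothetical sketch in the spirit of noise-residual checks; real pipelines are far more elaborate, and the smoothing filter and correlation test below are assumptions rather than the lab's method.

```python
# Simplified sketch of a camera-fingerprint check: estimate the sensor noise
# residual of an image and correlate it with a reference fingerprint built from
# known-genuine photos. Synthetic images typically lack that fingerprint.
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(image: np.ndarray) -> np.ndarray:
    """Difference between the image and a denoised copy, normalized to zero mean and unit variance."""
    img = image.astype(np.float32)
    residual = img - gaussian_filter(img, sigma=1.5)
    return (residual - residual.mean()) / (residual.std() + 1e-8)

def fingerprint_correlation(image: np.ndarray, camera_fingerprint: np.ndarray) -> float:
    """Average element-wise correlation between the image's residual and the reference fingerprint."""
    return float(np.mean(noise_residual(image) * camera_fingerprint))

# Hypothetical usage: photos genuinely taken with the claimed camera should
# correlate noticeably with its fingerprint; GAN output usually will not.
```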

Marcialis’ sAIfer Lab is engaged in ongoing work to map the technical processes that underpin deepfake detection methods. It reflects the larger community’s effort to not just detect deepfakes, but to define and distinguish what they are.

‘You can’t see or touch the fakeness of audio’: Jennifer Williams

“Speech and audio processing used to require an entire PhD in electrical engineering. Now, it can be anyone who creates these deepfakes.” So says Jennifer Williams, assistant professor at the University of Southampton, covering the aural angle of the deepfake debacle. Audio deepfakes exploit bias in human hearing, which lets sound “play tricks on us.” Speech is biometric data that represents a unique identity. If it is compromised, the results can range from financial fraud to a complete breakdown of the capacity to believe what you hear.

“For some people,” Williams says, “it’s really a problem of reality itself.”

Voice synthesis technology has progressed quickly. In 2016, Merlin, the cutting-edge toolkit for neural statistical parametric speech synthesis (SPSS), still sounded like a stereotypical robot. Now we have deepfake AI robocalls and virtual public affairs officials making statements on behalf of national governments. Positive use cases have driven seismic change in voice technology, which is then inevitably exploited for criminal ends.

Williams says audio deepfake attacks come in a variety of forms:

Record-and-replay attacks are where “the person speaking is in fact the correct person but the speech has been captured and replayed so that the words might be out of context.” It can be achieved with a simple source recording of a person’s voice. Recordings can be spliced into other snippets of real or synthetic speech to change context. (Think about what could result from a fraudster having nothing more than a recording of you saying “yes.”)

Voice conversion “requires a source speaker to say exactly the right words.” It can then convert the voice of the source speaker into that of any number of target speakers. The content of the speech stays the same but the voice changes.

Text-to-speech synthesis “puts words in someone’s mouth that they’ve never said before” via speaker embedding. This requires sophisticated machine learning and significant amounts of training data, and allows control of things like pitch, prosody and emotional timbre.

Partial deepfakes make small, simple edits to mix real and fake media, combining multiple types of editing and machine learning capabilities.

Artifacts in deepfake audio are most easily detected at higher frequencies, often outside the range of human hearing. They can show up as inaudible pops, buzzing noises from sporadic phase mismatch, and problems with enunciation or the cadence of breath. They are not consistently distributed in time.
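
As a loose illustration of that point (a sketch of the general idea, not Williams’ system), a detector might measure how much of a clip’s energy sits in bands near or above the top of natural speech; the STFT parameters, cutoff and baseline comparison below are assumptions.

```python
# Rough sketch of a high-frequency anomaly check on audio: synthesis artifacts
# often leave energy in bands listeners barely perceive. The cutoff and window
# length are illustrative choices, not tuned values.
import numpy as np
from scipy.signal import stft

def high_band_energy_ratio(samples: np.ndarray, sample_rate: int, cutoff_hz: float = 8000.0) -> float:
    """Fraction of total spectral energy above cutoff_hz across the whole clip."""
    freqs, _, spectrum = stft(samples, fs=sample_rate, nperseg=1024)
    power = np.abs(spectrum) ** 2
    high = power[freqs >= cutoff_hz].sum()
    return float(high / (power.sum() + 1e-12))

# Hypothetical usage: compare a suspect clip's ratio against a baseline measured
# on genuine recordings from the same channel (phone line, codec, microphone).
```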

Legally, there are socio-technical challenges in the justice system, and disagreements over many of the terms used to discuss audio deepfakes – terms like “synthetic” and “authentic.” A shifting technical landscape pays no heed to the slower adaptations of regulators.

Williams and her team at the University of Southampton have built a system for deepfake detection called SAFE and Sound (the Southampton Audio Forensic Evaluator), which requires only one second of audio to provide highly accurate and explainable model decisions. It incorporates a human perception model, a vocal tract model, an emotive speech model, noise robustness, the acoustic environment and a high frequency anomaly detector. Design-wise, it aims to be fast, easy to use, scalable for large amounts of data, and applicable across use-cases.

‘Now is a key moment; ChatGPT is only two years old’: Benoit Fauve

Benoit Fauve of voice biometrics firm ValidSoft also looks at some of the larger considerations around audio deepfakes. He traces the evolution of audio deepfakes from early spoofing methods through the beginning of the deepfake era and voice cloning in the mid-2010s. The term “deepfake” was coined on Reddit in 2017. Since then, it has been a steady upward curve of development and investment in deepfakes and how to fight them.

The current moment, in which generative AI has enabled an audio deepfake boom, is key, Fauve says. But, like Williams, he emphasizes that some of the language must be untangled to achieve the strongest defensive posture, and that literacy about deepfakes is a key part of the equation.

‘We are here to build an ethical, safe system’: Luke Arrigoni

Luke Arrigoni, who founded digital likeness protection technology startup Loti AI, focuses his business on online licensing and protection for public figures. He says Loti Watchtower combines facial recognition defense (takedowns of unauthorized deepfakes, etc.) and offense – wherein, for instance, contracts for AI representations of talent to appear in advertisements are designed to the advantage of the talent.

But if there is an overarching message from the workshop, it is that deepfakes will soon affect everyone, if they don’t already. The technology is progressing, and detection methods are scrambling to keep up. The hamster wheel keeps turning; the deepfake rat race continues.
