New Microsoft benchmark for evaluating deepfake detection prioritizes breadth

Open source dataset project sees collaboration with Northwestern, WITNESS
Juan M. Lavista Ferres, corporate vice president and chief data scientist at Microsoft’s AI for Good Lab, has announced the release of a “large-scale, open-source benchmark for evaluating deepfake and manipulated media detection systems.”

Writing on LinkedIn, he says the initiative is a collaborative effort between the lab, Northwestern University’s Security and AI Lab, and tech-focused human rights nonprofit WITNESS. In Northwestern University’s words, it is “intended to help evaluate and improve algorithms to detect AI-generated audio, video, and image content.”

Lavista Ferres says it “introduces a rigorously curated dataset designed to support robust, real-world evaluation of multimodal detection tools,” intended to provide a shared foundation for empirical comparison of detection methods.

The dataset is licensed for evaluation only; it is not intended for training or commercial use.

The dataset includes more than 50,000 samples of real, AI-generated and manipulated audio-visual content – deepfakes and synthetic media – annotated with metadata drawn from real-world use cases. It also incorporates adversarial attacks so that model robustness can be tested.

Lavista Ferres says the benchmark is intended to support research in multimodal forensics, adversarial robustness and detection in real-world media ecosystems, and invites the research community to “explore the dataset and help maintain its relevance by contributing new data and evaluation protocols over time.”

Northwestern offers more background on the project, and how it is driven by advances in generative technologies. “In the past few years, a new paradigm has emerged with the diffusion architecture, showing impressive achievements in audio, image and video generation,” it says. “Previous approaches to detection are now obsolete and the detection scene must re-invent itself.”

The summary from Northwestern notes that, historically, the evaluation of deepfake models was based on large datasets opened up during deepfake detection challenges. “These datasets typically had a lot of depth but almost no breadth. They were suitable for the previous era (the GAN era) but are not up to the challenge brought by the new generative AI landscape and the evolving type of harm it brings: scams, non-consensual intimate image generation, disinformation, etc.”

“We argue that depth is less important than breadth and we propose the creation of an evaluation set that contains small samples of as many generators and ‘in the wild’ cases as possible – rather than millions of samples from a few generators.”
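The breadth-first evaluation the researchers describe can be illustrated with a short sketch: instead of one aggregate score over millions of samples from a few generators, a detector is scored per generator across many small sample sets. The function and data layout below are hypothetical, not the benchmark's actual API.

```python
from collections import defaultdict

def per_generator_accuracy(samples):
    """Score a detector separately for each content generator.

    `samples` is a list of (generator, true_label, predicted_label)
    tuples. This structure is illustrative only; the real benchmark
    defines its own formats and evaluation protocols.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for generator, label, prediction in samples:
        total[generator] += 1
        if label == prediction:
            correct[generator] += 1
    # Reporting per-generator scores exposes blind spots that a
    # single pooled accuracy over a deep dataset would hide.
    return {g: correct[g] / total[g] for g in total}

# A breadth-oriented report: many sources, few samples each.
samples = [
    ("gan_a", "fake", "fake"),
    ("gan_a", "fake", "real"),
    ("diffusion_b", "fake", "fake"),
    ("diffusion_b", "fake", "fake"),
    ("camera", "real", "real"),
]
print(per_generator_accuracy(samples))
```

Under this kind of reporting, a detector that only learned GAN-era artifacts would score well on "gan_a" but poorly on newer diffusion generators, which is exactly the failure mode the breadth argument targets.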
