Big jump in public face biometric dataset size

A large team of researchers, overwhelmingly from China, says it has created a new million-scale facial recognition benchmark. In a new paper, they claim to have built an automatically cleaned biometric dataset of 2 million identities among 42 million facial images.

The uncurated dataset holds 4 million celebrity identities among 260 million images. The newly proposed benchmark is called WebFace260M, and it is being described as the largest public face biometric dataset.

That is a significant differentiator. Public researchers have decried their disadvantage in dataset resources compared to private companies, particularly Facebook and Google. For all intents and purposes, both have unlimited image datasets.

The research paper says Google taps 200 million images of 8 million identities when training FaceNet. Facebook has 500 million faces among 10 million identities.

Dataset size is a potent accelerator of biometrics innovation, and public researchers are worried about being shut out of the race.

The WebFace260M researchers, from Tsinghua University, Imperial College London and a Chinese startup, XForwardAI, claim that their dataset “shows enormous potential on standard, masked and unbiased face recognition scenarios.” It was cleaned with an AI tool they developed, Cleaning Automatically by Self-Training.

Jack Clark, co-founder of AI safety and research firm Anthropic, writing in his blog Import AI, says, “Models trained on the resulting dataset are pretty good.”

Clark also makes the point that facial recognition – especially masked facial recognition – is important to government surveillance agencies. Results like those of WebFace260M influence decisions about “how to surveil a population and how much budget to set aside for said surveillance.”

A dataset this size has more proximate dangers, of course. With great volumes could come privacy-restricted images, long a problem for datasets created by academics and businesses alike.

A site has been posted with project history and updated details.
