MS Celeb and other facial biometrics datasets taken down


Several public facial recognition datasets have been deleted, including a Microsoft database of 10 million faces said to have been the world's largest dataset for biometric research and training, the Financial Times reports.

The MS Celeb database was published in 2016, and has been used by a wide range of facial recognition researchers, including those from militaries and high-profile biometrics companies like SenseTime and Megvii. Images of nearly 100,000 individuals were scraped from the internet using search engines and videos under Creative Commons license terms, but consent was not sought from the individuals pictured.

“The site was intended for academic purposes,” Microsoft said in a statement. “It was run by an employee that is no longer with Microsoft and has since been removed.”

Datasets hosted by Stanford and Duke Universities have also been taken down, according to the FT, which reported on them and the Microsoft dataset in April. The Duke MTMC surveillance dataset and Stanford’s Brainwash dataset, taken from a livestreaming camera in a San Francisco café, have both been taken offline. Duke did not respond to the FT’s request for comment, while Stanford said one of the authors of a study Brainwash was used for requested the dataset’s removal.

The Megapixels project by researcher Adam Harvey documented all three datasets, along with the UnConstrained College Students (UCCS) dataset taken at the University of Colorado and the Oxford Town Centre dataset. The UCCS dataset has been temporarily taken down because metadata was exposed in the FT article, while the Town Centre dataset remains active, according to the site. Harvey says Microsoft exploited the notion of celebrity, and included people who were vocal opponents of the technology’s development in its dataset.

The professor who made the UCCS dataset available says that he waited five years after the images were collected before releasing them, to protect the privacy of those pictured, but he has faced criticism from a University of Denver law professor, the Denver Post reports.

Use of the MS Celeb dataset has been cited in research papers by numerous facial recognition companies, including Microsoft itself.

“It’s indicative of Microsoft’s inability to hold their own researchers to integrity and probity that this was not torpedoed before it left the building,” technology writer Adam Greenfield, who was included in the MS Celeb dataset, told FT. “To me, it is indicative of a profound misunderstanding of what privacy is.”

Microsoft may also have violated GDPR by leaving the dataset up after the privacy regulation went into effect, FT reports.



