Computer vision modelers take too much for granted. Data sets hold bias surprises

A pair of U.S. researchers say unsupervised computer vision models used in biometrics and other applications can learn nasty social biases from the way that people are portrayed on the internet, the source of large numbers of training images.

The scientists say they know this because they developed what they describe as the first systematic way to detect and quantify social bias, including bias related to skin tone, in unsupervised image models. They claim to have replicated eight of 15 human biases in their experiments.

The research has been posted on a preprint server by Ryan Steed of Carnegie Mellon University and Aylin Caliskan of George Washington University.

Statistically significant gender, racial, body size and intersectional biases were found in a pair of state-of-the-art image models, iGPT and SimCLRv2, that were pre-trained on ImageNet.
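The article does not spell out how the bias was measured. As a rough illustration only, one common way to quantify association bias in learned embeddings is an embedding association test: compare how close two sets of target embeddings sit to two sets of attribute embeddings, then check significance with a permutation test. The sketch below assumes image embeddings are already available as NumPy arrays; the function names and the synthetic data are hypothetical and are not taken from the researchers' code.

```python
import numpy as np

def _unit(v):
    # Normalize each row to unit length so dot products are cosine similarities.
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def association(W, A, B):
    # For each row w of W: mean cosine similarity to A minus mean cosine similarity to B.
    W, A, B = _unit(W), _unit(A), _unit(B)
    return (W @ A.T).mean(axis=1) - (W @ B.T).mean(axis=1)

def effect_size(X, Y, A, B):
    # Difference in mean association between target sets X and Y,
    # scaled by the standard deviation over all targets.
    # (ddof=1 is a choice made for this sketch.)
    sx, sy = association(X, A, B), association(Y, A, B)
    pooled = np.concatenate([sx, sy])
    return (sx.mean() - sy.mean()) / pooled.std(ddof=1)

def permutation_p_value(X, Y, A, B, n_perm=2000, seed=0):
    # One-sided test: how often does a random re-split of the targets
    # produce a larger test statistic than the observed split?
    rng = np.random.default_rng(seed)
    s = np.concatenate([association(X, A, B), association(Y, A, B)])
    n = len(X)
    observed = s[:n].sum() - s[n:].sum()
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(s))
        stat = s[perm[:n]].sum() - s[perm[n:]].sum()
        count += stat > observed
    return count / n_perm

if __name__ == "__main__":
    # Synthetic stand-ins for embeddings: X clusters with A, Y clusters with B.
    rng = np.random.default_rng(0)
    dim = 16
    a_dir = np.zeros(dim); a_dir[0] = 1.0
    b_dir = np.zeros(dim); b_dir[1] = 1.0
    A = a_dir + 0.1 * rng.normal(size=(8, dim))
    B = b_dir + 0.1 * rng.normal(size=(8, dim))
    X = a_dir + 0.1 * rng.normal(size=(8, dim))
    Y = b_dir + 0.1 * rng.normal(size=(8, dim))
    print("effect size:", effect_size(X, Y, A, B))
    print("p-value:", permutation_p_value(X, Y, A, B))
```

In a real audit, the rows of `X`, `Y`, `A` and `B` would be embeddings that a pre-trained model produces for curated image sets, for example two social groups as targets and pleasant versus unpleasant imagery as attributes. A large positive effect size with a small p-value would indicate that the first target group sits closer to the first attribute set than the second group does.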

As noted by VentureBeat, ImageNet is a popular image data set scraped from web pages. It also is “problematic,” according to the corporate-finance publisher.

Business magazine Fast Company looked at ImageNet’s 3,000 categories for people and found “bad person,” “wimp,” “drug addict” and the like.

The authors concluded that developers have been lulled into complacency when it comes to training vision models for facial recognition and other tasks because of advances in natural language processing. Garbage data exists in image data sets, and systems are not filtering it or even alerting data scientists and developers to its presence.

The paper warns the community that “pre-trained models may embed all types of harmful human biases from the way people are portrayed in training data.” Choices made in model design “determine whether and how those biases are propagated into harms downstream.”
