Computer vision modelers take too much for granted. Data sets hold bias surprises

A pair of U.S. researchers say unsupervised computer vision models used in biometrics and other applications can learn nasty social biases from the way that people are portrayed on the internet, the source of large numbers of training images.

The scientists say they know this because they developed what they describe as the first systematic method for detecting and quantifying social bias, including bias tied to skin tone, in unsupervised image models. They report replicating eight of 15 human biases in their experiments.

The research has been posted on a preprint server by Ryan Steed of Carnegie Mellon University and Aylin Caliskan of George Washington University.

Statistically significant gender, racial, body size and intersectional biases were found in a pair of state-of-the-art image models, iGPT and SimCLRv2, that were pre-trained on ImageNet.
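The article does not spell out how such bias is measured. One common approach for embedding models is an association test in the style of the Word Embedding Association Test (WEAT), which compares how close two sets of target embeddings sit to two sets of attribute embeddings. The Python sketch below assumes that form and is not the authors' code. The function names are hypothetical, and the vectors are synthetic stand-ins for embeddings that would in practice come from a model such as iGPT or SimCLRv2.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def effect_size(X, Y, A, B):
    """Difference in mean association of target sets X and Y,
    divided by the standard deviation over all targets."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return float((np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1))

def permutation_p_value(X, Y, A, B, n_perm=1000, seed=0):
    """One-sided p-value: share of random re-partitions of the targets whose
    test statistic exceeds the observed one."""
    rng = np.random.default_rng(seed)
    scores = np.array([association(w, A, B) for w in list(X) + list(Y)])
    n = len(X)
    observed = scores[:n].sum() - scores[n:].sum()
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(scores)
        if perm[:n].sum() - perm[n:].sum() > observed:
            hits += 1
    return hits / n_perm

# Synthetic demo (assumption): X clusters with A, Y clusters with B.
rng = np.random.default_rng(0)
dim = 16
dir_a, dir_b = np.eye(dim)[0], np.eye(dim)[1]

def noisy(center, count=8):
    """Draw `count` vectors scattered around `center`."""
    return [center + 0.1 * rng.standard_normal(dim) for _ in range(count)]

A, B = noisy(dir_a), noisy(dir_b)
X, Y = noisy(dir_a), noisy(dir_b)

d = effect_size(X, Y, A, B)
p = permutation_p_value(X, Y, A, B)
print(f"effect size: {d:.2f}, p-value: {p:.3f}")
```

A large positive effect size with a small p-value would indicate that the first target set is more strongly associated with the first attribute set than the second target set is. With real models, the choice of stimulus images largely determines what the test can and cannot show.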

As noted by VentureBeat, ImageNet is a popular image data set scraped from web pages. It is also “problematic,” according to the corporate-finance publisher.

Business magazine Fast Company looked at ImageNet’s 3,000 categories for people and found “bad person,” “wimp,” “drug addict” and the like.

The authors concluded that advances in natural language processing have lulled developers into complacency when training vision models for facial recognition and other tasks. Garbage data exists in image data sets, and systems are not filtering it or even alerting data scientists and developers to its presence.

The paper warns the community that “pre-trained models may embed all types of harmful human biases from the way people are portrayed in training data.” Choices made in model design “determine whether and how those biases are propagated into harms downstream.”
