NIST report tackles issue of ‘bias’ in facial biometrics
New research on demographic differentials in biometric facial recognition accuracy, or ‘bias,’ has been published by the U.S. National Institute of Standards and Technology (NIST), confirming a significant difference in the accuracy of some algorithms when matching women and people with darker skin, but also showing significant improvement over previous research.
The report on ‘Face Recognition Vendor Test (FRVT) Part 3: Demographic Effects’ refers to previous research by Joy Buolamwini and others indicating bias in facial biometrics, but suggests caution should be taken in drawing conclusions from such studies. The report does support concerns that most facial recognition technology has differing levels of accuracy matching different populations, however, which has been used as an argument against the technology’s use by law enforcement.
“While it is usually incorrect to make statements across algorithms, we found empirical evidence for the existence of demographic differentials in the majority of the face recognition algorithms we studied,” comments Patrick Grother, a NIST computer scientist and the report’s primary author. “While we do not explore what might cause these differentials, this data will be valuable to policymakers, developers and end users in thinking about the limitations and appropriate use of these algorithms.”
“There is a wide range of performance and there’s certainly work to be done,” Craig Watson, Image Group Leader at NIST told the Associated Press. “The main message is don’t try to generalize the results across all the technology. Know your use case, the algorithm that’s being used.”
Buolamwini told AP in an email that the study provides a reminder of the “consequential technical limitations” of the technology.
“While some biometric researchers and vendors have attempted to claim algorithmic bias is not an issue or has been overcome, this study provides a comprehensive rebuttal,” according to Buolamwini.
The report examines 189 algorithms from 99 developers, providing false positive and false negative rates for each.
“In a one-to-one search, a false negative might be merely an inconvenience — you can’t get into your phone, but the issue can usually be remediated by a second attempt,” Grother notes. “But a false positive in a one-to-many search puts an incorrect match on a list of candidates that warrant further scrutiny.”
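The distinction Grother draws can be made concrete with a toy sketch. This is an illustration only, not NIST's methodology: all similarity scores and the threshold below are invented, and real systems operate on high-dimensional face templates rather than bare numbers.

```python
# Illustrative only: how false negative and false positive rates fall out
# of comparing similarity scores against a decision threshold.
# All scores here are made-up numbers, not data from the NIST report.

def error_rates(genuine_scores, impostor_scores, threshold):
    """One-to-one verification: a genuine (same-person) pair scoring
    below the threshold is a false negative; an impostor
    (different-person) pair scoring at or above it is a false positive."""
    fn = sum(s < threshold for s in genuine_scores)
    fp = sum(s >= threshold for s in impostor_scores)
    return fn / len(genuine_scores), fp / len(impostor_scores)

genuine = [0.91, 0.88, 0.55, 0.93]   # same-person comparisons
impostor = [0.12, 0.35, 0.62, 0.08]  # different-person comparisons
fnr, fpr = error_rates(genuine, impostor, threshold=0.6)
print(f"FNR={fnr:.2f}  FPR={fpr:.2f}")  # FNR=0.25  FPR=0.25
```

In a one-to-many search the same threshold logic runs against every enrolled identity, so each impostor comparison that clears the threshold adds a wrong person to the candidate list — which is why, as Grother notes, false positives carry more weight in that setting.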
NIST says few studies have previously evaluated demographic effects in one-to-one matching systems, and none have done so for one-to-many systems.
The accuracy of the algorithms tested varies widely. Broad findings include higher false positive rates for Asian and African American faces relative to Caucasian faces, with U.S.-developed algorithms showing similarly elevated false positive rates for Asians, African Americans, and native groups. For algorithms developed in Asia there was no dramatic difference between Asian and Caucasian faces, suggesting that more diverse training data can produce more equitable outcomes. In one-to-many matching, higher rates of false positives were observed when matching African American females, but the most accurate algorithms were also the most equitable and did not show such high error rates.
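A demographic differential of the kind the report measures can be pictured as a per-group comparison of false positive rates at a fixed threshold. The sketch below uses invented numbers and hypothetical group labels purely to illustrate the idea; the report's actual figures come from NIST's own image sets.

```python
# Illustrative only: computing false positive rates per demographic group
# to expose a differential. Scores and group names are invented.

def fpr_by_group(impostor_scores_by_group, threshold):
    """False positive rate per group: the fraction of different-person
    comparisons whose similarity score clears the decision threshold."""
    return {
        group: sum(s >= threshold for s in scores) / len(scores)
        for group, scores in impostor_scores_by_group.items()
    }

scores = {
    "group_A": [0.10, 0.20, 0.65, 0.30, 0.15],
    "group_B": [0.70, 0.62, 0.40, 0.66, 0.20],
}
rates = fpr_by_group(scores, threshold=0.6)
# group_A: 1 of 5 clears the threshold (0.20); group_B: 3 of 5 (0.60) —
# a threefold differential at this threshold.
```

An algorithm that is both accurate and equitable, in these terms, keeps all per-group rates low and close together.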
A complete consideration of the appropriateness of using facial recognition must also consider the use case, NIST points out.
French data protection authority CNIL has published a contribution to the political discussion around the technology, reviewed by its commissioners. CNIL sets out a technical definition for what facial recognition is and what it is used for to provide clarity, and highlights the need for risk assessments and appropriate safeguards. The agency also reviews the more stringent requirements for facial recognition frameworks mandated through the combination of GDPR and other laws at the EU and national level, and clarifies its own position in advising and enforcing, but not determining, such frameworks.
The Biometrics Institute welcomed the NIST report, saying it shows facial recognition to be an extremely accurate but probabilistic technology. The organization urges its members to be aware of the algorithm they are using, and act responsibly according to the strengths and weaknesses of their biometric tools.
“Biometric technology can be an effective tool to assist in identification and verification in an array of use cases,” says Biometrics Institute Chief Executive Isabelle Moeller. “These range from the convenience of using your face to unlock your phone, to getting through passport control quicker, to the reassurance that a face can be found in a crowd far quicker with the assistance of technology than relying on a human alone. However, when we think of the word bias we tend to consider it as a pre-meditated, closed-minded and prejudicial human trait. It’s important to remember that technology cannot behave in this way. So-called bias in biometric systems may exist because the data provided to train the system is not sufficiently diverse. That is why in cases including law enforcement and counter-terrorism the human in the loop – to verify the algorithm’s findings – is often a critical aspect of using the technology.”
Grother will speak about bias and demographic differentials at the institute’s U.S. Congress on March 24, 2020, in Washington, D.C.