Researcher discusses facts related to media narratives about facial recognition ‘bias’
“Biometric performance is nuanced and cannot be reduced to generalizations like ‘bias’,” reads a LinkedIn article by Yevgeniy Sirotin, PhD, Senior Principal Scientist and Manager at SAIC, who argues that only meticulous, peer-reviewed research on biometric systems performed by third parties can deliver factual information about the performance of facial recognition and other biometric technologies. The ACLU, meanwhile, has published a study that feeds the very media narratives Sirotin is arguing against.
Sirotin’s article comes after a number of media outlets have written about facial recognition ‘bias’ and its negative impact, arguing the technology makes errors and does not work effectively across demographic groups. Sirotin is part of a research group that tests biometric technologies for the Department of Homeland Security Science and Technology Directorate (DHS S&T) and has looked into the public’s concerns. Their research shows that although facial recognition may not be perfect, “many current media narratives have misleading undertones,” he writes.
Throughout his LinkedIn article, Sirotin takes common media takes on facial recognition bias and presents facts related to each, drawing on scientific results from research performed by the SAIC Identity and Data Sciences Laboratory at the Maryland Test Facility (MdTF) and others. The statements he analyzes are “face recognition makes errors,” “face recognition is highly accurate for women of color,” “face recognition performance depends on image capture,” and “face recognition errors affect all demographic groups.”
Since 2014, the U.S. DHS has been testing biometric technology in different travel scenarios, so far analyzing some 2,000 unique individuals with different demographic profiles, he writes. In 2018, the ACLU and MIT Media Lab presented research claiming facial recognition was biased. According to the ACLU’s results, people of color accounted for 39 percent of the technology’s false matches, while a report from MIT Media Lab found it delivered more errors for women of color (47 percent).
The ACLU of Massachusetts announced this week that, in tests using Amazon’s Rekognition algorithm, 27 professional athletes were falsely matched to criminal mugshots in a police database, according to Business Insider. The organization says it wanted to highlight that facial recognition is prone to error and that law enforcement should not see it as a silver bullet for suspect identification.
Amazon reiterated its claim that the organization has not been using the algorithm properly, because the recommended similarity threshold for law enforcement use is 99 percent, not the 80 percent the ACLU used. With a 99 percent threshold, a candidate is only returned as a match if the algorithm’s similarity score for the comparison is at least 99 percent.
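As a rough illustration of that behavior, the following minimal Python sketch (with invented candidate names and scores, not Amazon’s API) shows how raising the threshold simply discards weaker candidate matches:

    # Minimal sketch (hypothetical scores, not Amazon's API) of how a similarity
    # threshold filters candidate matches returned by a face matcher.

    def filter_matches(candidates, threshold):
        """Keep only candidates whose similarity score meets the threshold."""
        return [(name, score) for name, score in candidates if score >= threshold]

    # Hypothetical comparison results for one probe image (scores in percent).
    candidates = [("mugshot_104", 83.5), ("mugshot_381", 91.2), ("mugshot_077", 99.4)]

    print(filter_matches(candidates, threshold=80.0))  # all three candidates pass
    print(filter_matches(candidates, threshold=99.0))  # only the strongest candidate remains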
“In real-world public safety and law enforcement scenarios, Amazon Rekognition is almost exclusively used to help narrow the field and allow humans to expeditiously review and consider options using their judgment (and not to make fully autonomous decisions),” reads a statement from Amazon.
A spokesperson added that “The ACLU is once again knowingly misusing and misrepresenting Amazon Rekognition to make headlines. When used with the recommended 99% confidence threshold and as one part of a human driven decision, facial recognition technology can be used for a long list of beneficial purposes, from assisting in the identification of criminals to helping find missing children to inhibiting human trafficking.”
Duron Harmon, a player for the New England Patriots of the National Football League and one of the 27 athletes falsely matched by Amazon’s facial recognition software, expressed his concern about the use of facial recognition in criminal investigations and the impact it may have in real-life situations.
“This technology is flawed,” he said according to the ACLU. “If it misidentified me, my teammates, and other professional athletes in an experiment, imagine the real-life impact of false matches. This technology should not be used by the government without protections.”
The news comes amid the ACLU’s lobbying for a ban on the use of facial recognition by government agencies in Massachusetts.
Sirotin says the 2018 figures presented by the ACLU and MIT Media Lab were unexpected, which motivated his research group to publish its own facial recognition observations in 2019. Discussions framing facial recognition as ‘biased’ are “highly problematic,” he says, because they affect the public’s trust in biometric technology and could “shut down discussion around a nuanced issue.”
While Amazon’s Rekognition has many critics, it is far from the only facial recognition algorithm on the market.
Sirotin explains that “bias depends on the full biometric system, and how it is used,” because “an algorithm could have an inherent bias if it had insufficiently diverse training data.”
A number of U.S. cities and states have either banned facial recognition or are considering banning its use, especially by law enforcement. With non-profit groups such as Fight for the Future campaigning against the technology through its banfacialrecognition.com website, which tries to associate it with nuclear and biological weapons, Sirotin notes it is important to double-check the performance figures behind claims of ‘bias’ to ensure those claims are pertinent.
Sirotin says facial recognition does make errors, and the errors fall into two categories depending on the type of comparison. “Recognition involves a comparison of at least two images of an individual and makes two types of errors,” he writes. Two images of the same person should match, while images of two different people should not. When the technology declares a match between images of different people, that is a false positive; a failure to match two images of the same individual is a false negative.
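The following minimal Python sketch, using made-up similarity scores and labels rather than any real test data, shows how those two error types are counted once a match threshold is chosen:

    # Minimal sketch of the two error types described above, with invented scores.
    # Each trial pairs a similarity score with whether the two compared images
    # really show the same person.

    trials = [
        (0.97, True),   # genuine pair, high score: correct match
        (0.42, True),   # genuine pair, low score: false negative at a 0.90 threshold
        (0.95, False),  # impostor pair, high score: false positive at a 0.90 threshold
        (0.10, False),  # impostor pair, low score: correct non-match
    ]

    def error_rates(trials, threshold):
        false_pos = sum(1 for score, same in trials if not same and score >= threshold)
        false_neg = sum(1 for score, same in trials if same and score < threshold)
        impostor_pairs = sum(1 for _, same in trials if not same)
        genuine_pairs = sum(1 for _, same in trials if same)
        return false_pos / impostor_pairs, false_neg / genuine_pairs

    fpr, fnr = error_rates(trials, threshold=0.90)
    print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")

Moving the threshold trades one error type for the other: a stricter threshold produces fewer false positives but more false negatives, which is why the two rates have to be reported together.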
A 2018 study by MIT Media Lab called Gender Shades states that biometric gender classification is only 66 percent accurate for women of color. However, Sirotin says, two key points about the research are that only three algorithms were examined and that it analyzed gender classification rather than recognition. The latter matters “(b)ecause classification involves comparing a single image to a class representation learned by a model, not a comparison of two images along a scale learned by a model, as for recognition.”
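To make that distinction concrete, here is a minimal Python sketch in which the embed() function and the class weights are hypothetical stand-ins for a trained model, not any vendor’s algorithm: classification scores a single image against class representations learned in training, while recognition compares two images along a similarity scale.

    import numpy as np

    # Hypothetical embedding; a real system would run a neural network.
    def embed(image):
        v = np.asarray(image, dtype=float)
        return v / np.linalg.norm(v)

    # Classification: ONE image is scored against class representations learned in training.
    def classify(image, class_weights):
        scores = {label: float(embed(image) @ w) for label, w in class_weights.items()}
        return max(scores, key=scores.get)

    # Recognition (verification): TWO images are compared along a similarity scale.
    def same_person(image_a, image_b, threshold=0.9):
        similarity = float(embed(image_a) @ embed(image_b))
        return similarity >= threshold

    print(same_person([1.0, 0.2, 0.1], [0.9, 0.25, 0.1]))  # True for these similar vectors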
According to Sirotin’s group’s work, on the other hand, facial recognition is very accurate for women of color: the algorithms tested produced less than 2 percent false negative errors. While not all systems are perfect and some might perform poorly due to insufficiently diverse training data, Sirotin believes it is critical not to generalize, because a small sample may not be representative. “Overall, large-scale testing shows that face recognition can work well for all people, including women of color,” he says.
Image acquisition is another important factor that can affect facial recognition performance and bias. For the 2018 Biometric Technology Rally, 11 commercial face acquisition systems were used to acquire 363 images of different people; the numbers showed one popular commercial algorithm did a poor job for dark-skinned people, but results depended heavily on the cameras used and on how and when the images were acquired.
One argument constantly brought up in the media is that the technology works for white males but fails for black females, Sirotin says. According to his research, however, while there were slightly more errors for younger black women than for younger white men, the algorithm delivered far more errors for older white men than for older black men. “Overall there was no single demographic factor (old/young, black/white, male/female) that, by itself, clearly explained false positive errors. […] So the media narrative is false, face recognition errors do not only affect a single demographic group,” he concludes.
This post was updated on October 25 at 1:56 Eastern to clarify Sirotin’s concern with discussions around ‘bias.’
Article Topics
accuracy | algorithms | biometric testing | biometric-bias | biometrics | biometrics research | DHS | facial recognition | law enforcement | SAIC