Cutting out clustered data points to reduce facial recognition bias explained in AFRL talk
Despite the presence of many “heavy hitters” in the public discourse about bias in facial recognition, misunderstanding about the nature of the problem and where it comes from is widespread, according to a recent talk as part of the Applied Face Recognition Lab’s virtual talk series.
John Howard, principal data scientist of the Identity and Data Sciences Laboratory at the Maryland Test Facility, delved into the issue in a presentation titled ‘Understanding and Mitigating Bias in Human & Machine Face Recognition.’
While many observers, including many working in computer science and machine vision, emphasize the role of data as a cause of bias in biometric algorithm performance, Howard notes that there are many possible sources.
“I also think just blaming the data is, frankly, a way to dodge what are probably more challenging and more interesting issues,” Howard explains. The tendency is attractive because it leads to a resolution that data scientists are used to and comfortable with: the ingestion of more data.
Loss function, evaluation bias, and the way that people relate to machines are important to a more complete understanding of the issue of bias in facial recognition, Howard argues. The latter issue includes projection bias, confirmation bias, and automation bias. In other words, people tend to expect machines to behave like them, confirm their beliefs, and produce results that do not need to be verified.
Face is a newer biometric modality than fingerprint and iris, Howard says, and lessons can possibly be taken from the two older modalities of the “big three.” Elements unique to the face modality may present “unique problems,” however.
The false matches produced by iris recognition algorithms, for instance, often cross between genders and ethnicities, while those produced by face recognition do not. This makes it harder for people to spot errors in face matching, despite the same terminology (“false match error”) being used in each case.
Howard reviewed several research papers demonstrating how different biases operate. Automation bias, for instance, is modest and, in ideal circumstances, shows up mostly when people are unsure. When circumstances are less ideal, like when people are wearing masks, people are more likely to privilege a computer’s assessment.
Ultimately, while faces do contain similar or ‘clustered’ data based on demographics, Howard emphasizes that research indicates it is possible to select particular data points that do not exhibit clustering, to reduce the false match errors that amount to bias in face biometrics, particularly when a human is in the loop. This is because the algorithms return candidate lists that suddenly look more like those in fingerprint and iris recognition. The right candidate, in many cases, is obvious to the human eye.
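The general idea of dropping demographically clustered features can be illustrated with a toy sketch. It is not the method presented in the talk, which this article does not describe in detail. Everything in it is an assumption made for illustration: the synthetic eight-dimensional "face features," the two demographic groups, the use of the share of variance explained by group membership (eta squared) as a clustering score, and the 0.5 cutoff. Each synthetic sample stands for a different person, so every nearest neighbor is a false match candidate.

```python
import numpy as np


def clustering_score(features, groups):
    """Share of each feature's variance explained by group membership (eta squared)."""
    overall_mean = features.mean(axis=0)
    total = ((features - overall_mean) ** 2).sum(axis=0)
    between = np.zeros(features.shape[1])
    for g in np.unique(groups):
        members = features[groups == g]
        between += len(members) * (members.mean(axis=0) - overall_mean) ** 2
    return between / total


def same_group_top_match_rate(features, groups):
    """Fraction of samples whose nearest other sample is in the same group."""
    diffs = features[:, None, :] - features[None, :, :]
    dist = (diffs ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample cannot match itself
    nearest = dist.argmin(axis=1)
    return float((groups[nearest] == groups).mean())


# Synthetic data (hypothetical): 2 features clustered by group, 6 that are not.
rng = np.random.default_rng(0)
n_per_group, n_clustered, n_free = 100, 2, 6
groups = np.repeat([0, 1], n_per_group)
offsets = np.where(groups[:, None] == 0, -3.0, 3.0)
clustered = offsets + rng.normal(size=(2 * n_per_group, n_clustered))
free = rng.normal(size=(2 * n_per_group, n_free))
features = np.hstack([clustered, free])

# Keep only features whose variance is mostly NOT explained by group (assumed cutoff).
scores = clustering_score(features, groups)
keep = scores < 0.5

rate_all = same_group_top_match_rate(features, groups)
rate_kept = same_group_top_match_rate(features[:, keep], groups)
print("kept features:", np.flatnonzero(keep).tolist())
print("same-group top false match rate, all features:", rate_all)
print("same-group top false match rate, kept features:", rate_kept)
```

In this toy setup, the closest false match almost always shares the probe's group when all features are used, and does so only about half the time once the clustered features are dropped, which is the kind of demographically mixed candidate list the article says is easier for a human reviewer to check.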