Researchers say NIST biometric datasets include images of vulnerable individuals taken without consent
The U.S. government, researchers, and businesses have collected facial images of immigrants, abused children, and deceased people to test facial recognition systems without consent, Slate reports in an article alleging that the National Institute of Standards and Technology (NIST) may be the worst offender.
Researchers Nikki Stevens, Jacqueline Wernimont, and OS Keyes say their research will be reviewed for publication this summer, and that a combination of public documents and materials gathered through the Freedom of Information Act shows that NIST’s Facial Recognition Vendor Test (FRVT) uses images of exploited children, U.S. visa applicants from Mexico and elsewhere, people who have been arrested, and people who have died. Images are also drawn from Department of Homeland Security databases covering travelers boarding aircraft and individuals suspected of criminal activity. Combined, the datasets contain millions of pictures of people, and NIST releases some of them for public use, allowing individuals or businesses to download, store, and use them.
“The data used in the FRVT program is collected by other government agencies per their respective missions,” NIST Director of Media Relations Jennifer Huergo said when the organization was asked to comment by the researchers. “In one case, at the Department of Homeland Security (DHS), NIST’s testing program was used to evaluate facial recognition algorithm capabilities for potential use in DHS child exploitation investigations. The facial data used for this work is kept at DHS and none of that data has ever been transferred to NIST. NIST has used datasets from other agencies in accordance with Human Subject Protection review and applicable regulations.”
The researchers presume that the case Huergo referred to is the CHEXIA Face Recognition Challenge 2016, held jointly by NIST and DHS. CHEXIA has since evolved into a DHS program for investigating child pornography on the dark web.
An operational dataset created for the initial CHEXIA program has also been included among datasets used by NIST FRVT since July 31, 2017, according to the researchers. The same dataset is also included in the most recent report from June 21, 2018, they say.
The Multiple Encounter Dataset, supplied by the FBI, has been used since 2010 and features mug shots taken before trial, along with images of deceased people. Photographs of Black people make up 47.5 percent of the dataset, according to the researchers, even though Black people account for only 12.6 percent of the U.S. population.
The researchers argue that the priority should be stronger regulation rather than more diverse datasets, and that NIST is not fit to develop regulations for facial recognition, as some in Congress have recently suggested it should. Instead, they say, those policies should be written by ethicists and by advocates for immigrants, child welfare, and other marginalized populations.
Keyes recently released research showing that gender recognition systems are typically binary and consider gender as exclusively physiological, which may perpetuate bias against transgender people.