Datasets behind face biometrics reveal field’s ethical backslide: research paper
Face biometrics researchers have largely abandoned a set of data collection practices in order to fuel the voracious appetite of AI systems, losing sight of consent and data quality considerations in the process, according to research published by Inioluwa Deborah Raji and Genevieve Fried.
Their ‘About Face: A Survey of Facial Recognition Evaluation’, which was presented at the AAAI 2020 Workshop on AI Evaluation, looks at more than 130 datasets used to train face biometric algorithms, compiled over a 43-year period to 2019.
The databases considered include approximately 145 million images of 17 million different people, and are shaped by changes in political environment, technological capability, and social norms, Raji and Fried find.
Decrying that “a string of failed real world pilots contradicts the academic mythos of facial recognition as a solved problem,” the researchers review the historical development of facial recognition, from the early research conducted from 1964 to 1995, through a period of exploration as “the ‘New Biometric’” up until 2006, then mainstream development, and finally a current period they characterize as the “Deep Learning Breakthrough.”
The vast majority of early datasets were compiled by conducting photoshoots, with full consent of participants, like the Face Recognition Technology (FERET) database released in 1996. The Labeled Faces in the Wild (LFW) dataset, released in 2007, marked the beginning of large-scale web-scraping, building up troves of images without subject consent, and including more minors than ever before, according to the paper.
The researchers find that biometric datasets collected with subject consent fell from over 86 percent of all those available in the ‘New Biometric’ era to only 8.7 percent in the current, fourth period, with web searches taking on the bulk of the image collection work. By the time Facebook developed the DeepFace dataset in 2014, the scale of databases had made it nearly impossible to manually verify and label all images. This dataset ushered in the deep learning era of face biometrics, along with phenomena such as automatically generated labels containing offensive terminology.
The result is a field of research and development that is now reliant on large-scale data privacy violations, according to Raji and Fried.
“Was it worth abandoning all of these practices in order to do deep learning?” Raji asks in an interview with MIT Technology Review.
“Dataset evaluation is a critical juncture at which we can provide transparency and even accountability over facial recognition systems, and interrogate the ethics of a given dataset towards producing more responsible machine learning development,” the researchers conclude.
Civil society groups see face biometrics use as widespread and indiscriminate
The International Network of Civil Liberties Organizations (INCLO), meanwhile, has published a report on harmful outcomes from face biometrics trials and operations.
The ‘In Focus: Facial Recognition Tech Stories and Rights Harms from Around the World’ report considers negative impacts of facial recognition technology on rights to freedom of expression, equal treatment and non-discrimination, freedom of peaceful assembly and association, and privacy. It does so by reference to examples of the worst behavior the organizations could find from the UK, U.S., the West Bank and Occupied Palestinian Territories, South Africa, Russia, Colombia, Canada, Argentina, Ireland, Hungary, India, Australia and Kenya.
Collectively, these examples show “how this harmful surveillance has become pervasive and entrenched in private and public spheres across the world,” according to the report. Democratic debate and robust protections are the necessary response, it argues.
Elsewhere, the report claims that indiscriminate use of facial recognition is widespread among law enforcement and government agencies around the world, and that it is dangerously normalizing surveillance. The report attributes this to real-time tracking of individuals, though many of the systems it describes are forensic systems.
The lack of regulation in many countries is identified as an urgent concern, but ultimately the groups believe the technology is inherently destructive to individual rights, as it contradicts the presumption of innocence recognized as an international human right in the UN’s Universal Declaration of Human Rights.