Circular AI firing squad: Fighting AI bias with fake faces may pose real privacy risks
The problems with fake faces are so complex that some solutions just create more problems.
Researchers in Switzerland, for example, think they have found a way to work around facial recognition training datasets that underrepresent everyone other than white cis males.
Idiap Research Institute scientists are going to create synthetic faces representing the rest of the population to train biometric algorithms.
The researchers will supervise the so-called Safer Project with counterparts from the University of Zurich and Sicpa, a Swiss firm that makes security inks for sensitive documents, including currencies. The company will evaluate what the school and Idiap develop.
Biased datasets have become the third rail of facial recognition tool construction. They are an easy target for politicians and regulators, and, according to Idiap, technology companies are recoiling from the possibility of being tied to anything labeled biased.
(The organization has swung at biometrics fairness before.)
Idiap does not say whether this solution might suffer from a copy-of-a-copy problem. Synthetic faces are, by definition, imperfect. Will imperfections in the dataset eventually result in unexpected and significant aberrations for the very populations developers had hoped to help?
Then there is a preprint research paper, titled ‘This Person (Probably) Exists,’ which questions a fundamental assumption about black-box neural networks: that there is no way to recover the information that flows into them. According to that assumption, facial datasets are protected, and so is the privacy of the people whose faces appear in them.
But scientists from the University of Caen Normandie and ENSI Caen say they have found that generative adversarial networks “leak information about their training data” through a new, experimental membership attack.
The attack enabled researchers to “discern samples sharing the same identity as training samples without being the same samples.” It worked when used with multiple popular face datasets and network training procedures.
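The paper's actual attack is more involved, but the core intuition, that a generated face can sit suspiciously close to a training identity in a face-embedding space, can be sketched in a toy example. Everything below is a made-up illustration, not the authors' method: the 128-dimensional embeddings, the noise model, and the 0.5 cosine threshold are all hypothetical stand-ins for a real face-recognition pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical identity embeddings (stand-ins for a real face-embedding model).
train_identity = rng.normal(size=128)   # identity present in the GAN's training data
other_identity = rng.normal(size=128)   # identity never seen during training

def embed_noise(identity_vec, scale=0.1):
    # Toy "generated face": the identity's embedding plus small generation noise.
    return identity_vec + rng.normal(0.0, scale, identity_vec.shape)

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A generated face that "leaks" a training identity is modeled as a noisy
# copy of that identity's embedding.
generated = embed_noise(train_identity)

# Identity-level membership test: is the generated sample closer to a probe
# identity than some decision threshold?
THRESHOLD = 0.5
print(cosine(generated, train_identity) > THRESHOLD)  # flags the training identity
print(cosine(generated, other_identity) > THRESHOLD)  # does not flag a stranger
```

The point mirrors the paper's finding: the attacker never recovers a training image pixel-for-pixel, yet a similarity test on generated samples can still reveal whose face was in the training set.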
An MIT Technology Review article analyzing the paper noted that if a person’s medical records were used to train a disease-focused model, a fake face closely resembling that patient could expose them.
This, of course, means that in theory, Idiap’s idea of creating faces to increase model inclusivity could end up outing the real people whose faces were used to create the synthetic ones.