Algorithmic schmutz hurts detection of male faces more than female
A research paper looking at how well the three best-known biometric face detection algorithms are likely to work outside the lab found both intuitive and disappointing results.
Three University of Maryland scientists say Google, Microsoft and Amazon software had a harder time detecting — not recognizing — intentionally corrupted faces in large image datasets.
While unsurprising to many, despite industry boosterism about AI capabilities, some kinds of faces proved easier to detect in the biometrics research than others. Indeed, masculine-presenting faces were more readily hidden from algorithmic detection.
The researchers say they have developed the first detailed benchmark of how robust Amazon’s Rekognition, Microsoft’s Azure and Google’s Cloud Platform are in real-world conditions.
Images from four datasets, including Adience, UTKFace, MIAP and CCD, were marred by 15 algorithmically generated corruptions. The imposed defects included pixelation, motion blur, Gaussian noise, fog, frost and JPEG compression.
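The article names several of the corruption types but not the paper's exact implementations. As a rough illustration only, here is how three of the named corruptions (Gaussian noise, pixelation and JPEG compression) are commonly produced in image-robustness benchmarks, sketched with Pillow and NumPy; the parameter values are arbitrary stand-ins, not the severity levels used in the study.

```python
# Generic sketches of three corruption types named in the article.
# These are illustrative, NOT the paper's exact routines.
import io

import numpy as np
from PIL import Image


def gaussian_noise(img: Image.Image, sigma: float = 25.0) -> Image.Image:
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    arr = np.asarray(img).astype(np.float32)
    noisy = arr + np.random.normal(0.0, sigma, arr.shape)
    return Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))


def pixelate(img: Image.Image, factor: int = 8) -> Image.Image:
    """Downsample then upsample with nearest-neighbour to blockify."""
    small = img.resize(
        (max(1, img.width // factor), max(1, img.height // factor)),
        Image.NEAREST,
    )
    return small.resize(img.size, Image.NEAREST)


def jpeg_compress(img: Image.Image, quality: int = 10) -> Image.Image:
    """Round-trip the image through a low-quality JPEG encoding."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")


# Stand-in image (a flat color patch, purely for demonstration).
face = Image.new("RGB", (128, 128), (180, 140, 120))
corrupted = [gaussian_noise(face), pixelate(face), jpeg_compress(face)]
```

In benchmarks of this kind, each corruption is typically applied at several severity levels to every image, and the detector's error is measured on each corrupted copy.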
Well-lit images of feminine-presenting subjects with lighter skin types fared best in face detection. Those of older masculine-presenting subjects with darker skin were detected least often.
The researchers did not report on what caused the errors. There is no examination of the vendors’ robustness against adversarial attacks or varied camera capabilities, nor of how the algorithms were trained.
Generally, decisions on images of masculine-presenting subjects in the MIAP dataset were 20 percent more likely to be erroneous than those involving feminine-presenting subjects. The UTKFace dataset produced the best results with respect to gender, according to the paper, with statistically insignificant differences between masculine- and feminine-presenting subjects.
Overall, in the Adience dataset, images of the two oldest demographic groups were 25 percent more likely to be erroneously detected than those of the two youngest groups.
And consistent with many biometrics test results, the paper found that images of lighter-skinned subjects (categorized using the controversial Fitzpatrick scale) had a mean relative corruption error of 8.5 percent, while the rate for darker-skinned subjects was 9.7 percent.
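The article does not define "mean relative corruption error." One plausible reading, following the convention of common corruption-robustness benchmarks, is the increase in error on corrupted images over the error on the clean originals, averaged across corruption types. The sketch below uses that assumed definition with made-up toy numbers, not figures from the paper.

```python
# Hedged sketch of a "mean relative corruption error" under an ASSUMED
# definition: (error on corrupted images) minus (error on clean images),
# averaged over corruption types. The paper's exact formula may differ.

def relative_corruption_error(clean_err: float, corrupted_err: float) -> float:
    """Error increase attributable to a single corruption type."""
    return corrupted_err - clean_err


def mean_relative_error(pairs: list[tuple[float, float]]) -> float:
    """Average the per-corruption relative errors over all corruptions."""
    return sum(relative_corruption_error(c, k) for c, k in pairs) / len(pairs)


# Toy (clean_error, corrupted_error) pairs -- hypothetical values only.
toy_pairs = [(0.02, 0.10), (0.02, 0.12)]
mre = mean_relative_error(toy_pairs)
```

Under this definition, a higher mean relative error means the service degrades more when its input is corrupted, independent of how well it handled clean images to begin with.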