Facial recognition advances spurring new use cases, improvements in IFPC 2022 spotlight
Facial recognition will soon be able to accurately unmask deepfakes, match whole families walking through the airport at once, and outperform other biometric modalities, according to industry experts speaking at the International Face Performance Conference (IFPC) 2022.
The first day of the event hosted by NIST focused on facial image quality, its assessment and how to improve it.
The early portion of the second day was dedicated to standards and regulations, with presentations by representatives of the European Commission, Idemia, UNICRI, the Norwegian ID Centre, USG and RAND Corporation.
The second portion of the day’s presentations focused on the industry’s outlook on face biometrics. Presenters included Paravision Machine Learning Tech Lead and Manager Neda Eskandari, Rank One Computing Chief Scientist, Co-founder and President Brendan Klare, Idemia Chief AI Scientist Stephane Gentric, and representatives of Accenture and Trust Stamp.
Real uses for fake data
Deepfakes and synthetic media actually refer to a range of different fake phenomena, including different types of data, imaging scope and types of manipulation, Eskandari points out. Even image quality varies widely among fakes.
She explored the value of synthetic faces for training and benchmarking facial recognition models.
Engineers need to be careful about which synthetic faces are chosen for training and about the ratio between synthetic and real faces, and should be sure to include multiple synthetic faces for each identity.
Of course, synthetic data also presents a threat in several different ways.
Paravision research indicates that facial recognition models trained on real faces tend to be good at understanding “hidden information about synthetic faces.” The company’s prototype synthetic face detection model achieved 99.7 percent accuracy, Eskandari says. A production deepfake detector is also in the works. So far Paravision reports success rates above 96 percent on generalized datasets, without incorporating any off-the-shelf models.
Is face now the most accurate biometric?
Klare courted controversy by comparing 1:1 matching rates for various biometric modalities in different NIST publications, and asking if face is now the world’s most accurate modality. The other modalities are at least not improving at the same pace, he argues.
Part of the reason for this improvement, Klare reasons, is that faces are the least private piece of information about a person. They are more easily found than names, and images of faces are available at volumes an order of magnitude higher than fingerprint or iris images.
As a result, with some caveats, algorithms are now much more accurate than humans, even super-recognizers, at matching faces, Klare says.
Human intervention is still important, however, as use cases are set to expand, and because of remaining challenges like identical twins.
Reviewing the importance of presentation attack detection for various use cases, Klare suggests replacing the lengthy terms bona fide presentation classification error rate (BPCER) and attack presentation classification error rate (APCER) with “genuine reject rate” and “spoof accept rate” respectively.
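For readers unfamiliar with the two rates, the following minimal sketch shows how they are commonly computed from labeled presentations. It assumes the usual ISO/IEC 30107-3 definitions and, as a simplification, pools all attack types together, whereas the standard reports APCER per attack instrument species. The function name `pad_error_rates` and the sample data are hypothetical and are not taken from Klare's presentation.

```python
def pad_error_rates(is_attack, classified_as_attack):
    """Return (apcer, bpcer) from ground-truth labels and detector decisions.

    is_attack[i] is True when presentation i really was a spoof.
    classified_as_attack[i] is True when the detector flagged it as a spoof.
    """
    pairs = list(zip(is_attack, classified_as_attack))
    attack_decisions = [flagged for truth, flagged in pairs if truth]
    bona_fide_decisions = [flagged for truth, flagged in pairs if not truth]
    if not attack_decisions or not bona_fide_decisions:
        raise ValueError("need at least one attack and one bona fide presentation")

    # APCER ("spoof accept rate"): attacks the detector let through as genuine.
    apcer = sum(1 for flagged in attack_decisions if not flagged) / len(attack_decisions)
    # BPCER ("genuine reject rate"): genuine presentations wrongly flagged as attacks.
    bpcer = sum(1 for flagged in bona_fide_decisions if flagged) / len(bona_fide_decisions)
    return apcer, bpcer


# Hypothetical example: 4 attacks (1 missed) and 5 genuine users (1 rejected).
truth = [True, True, True, True, False, False, False, False, False]
decisions = [True, True, True, False, False, False, False, False, True]
apcer, bpcer = pad_error_rates(truth, decisions)
print(f"spoof accept rate (APCER): {apcer:.2f}")
print(f"genuine reject rate (BPCER): {bpcer:.2f}")
```

In this made-up sample one of four attacks is accepted and one of five genuine users is rejected, which is the trade-off the two names are meant to convey.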
Another area for potential improvement of facial recognition identified by Klare is make-up. This also influences demographic disparities between match rates for men and women.
Airport image capture improvements
Image acquisition technology is also improving rapidly. The design constraints imposed on early e-gates, as explained by Gentric, made them large and complicated compared with the sleek machines used today. This was necessary due to the low tolerance face biometrics systems had for variations in face pose and lighting.
Airport biometrics as processed today may soon be obsolete as well, however, with improvements in face biometrics capture of subjects on the move, Gentric says. Image selection for quality dramatically improves the accuracy of on-the-move systems, and pedestrian detection or skeleton tracking further improves performance by grouping images of a given identity together.
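As a rough illustration of the idea, the sketch below groups frames by a tracker-assigned identity and keeps only the highest-quality frame per track before matching. It is a simplified assumption about how such a pipeline could work, not a description of Idemia's system; the function name `best_frame_per_track`, the track IDs, and the quality scores in the range 0 to 1 are all hypothetical.

```python
def best_frame_per_track(frames, min_quality=0.0):
    """Pick the highest-quality frame for each tracked person.

    frames is an iterable of (track_id, frame_id, quality) tuples, where
    track_id comes from a pedestrian or skeleton tracker (assumed upstream)
    and quality is a face image quality score (assumed, higher is better).
    Frames scoring below min_quality are discarded.
    """
    best = {}  # track_id -> (frame_id, quality)
    for track_id, frame_id, quality in frames:
        if quality < min_quality:
            continue
        if track_id not in best or quality > best[track_id][1]:
            best[track_id] = (frame_id, quality)
    return {track_id: frame_id for track_id, (frame_id, _) in best.items()}


# Hypothetical frames from two travelers walking through a capture zone.
frames = [
    ("traveler_a", "frame_01", 0.40),
    ("traveler_a", "frame_02", 0.90),
    ("traveler_b", "frame_03", 0.20),
    ("traveler_b", "frame_04", 0.10),
]
print(best_frame_per_track(frames, min_quality=0.15))
```

Only the selected frame for each track would then be passed to the matcher, so blurred or badly posed frames of the same person never contribute to the decision.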
E-gates may also soon be able to match a parent and a child in their arms at the same time.
3D depth maps inferred from RGB camera streams can help ensure images are captured at an appropriate distance.