Researcher shows progress on explainability in biometric PAD systems
Explaining the decisions of artificial intelligence systems like those used in biometric presentation attack detection is a complicated matter, and not just in terms of the technical computer analysis involved, attendees of this week’s lunch talk hosted by the European Association for Biometrics (EAB) heard. There are thorny conceptual challenges around applying human understanding to the decisions of automated systems, and how they are arrived at.
“‘Explaining’ Biometric Recognition and PAD Methods: xAI Tools for Face Biometrics” was presented by computer vision and biometrics researcher Ana Sequeira of INESC TEC in Porto, Portugal.
Sequeira delved into the need for explainable artificial intelligence, the focus of her research at the R&D institution.
The basic properties required for biometrics to work were reviewed, along with the vulnerabilities of such technologies to presentation attacks. Using ISO/IEC standards as a starting point, Sequeira presented biometric presentation attacks as largely divided between automated attacks, such as synthetic identities, and human attacks, such as lifeless or altered subjects.
A problem that arises immediately in presentation attack detection, Sequeira says, is that of generalization: How can models tell unknown fake sample types from genuine ones?
Analysis of iris and fingerprint biometrics models suggests that they are not effective at identifying new attack types. This is in part, Sequeira argues, because most approaches are based on binary classification, which involves overly simplistic assumptions about attackers, and rely on one class of training data for model design.
Sequeira and her team evaluated PAD models in two settings: one in which the spoof type under test had also been used in training, and one in which it had not. They call the latter an “unseen attack.”
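The unseen-attack setting can be illustrated with a short sketch. It assumes the protocol described for the later experiment, in which the model is trained on every available attack type except the one it is tested on. The function name, the sample layout and the way bona fide samples are divided between training and testing are hypothetical choices made for illustration, not details given in the talk.

```python
def unseen_attack_splits(samples):
    """Yield (held_out_type, train, test) for each attack type.

    samples: list of (sample_id, attack_type) tuples, where attack_type
    is None for a bona fide sample (hypothetical layout).
    """
    attack_types = sorted({t for _, t in samples if t is not None})
    bona_fide = [s for s in samples if s[1] is None]
    # Assumption: alternate bona fide samples so train and test stay disjoint.
    bf_train, bf_test = bona_fide[0::2], bona_fide[1::2]
    for held_out in attack_types:
        # Train on bona fide samples plus every attack type except the held-out one.
        train = bf_train + [s for s in samples if s[1] not in (None, held_out)]
        # Test on bona fide samples plus only the held-out ("unseen") attack type.
        test = bf_test + [s for s in samples if s[1] == held_out]
        yield held_out, train, test


# Toy data: two bona fide samples and three made-up attack types.
samples = [
    ("s1", None),
    ("s2", None),
    ("s3", "print"),
    ("s4", "replay"),
    ("s5", "mask"),
]

for held_out, train, test in unseen_attack_splits(samples):
    print(held_out, len(train), len(test))
```

Each iteration corresponds to one unseen-attack experiment; a real evaluation would train and score a PAD model inside the loop.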
They found that a regularization technique, consisting of adversarial training and transfer learning processes, improved PAD generalization.
“We can provide better separability between bona fide and attacks, but not only that, we can also put the attacks closer, in a sense, that makes our model more robust,” Sequeira, who also previously headed biometric anti-spoofing at IrisGuard, explains.
The research community is now facing questions about what AI models are learning and what they base their decisions on, as society attempts to see inside the AI ‘black box.’
Output as input
Machine learning, Sequeira points out, changes the paradigm of computing: where a program and starting data were formerly used to produce output data, that output now moves to the development side alongside the starting data, and the program becomes the output. Deep learning is based on representation learning, Sequeira says, and it is important to know that the learned features are relevant, rather than just appearing so during the training process.
As AI is used in applications like healthcare and law enforcement biometrics, Sequeira notes, transparency is necessary, and even required by GDPR.
There is an inherent trade-off, however, between performance and explainability, according to Sequeira. Despite this trade-off, explainability is desirable not just for the sake of transparency, but also for potential gains it could yield, such as in exposing hidden vulnerabilities.
Explainability is also not exactly the same as interpretability, even though the terms are sometimes used interchangeably. There is a consensus in the AI community that explainability in AI can only be achieved through study of the pre-model, in-model and post-model levels, so Sequeira’s team focused on the last of these in experiments with face presentation attack detection and heartbeat biometric identification.
AI explainability is traditionally based on metrics for labels, but this approach is limited, and Sequeira asks if the scope of analysis can be expanded.
Class activation maps show the pixel groups used in class prediction, yielding some insight into the algorithmic decision-making process. They show a significant difference in the behavior of the one-attack and unseen-attack models, for example, even when both made the same, correct classification.
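How such a map is produced can be sketched in a few lines. The example below assumes the original class activation mapping formulation, in which the map for a class is the weighted sum of the final convolutional feature maps, using the classifier weights for that class. The talk summary does not say which CAM variant the team used, and the function names and toy values here are hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class.

    feature_maps: list of K maps, each a list of rows (H x W).
    class_weights: list of K classifier weights for the class of interest.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize(cam):
    """Rescale a map to [0, 1] so it can be rendered as a heat-map."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map carries no spatial information.
        return [[0.0] * len(row) for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


# Toy example: two 2x2 feature maps and made-up weights for one class.
toy_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_weights = [2.0, 0.5]
heat_map = normalize(class_activation_map(toy_maps, toy_weights))
print(heat_map)
```

In practice the low-resolution map is upsampled to the input image size and overlaid on the image; that step is omitted here.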
In the case of this experiment, the unseen-attack model was trained on all types of attack available except the one it was tested on, Sequeira explains.
In several cases, the resulting heat-maps show the two models reaching the same judgement on the same image while the class activation maps make clear they are focused on different parts of the picture. This was the case with both correctly classified presentation attack and bona fide samples.
Explanations for samples classified in the same way were significantly closer to each other.
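One way to put a number on how close explanations are is sketched below. The report does not state which distance or similarity measure the team used, so cosine similarity between flattened heat-maps, and the mean over all pairs as a coherence score, are assumptions chosen purely for illustration; the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(map_a, map_b):
    """Cosine similarity between two heat-maps of the same shape."""
    flat_a = [v for row in map_a for v in row]
    flat_b = [v for row in map_b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(y * y for y in flat_b))
    if norm_a == 0 or norm_b == 0:
        # An all-zero map has no direction to compare.
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(heat_maps):
    """Average similarity over all pairs, as a rough coherence score."""
    pairs = list(combinations(heat_maps, 2))
    if not pairs:
        raise ValueError("need at least two heat-maps")
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy heat-maps: two focused on the top-left pixel, one on the top-right.
top_left = [[1.0, 0.0], [0.0, 0.0]]
top_right = [[0.0, 1.0], [0.0, 0.0]]
print(mean_pairwise_similarity([top_left, top_left, top_right]))
```

Computing the score separately for explanations of samples that received the same classification and for those that did not would allow the two groups to be compared.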
Where you would look on an image to detect a presentation attack seems intuitive. A person would focus on transitions between the artifact, such as a mask, and the real environment around it. The single-attack PAD model focused on a similar area, but the unseen-attack model did not.
Sequeira’s team arrived at a set of four desirable properties for explanations: comparison of single and unseen attack models, intra-class coherence, meaningfulness to humans and swaps of training and test data. The criteria raise questions, however, around how ‘similar’ and ‘meaningful’ are defined in this context.
The challenge remaining for the biometrics community, according to Sequeira, is to develop new performance metrics for explanations.