DHS Biometric Rally results show groups pose little challenge to effective matching
Biometric technology is very close to allowing people in airports to pass through identification areas for processing in small groups, based on test results from the U.S. Department of Homeland Security’s Science and Technology Directorate.
The 2022 Biometric Technology Rally results were presented by DHS Biometric and Identity Technology Center Director Arun Vemury, Maryland Test Facility Executive Director Jerry Tipton and Principal Investigator Dr. Yevgeniy Sirotin.
The rallies test dozens of commercial devices in a high-throughput, unattended use case with naïve volunteers, using scenario testing to see how the devices work together.
The first evaluation back in 2018 focused only on biometric acquisition systems, but the fifth rally includes multiple modalities, matching algorithms, and additional criteria.
It evaluates not just accuracy, but user satisfaction, time and staffing requirements, equitability and privacy.
The end goals, Vemury says, are “about developing standards, developing measures, developing test methods so we can fairly and objectively evaluate how technologies work.”
Sirotin explained the increasing focus on processing people’s biometrics in groups, as that is often how they travel. The goal was to do so within three seconds, and groups ranged from 2 to 12 people. At the same time, systems were asked not to process the images of anyone who had opted out or was outside of the designated capture area.
To test this, DHS arranged people into groups, some of whom were to be processed and some not, and then had the former walk through a center lane, while the others passed to either side.
Biometric images were expected to be submitted in under 6 seconds each, on average.
Sixteen systems in total applied to participate: six acquisition systems and 10 matchers. Of the former, one dropped out before installation, and one during installation.
Efficiency and satisfaction are criteria that only apply to acquisition systems, Sirotin explains. On those counts, the technology did well.
The most efficient system worked in 1.72 seconds for groups of two, and 1.47 seconds for groups of four. Every system met the efficiency goal for groups of four.
Two systems beat the 90 percent satisfaction threshold, with one (‘Longs’) topping 97 percent.
Similarly for privacy, the goal of 0 percent non-user identifications was met by most combinations of biometric acquisition and matching technologies, and barely missed by the rest.
Later, Vemury emphasized that “face-aware” cameras used for biometric acquisition help protect privacy by ignoring the areas beyond the face or faces of intended subjects.
DHS set an effectiveness threshold of 95 percent, suggesting that systems falling below it may not be suitable for processing people in groups.
Seventeen system combinations met the true identification rate threshold of 95 percent, with the top one hitting 97.4 percent, but none reached the 99 percent goal. Matching systems, however, were “nearly flawless,” Sirotin says, with few matching less than 99 percent. Group size had no significant impact.
Issues like twins, one set of which was included in the test, remain challenging, and are reflected in the results.
For 36 of the 40 system combinations, the majority of errors occurred in acquisition.
For demographics assessments, self-reporting of gender and race was used, but skin tone was measured.
Thirty-six of the 40 system combinations met the matching threshold, and Sirotin noted that one matcher did not do well.
Modest but noticeable differentials were found among matchers, but greater differentials were introduced in acquisition.
Black people in groups of four were matched at the lowest rate, at 91.4 percent. Likewise, those with darker skin tone were matched 91.4 percent of the time in groups of two, and 88.8 percent in groups of four, in aggregate.
Those differentials were not evenly reflected among all participating vendors, however, as nine systems met the benchmark for all skin tones.
Overall, systems were found to be fast, with high user satisfaction, and effective maintenance of privacy. Group size did not appear to harm effectiveness. The differentials in biometric acquisition, however, remain a concern.
Sirotin lauded the new form factors of devices submitted this year as reflecting innovation by industry to meet the needs of high-throughput use cases.
Vemury said that issues in acquisition are often related more to challenges like lighting than to image resolution. These issues have improved over the past several years. Tipton noted that DHS has observed incremental improvement in acquisition. Still, of the errors made, 97 percent were attributed to the camera used.
Neville Pattinson of Thales gave the vendor perspective on the value of the test, and revealed that the company provided two separate acquisition systems, one with two cameras and one with three cameras. Thales plans to reveal its alias and performance in the near future.
Thales changed the camera technology it uses in acquisition during development, having observed some of the same challenges reflected in the Rally results.
Towards the end of the presentation, Vemury also previewed the upcoming DHS RIVTD challenge.
Applications for the liveness and presentation attack detection (PAD) track are due September 13. The challenge will utilize over 1,000 fake IDs, in addition to genuine subjects. Some of those fakes are among the best ever confiscated, and in some cases are very difficult for vendors to test their technology against, because possessing them is against the law.