Cops made poor training wheels for AI in UK face biometrics trials
Autonomous face biometrics does not have to be perfect yet, say proponents. Especially with systems used by police, humans can be active partners until algorithms are perfected.
That is an attractive argument for vendors, advocates and nervous politicians trying to assuage growing public distrust of facial recognition. But opponents argue that this is not true now, and may never be.
A new study supported by grants from the UK government and South Wales Police, and published in The British Journal of Criminology, makes the case, based on two large UK system trials, that fallible humans today significantly influence biometric algorithms, and not always for the better.
The biases begin with how AIs are created, of course, but humans also play integral roles in the operation of so-called assisted systems, deciding everything from where to set up cameras in public spaces to who gets picked up for questioning.
In fact, given the myriad variables involved in police use of face biometrics, and indeed in competent and fair policing (which truly successful software would obviate), a near-perfect autonomous system may still sit on the most distant horizon.
In this study, algorithms were trained to spot people in crowds and issue an alert if they could match a captured face with their data sets of suspects.
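In broad strokes, such a system compares a numeric embedding of each captured face against stored embeddings for everyone on the watch list and flags the closest match. The sketch below uses cosine similarity for that comparison; the names, the tiny 4-dimensional vectors and the values are illustrative assumptions, not details from the trials.

```python
import math

# Minimal sketch of watch-list matching via cosine similarity.
# Real face embeddings typically have hundreds of dimensions;
# these 4-dimensional vectors and identities are made up.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(captured, watch_list):
    """Return the (identity, score) pair most similar to the captured embedding."""
    return max(
        ((who, cosine(captured, emb)) for who, emb in watch_list.items()),
        key=lambda pair: pair[1],
    )

watch_list = {
    "suspect_1": [0.9, 0.1, 0.3, 0.0],
    "suspect_2": [0.2, 0.8, 0.5, 0.1],
}
captured_face = [0.85, 0.15, 0.25, 0.05]

name, score = best_match(captured_face, watch_list)
print(name, round(score, 2))  # suspect_1 is the closer embedding here
```

In the trials, a match like this only became an alert if its score cleared a police-set threshold, a mechanism the study returns to later.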
The study’s authors described numerous points at which humans in the trials had to put a thumb on the scale.
Looking at success rates for many event deployments, they wrote, “results follow no particular pattern.” The variability is largely the result of human involvement, according to the study.
The numbers for one trial, run from 2018 to 2019, were not encouraging. The share of system matches that an officer deemed credible ranged from 33 percent to 100 percent. But the share of matches that proved correct after a subject’s ID was checked was zero on two dates and peaked at 44 percent.
“[O]ne striking finding is the tendency of officers to agree with (cede discretion to) the algorithm and the high chance that computer-generated matches would not be verifiably correct,” the authors wrote.
On the other hand, officers were observed discounting subsequent alerts after a false positive. In those cases, the algorithm had measurably shifted their professional suspicion away from the faces in the crowd.
In all cases, watch lists of previously convicted criminals, suspects and persons of interest were hand-picked by officers. It was up to officers to decide what degree of criminality qualified a person for inclusion on a watch list.
The term person of interest itself is subjective and would differ among officers.
It is possible that officers already knew who might pop up at a given event, reducing randomness but also biasing algorithms toward those with criminal records, people of color and younger people.
The police could change the threshold at which alerts were issued, too. Matches were scored on a range from zero to one, and those scores, according to the report, directly influenced human decisions.
The higher the score, the more likely an officer was to assume the algorithm was correct in issuing the alert. That is black-and-white reasoning applied to a scale designed as a graduated advisory tool.
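The threshold mechanism the study describes can be sketched as follows. The identities, scores and cutoff values here are hypothetical, chosen only to show how moving the cutoff changes which matches ever reach an officer as alerts.

```python
# Illustrative sketch of threshold-based alerting.
# The watch-list names, similarity scores and cutoffs below are
# hypothetical, not figures from the South Wales Police trials.

def alerts(match_scores, threshold):
    """Return (identity, score) pairs meeting the alert threshold, strongest first.

    match_scores: dict mapping watch-list identity -> similarity score in [0, 1].
    """
    return sorted(
        ((who, s) for who, s in match_scores.items() if s >= threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )

# One captured face scored against a hypothetical three-person watch list.
scores = {"subject_a": 0.82, "subject_b": 0.57, "subject_c": 0.91}

# Raising the threshold suppresses weaker matches, so operators who tune
# the cutoff also decide which potential matches officers ever see.
print(alerts(scores, threshold=0.6))  # two alerts
print(alerts(scores, threshold=0.9))  # only the strongest match alerts
```

The point the study makes is that the number on each alert then feeds back into the human decision: a higher score invites more deference, even though the scale is continuous rather than a verdict.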
Siting cameras should be straightforward enough, but not only is it impossible for an algorithm today to position lenses itself, systems still demand minimum conditions, including adequate lighting and clear views of faces.
One targeted event stretched past dusk, according to the study, when pickpockets were most active. Needing more light for the system, officers decided to monitor during the day instead.
In another instance, cameras were placed in a mall a half-mile from a crime hot spot because it was easier to do so.
Then there was an event at which cameras were placed where police could quickly nab suspects, a spot distant from where most of the crimes were occurring.
And in an example of how unpredictably fickle human influence can be when deploying face biometrics, the authors recounted an instance when a camera was placed to best spot enrolled faces — the watch list — during an event. An unsuspecting flag vendor set up a stand too near the camera, and a fluttering Italian flag occasionally interfered with the system’s line of sight.
Officers considered asking the entrepreneur to move, but ultimately decided not to because people looked up at the flag, affording the system and operators a better (intermittent) view of their faces.
The degree of decision-making and effort required by police to achieve reasonably effective live facial recognition deployments is such that the technology’s role is better understood as “assisted,” rather than “automated facial recognition,” the report concludes.