Europe’s ‘Wild West’ of facial recognition testing spurs call for responsible AI oversight

A peer-reviewed study published in Data & Policy by legal scholar Karen Yeung of the University of Birmingham and Wenlong Li of Zhejiang University warns that the testing of AI systems “in the wild” remains a largely ungoverned frontier – a “wild west” where police experiments with live facial recognition have repeatedly breached ethical and legal boundaries.
The study examines four European case studies – in London, South Wales, Berlin, and Nice – to expose what its authors describe as a “serious governance deficit” in real-world AI trials conducted by law enforcement between 2016 and 2020.
Yeung and Li argue that future AI testing, especially under the European Union’s (EU) Artificial Intelligence Act, must be grounded in three types of responsibility: epistemic, ethical, and legal. The paper calls for a principled framework that ensures AI systems are tested with scientific validity, moral integrity, and respect for fundamental rights.
The authors note that while industries such as automotive and digital marketing have long embraced “in-the-wild” testing, unregulated experimentation with AI tools – especially those capable of surveillance or prediction – poses more insidious threats.
Yeung and Li compared the lack of oversight in AI field testing to early nuclear experiments or biomedical studies carried out without consent. They wrote that “AI testing has become a vacuum of accountability,” with systems capable of mass data collection and discrimination quietly deployed on unwitting publics.
Unlike clinical or academic research, AI field trials often proceed without clear hypotheses, ethical review, or informed consent, creating a situation in which even well-intentioned pilots can become rights violations.
The paper’s detailed case studies reveal a pattern of inconsistency and opacity. In London and South Wales, police trials blurred the line between research and live policing operations.
The Metropolitan Police Service and South Wales Police used NEC’s NeoFace biometric software to scan tens of thousands of passers-by at events such as the Notting Hill Carnival and Cardiff City football matches.
Although the agencies framed these exercises as “trials,” they led to real arrests and legal consequences. Police relied on broad “common-law powers” rather than consent or specific statutory authorization.
Independent evaluations by researchers at Essex and Cardiff universities found major flaws in transparency, community engagement, and accuracy. The Bridges v Chief Constable of South Wales Police decision in 2020 eventually ruled that the Welsh trials violated data-protection, equality, and human-rights law, marking a turning point in the debate over police use of facial recognition technology.
In Germany, the Federal Police partnered with Deutsche Bahn to conduct trials at Berlin Südkreuz station between 2017 and 2018. Hundreds of volunteers took part as the government tested facial recognition software from several manufacturers.
Participants gave consent, but ordinary passengers were also recorded. Critics labeled the project a “trial in hiding,” arguing that nonparticipants were unknowingly subjected to biometric scanning.
Germany’s Federal Commissioner for Data Protection approved the study only on condition of voluntary participation, yet public outcry followed revelations that the system also collected ancillary data about volunteers, such as movement speed and body temperature.
The Interior Ministry ultimately postponed nationwide adoption, concluding that ethical and legal questions remained unresolved.
In France, the city of Nice hosted a smaller trial during the 2019 Carnival, partnering with the Monaco-based cybersecurity company Confidentia to test AnyVision software.
Volunteers uploaded selfies to create a watchlist, while others entered through camera-equipped gates after being notified by signage. Despite the tiny sample size of eight volunteers, the city declared the test a success.
The French data-protection authority, the Commission Nationale de l’Informatique et des Libertés, allowed the experiment under strict conditions but emphasized that future deployments at scale would require explicit legal authorization. The mayor of Nice, Christian Estrosi, celebrated the project as a model of innovation, while privacy advocates saw it as a troubling rehearsal for nationwide biometric surveillance.
Across all four jurisdictions, Yeung and Li found that police agencies failed to articulate clear objectives, applied inconsistent data-handling standards, and neglected methodological rigor. The trials often conflated software performance testing with full-scale policing operations, undermining both scientific reliability and public trust.
In London and Wales, officers reportedly interpreted citizens who avoided cameras as suspicious, an approach the authors describe as “a basic failure to recognize that individuals are legally entitled to preserve their privacy.”
In several cases, police watchlists included not only wanted persons but also “individuals deemed vulnerable or merely of interest,” raising serious proportionality issues under Article 8 of the European Convention on Human Rights.
Yeung and Li’s proposed framework for responsible testing calls first for epistemic responsibility, which demands that trials have clear hypotheses, rigorous design, and transparent error measurement to ensure findings are scientifically valid.
Ethical responsibility requires safeguards for participants and affected communities, informed consent where feasible, and careful assessment of potential harms such as bias or the chilling of public protest.
Legal responsibility means compliance with existing data-protection, equality, and human-rights law, combined with independent oversight and full public transparency.
These principles, Yeung and Li argue, should guide the European Commission as it drafts detailed provisions under Article 60 of the EU AI Act, which will require pre-approved “real-world testing plans” for high-risk AI systems.
The authors concluded that the live facial recognition experiments of the 2010s foreshadowed the dilemmas now confronting the AI age: technologies tested on citizens without consent, oversight, or evidence of benefit.
“Principled frameworks to ensure that live technology testing, including AI systems, is undertaken responsibly are long overdue and urgently needed,” Yeung and Li wrote.
By converting Europe’s chaotic “wild west” of AI experimentation into a regime of accountable, rights-based oversight, governments can preserve innovation while protecting the public from the next generation of unregulated surveillance, Yeung and Li stated.