Facial age estimation adoption puts pressure on ecosystem

As an area of technology that is being implemented even as legal mandates are taking force and the criteria for evaluating its effectiveness are being developed, facial age estimation can be challenging to keep up with.
The European Association for Biometrics held a workshop on age estimation, primarily biometric facial age estimation, this week to help stakeholders stay on top of the rapidly changing state of play. Richard Guest, professor at the University of Southampton, moderated the EAB presentation.
Historically, Guest noted, age estimation has relied on anthropology and skeletal analysis, dental analysis and other biological assessments.
Guest outlined areas for further research and development, including training data collection or creation, performance testing, PAD concerns and explainability.
Improving age estimation, improving evaluations
Ingenium Biometrics Laboratory Co-founder and CTO Chris Allgrove began the main presentation by examining the landscape and state of the art for evaluating age estimation technologies.
The pace of deployment of age estimation, and therefore the demand for evaluation of the systems’ effectiveness, is outpacing the methods and capacity for testing at this point, he says.
Allgrove described how biometric FAE systems tend to use deep learning algorithms to assess visual ageing indicators like skin texture, face shape and structural ratios.
But training algorithms to estimate an age or age range based on these characteristics takes hundreds of thousands of facial images, possibly more, and the composition of the training dataset is integral to system accuracy.
Compiling a good training dataset can pose a significant challenge, due to concerns around children’s data privacy in addition to the usual difficulties collecting the appropriate kinds and diversity of data.
For evaluating these technologies, Allgrove points out that understanding the context of the use case “directly drives how we evaluate it and the key things we care about in that evaluation.”
And there is no finished international standard enshrining criteria for effective age assurance. ISO/IEC 27566 Part 1 (framework) has been published, while Part 2 (technical approaches) and Part 3 (analysis and comparison) are at the committee draft stage, Allgrove says.
Performance is a matter of the accuracy of estimation, which is a continuous measure of estimation error (as expressed by Mean Absolute Error rate), rather than a pass or fail scenario. In addition to MAE, Mean Squared Error (MSE) rates can be measured to put more weight on large errors.
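As a rough illustration of how the two metrics differ, the sketch below computes MAE and MSE over a handful of true and estimated ages. The ages are invented for illustration and do not come from any evaluation discussed at the workshop, and the function names are hypothetical.

```python
def mean_absolute_error(true_ages, estimated_ages):
    """Average size of the estimation error, in years."""
    if not true_ages or len(true_ages) != len(estimated_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(e - t) for t, e in zip(true_ages, estimated_ages)]
    return sum(errors) / len(errors)


def mean_squared_error(true_ages, estimated_ages):
    """Average squared error, in years squared; large misses dominate."""
    if not true_ages or len(true_ages) != len(estimated_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [(e - t) ** 2 for t, e in zip(true_ages, estimated_ages)]
    return sum(errors) / len(errors)


# Illustrative data only: four small errors and one 8-year miss.
true_ages = [14, 17, 18, 25, 40]
estimated_ages = [16, 18, 17, 24, 48]

mae = mean_absolute_error(true_ages, estimated_ages)  # 2.6 years
mse = mean_squared_error(true_ages, estimated_ages)   # 14.2 years squared
```

In this example the single 8-year miss accounts for most of the MAE of 2.6 years, and it dominates the MSE of 14.2 even more strongly, which is why MSE is useful for surfacing large errors.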
It is also a matter of security against manipulation, which is different from traditional biometric Presentation Attack Detection (PAD) in that the attacker does not need to impersonate a particular individual. While PAD is often considered synonymous with liveness detection, algorithms checking for liveness will inevitably fail in most age estimation scenarios, as a person applying makeup to appear older is presenting a spoof attack, but also a live face. Testing effectiveness against other cosmetic changes, like botox injections, comes with additional complications.
The cohort of tested individuals must be aligned with the intended use case of the age estimation technology, Allgrove emphasizes. A use case centered on teens will not be well-served by an evaluation of the effectiveness of estimating the ages of middle-aged people, for example.
NIST’s Patrick Grother described how to understand the organization’s evaluations of FAE.
NIST runs the Face Analysis Technology Evaluation (FATE) benchmark for age estimation, which provides free, independent, regular and repeatable evaluations of accuracy, both in absolute terms and relative to other FAE vendors.
FATE: Age Estimation evaluates algorithms on millions of images, but for the most part, not selfies. Grother notes that this can be considered a limitation, but at the same time age estimation should not be reliant on a particular type of image or photographic circumstances.
Grother shared data on the impact that “nuisance factors,” like the subject’s activity, have on estimates. Interpersonal variation based on these factors can be quite significant, further complicating the task of comparing one system to another.
He also discussed how NIST handles variables like demographics. While skin tone is often used as a proxy for ethnicity, Grother says, there are many phenotypes that affect age estimation and are related to ethnicity.
The algorithms have generally been improving, and Grother sees potential for further accuracy improvements, as demonstrated by the results of a fusion technique in which the best FAE systems are combined.
Andrew Hammond of KJR shared an overview of Australia’s Age Assurance Technology Trial (AATT).
Testing was carried out from February to June of 2025 in a phased manner. Hammond noted that during testing in schools, a number of prospective participants opted out of the trial, but also that a significant portion of them ultimately participated, after watching their peers go through the early stages of the process.
An interdisciplinary issue
Dr. Eva Lievens of Ghent University presented a legal perspective on age estimation and so-called “social media bans.”
She reviewed the march of age restrictions for social media use around the world, including Australia’s policy implemented last December and progress on legislation in most European countries to align with Article 28 of the Digital Services Act.
The day after the presentation, the UK government announced a pilot of bans, time limits and digital curfews for social media use by teenagers in 300 homes to inform its national online safety consultation.
Meanwhile, evidence that social media use by teenagers is associated with anxiety and depression continues to mount. The Guardian reports that researchers at Imperial College London made the link, based on cognitive tests and questionnaires, between using social media for more than three hours a day and negative outcomes. A lack of sleep may be the causal link between the app use and mental health effects.
Eleanor Johnston of Ofcom spoke about the UK’s efforts to regulate social media in the concluding presentation of the workshop.
Lievens urged the community to take the views of children themselves into account as they balance the various concerns behind age assurance policies.
Estimation and verification
Cognitec Research Scientist Dr. Christopher Gaul discussed the difference between facial age estimation and using face biometrics to deliver a binary (yes or no) judgement.
Because age estimations come with a confidence range, the relevant metric for detecting underage users is the probability, implied by that range, that the user is below the threshold, not the estimated age as such, Gaul points out.
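One way to picture this is the minimal sketch below, which assumes, purely for illustration, that the estimator's error is normally distributed around the estimated age with a known standard deviation. Neither the Gaussian assumption nor the function name comes from Gaul's presentation.

```python
import math


def prob_below_threshold(estimated_age, sigma, threshold=18.0):
    """Probability that the true age is below `threshold`.

    Assumes (hypothetically) that true age ~ Normal(estimated_age, sigma).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (threshold - estimated_age) / sigma
    # Standard normal CDF expressed with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# An estimate of 21 with a 3-year standard deviation still leaves
# roughly a 16% chance that the user is under 18.
p_underage = prob_below_threshold(21.0, 3.0)
```

Under these assumptions, an estimated age three years above the threshold is far from conclusive, which is the sense in which the probability, rather than the point estimate, drives the decision.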
Cognitec’s age estimation delivers a higher rate of false underage detections than its binary over-18 detector, which Gaul refers to as age verification.
Further, age estimation algorithms show an unsurprising bias to the mean, with the age of younger people more often overestimated, and older people’s age underestimated. One implication is that if FAE is used alone, degrading the quality of the probe image will tend to push the algorithm towards the mean, providing a means for users to spoof the system by taking low-quality photos.
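The spoofing risk can be illustrated with a toy model, sketched below, in which the estimate is pulled linearly toward a population mean age as image quality drops. The linear form, the mean of 35 and the `reliability` parameter are all hypothetical and are not taken from Cognitec's data.

```python
def shrunk_estimate(true_age, reliability, population_mean=35.0):
    """Toy model of bias to the mean.

    reliability = 1.0 returns the true age; lower values (standing in for
    a lower-quality probe image) pull the estimate toward the mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return population_mean + reliability * (true_age - population_mean)


# A 15-year-old is overestimated more as the image degrades.
good_image = shrunk_estimate(15.0, 0.9)      # about 17: still below 18
degraded_image = shrunk_estimate(15.0, 0.8)  # about 19: now above 18
```

In this toy setting, a modest loss of reliability is enough to move a 15-year-old's estimate from below an 18-year threshold to above it.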
Taken together, the presentations describe a landscape of rapidly shifting legal and regulatory responsibilities combined with cutting-edge biometric technology development. Data for facial age estimation is rolling in, standards are coming and evaluations are getting more fine-grained. But with questions arising as quickly as they are answered, it is a field that will keep researchers and policy analysts busy for the foreseeable future.




