Date: 2026-02-25

A research collaboration between PXL Vision, the Idiap Research Institute and Swiss innovation agency Innosuisse has led to the identity verification provider launching a new deepfake detector to spot fake or manipulated faces in identity documents.

The detection work was conducted as part of the project ROSALIND (106.729 IP-ICT), funded by Innosuisse in collaboration with PXL Vision. From PXL’s perspective, the collaboration has resulted in advanced deepfake detection capabilities designed specifically for automated IDV, which have already been fully integrated into the PXL Ident platform.

The new capabilities enable face swapping and face reenactment detection, as well as detection of fully synthetic identities created using generative AI. In a LinkedIn post, the firm says it aims to “increase resilience against AI-driven identity attacks and provide greater confidence in compliance-critical onboarding processes, without creating additional friction for users.”

In a press release, Idiap’s Sébastien Marcel says “this achievement reflects Idiap’s long-standing expertise in biometrics, AI security and trusted digital identity. Our collaboration with PXL Vision has been both productive and inspiring.”

“While our experience with identity documents was limited, we were able to build on PXL’s strong expertise in this area. The result achieved has exceeded our initial expectation, and we look forward to continuing to push the boundaries of robust, privacy-compliant identity verification in the future.”

Last year Idiap and PXL Vision hosted a competition that challenged participants to detect injection attacks in identity documents.

Detecting deepfaked text on ID documents

Another outcome from the research is a new paper from the Biometrics Security & Privacy group at Idiap looking at “Detecting Text Manipulation in Images using Vision Language Models.”

“Recent works have shown the effectiveness of Large Vision Language Models (VLMs or LVLMs) in image manipulation detection,” says the abstract. “However, text manipulation detection is largely missing in these studies.” The research aims to close that gap by evaluating both closed-source and open-source VLMs on different text manipulation datasets.

Rapid progress in the quality of AI-assisted image generation means manipulated images have become more difficult to visually identify, says the paper, “especially when small but semantically critical regions, such as text, are modified. Detecting such subtle changes is challenging and current image forgery detection methods often overlook the manipulated text regions.”

The authors of the paper, Vidit Vidit, Pavel Korshunov, Amir Mohammadi, Christophe Ecabert, Ketan Kotwal and Sébastien Marcel, find that “open-source models are getting closer, but still behind closed-source ones like GPT4o.”

The research benchmarks image manipulation detection algorithms against OSTF and FantasyID datasets, and concludes that “GPT-4o is much better than open source models. Qwen-2.5 is the best performing open source model. Specialised VLMs like FakeShield and SIDA fail to generalize to the new task of text manipulation detection.” This latter issue is labeled “the generalization problem” and defined as follows: “In their training and evaluation datasets,” VLMs “rarely see text manipulation, making them biased towards artifacts of the scene manipulations.”

In other words, detectors are paying attention to the pictures, but ignoring the words.
