Research paper reveals deepfake technique that can deceive presentation attack detection tools
According to the study led by Shehzeen Hussain, a UC San Diego computer engineering Ph.D. student, PAD can be defeated by inserting slightly manipulated inputs called adversarial examples into every video frame.
This causes artificial intelligence systems to make mistakes, even when the adversary does not know the inner workings of the machine learning model used by the detector.
In fact, the attacks’ reported success rate in these experiments exceeded 99 percent for uncompressed videos and reached 84.96 percent for compressed videos in a scenario where the attackers had complete access to the detector model.
Even in experiments where attackers could only query the machine learning model, the attacks’ success rates remained consistently high: 86.43 percent for uncompressed videos and 78.33 percent for compressed videos.
Deepfake detectors focus on faces in videos, analyzing biometrics and other elements of the footage that are traditionally considered the easiest giveaways to spot, such as unnatural blinking. They then try to remove or unmask an attack through compression and resizing techniques.
The newly developed adversarial examples, created for every face in a video frame, are nevertheless resilient to compression and resizing operations, and can also be applied to detectors that operate on entire video frames rather than just face crops.
The attack algorithm bypasses these operations by automatically estimating, over a set of input transformations, how the model ranks images as real or fake. It then uses this estimate to modify images so that the adversarial perturbation remains effective even after compression and decompression.
Finally, the modified version of the face is inserted in every video frame to create a PAD-resilient deepfake video.
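The idea of estimating a detector's response over a set of input transformations can be illustrated with a minimal sketch. This is not the team's implementation, which was not released. Everything here is a hypothetical stand-in: the toy linear "detector" (`W`, `B`, `fake_logit`), the three simple transformations (shift, brightness change, noise) standing in for compression and resizing, and every function and parameter name. The sketch takes small signed-gradient steps using a gradient averaged over randomly sampled transformations, a general approach often called expectation over transformation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in detector: a fixed linear model over an 8x8 "face crop".
# A higher logit means "more likely fake". A real detector is a deep network.
W = rng.normal(size=(8, 8))
B = 0.0


def fake_logit(x):
    """Score of the toy detector for one image."""
    return float(np.sum(W * x) + B)


def sample_transform():
    """Draw one random transformation: horizontal shift, brightness, noise."""
    shift = int(rng.integers(-1, 2))             # -1, 0 or +1 pixels
    scale = float(rng.uniform(0.9, 1.1))         # brightness factor
    noise = rng.normal(scale=0.01, size=(8, 8))  # additive noise
    return shift, scale, noise


def apply_transform(x, t):
    shift, scale, noise = t
    return scale * np.roll(x, shift, axis=1) + noise


def grad_through_transform(t):
    """Gradient of fake_logit(apply_transform(x, t)) with respect to x.

    For this linear toy model the gradient does not depend on x; with a
    real network it would be obtained by backpropagation instead.
    """
    shift, scale, _ = t
    return scale * np.roll(W, -shift, axis=1)


def robust_attack(x, eps=0.1, step=0.01, iters=40, samples=16):
    """Lower the expected fake logit while staying within eps of x per pixel."""
    x_adv = x.copy()
    for _ in range(iters):
        # Average the gradient over freshly sampled transformations.
        g = np.mean(
            [grad_through_transform(sample_transform()) for _ in range(samples)],
            axis=0,
        )
        x_adv = x_adv - step * np.sign(g)         # signed-gradient step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # keep the change small
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep valid pixel values
    return x_adv


def expected_fake_logit(x, transforms):
    """Average detector score of x over a fixed list of transformations."""
    return float(np.mean([fake_logit(apply_transform(x, t)) for t in transforms]))


# Usage: compare the average score before and after the attack.
frame = np.random.default_rng(1).uniform(0.2, 0.8, size=(8, 8))
adversarial = robust_attack(frame)
held_out = [sample_transform() for _ in range(200)]
print("before:", expected_fake_logit(frame, held_out))
print("after: ", expected_fake_logit(adversarial, held_out))
```

Because the perturbation is optimized against the average over many transformations rather than against a single untransformed image, it tends to keep lowering the toy detector's score after a fresh transformation is applied, which is the property the article describes for compression and resizing.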
Hussain’s team did not release the source code behind the new technique, to prevent it from being used by malicious actors.