Some surprising questions, fewer insights after Deepfake Detection Challenge
Few if any unique or actionable insights have resulted from a lengthy contest created by Facebook to defang deepfake content, which threatens to erode societal trust in knowledge, information and legitimate authority.
It is not even clear how to interpret some of the more significant outcomes flowing from Facebook’s $1 million Deepfake Detection Challenge.
The best that can be said right now is that the winning detection software correctly distinguished real videos from deepfakes an average of just 65 percent of the time against a black-box data set. This data set was not available to entrants; their algorithms encountered unknown circumstances.
A public data set was distributed to entrants, who used it to train models on the circumstances it contained. The best any algorithm managed against the public data was an average of 82.56 percent, "a common accuracy measure for computer vision tasks," according to Facebook.
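Accuracy, as cited here, is simply the fraction of clips labeled correctly, whether real or fake. A minimal sketch of that calculation (the labels and predictions below are made up for illustration, not taken from the challenge):

```python
def accuracy(predictions, labels):
    """Fraction of videos classified correctly (real vs. deepfake)."""
    assert len(predictions) == len(labels)
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical ground truth and model output: 1 = deepfake, 0 = real.
truth = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
print(accuracy(preds, truth))  # 0.7 on this made-up sample
```

On a balanced set like the sketch above, accuracy is easy to read; note that a coin flip would score about 50 percent, which puts the winning 65 percent black-box figure in perspective.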
The top-ranking model was written by Selim Seferbekov, a computer vision engineer at Foundry Group-backed Mapbox, who lives in Belarus.
It is notable that the third-best black-box score came from NtechLab, a Russian facial recognition firm that has courted controversy throughout its existence, first by creating a dating app that encouraged users to photograph anyone in sight and match the pictures against social media databases. Today, it applies its AI expertise to scanning faces captured in real time by the tens of thousands of surveillance cameras in Moscow.
No. 6 was Konstantin Simonchik, co-founder of ID R&D, a venture-backed New York-based firm using biometric authentication for fraud prevention. Eighth on the list is, in fact, ID R&D.
The experiment began in December, run by the social media giant along with Amazon Web Services, Microsoft Corp. and the Partnership on AI, a nonprofit coalition advocating for reins on algorithms. Last week, Facebook executives began releasing some results. More are expected this week at the Computer Vision and Pattern Recognition conference.
Ultimately, 2,114 participants submitted 35,109 trained models to analyze 115,000 challenge videos. The original, unaltered experiment data consists of short performances by about 3,500 paid actors, comprising 38.5 days of footage.
A Fortune article discussing the contest's results points out that no one knows why some algorithms that performed well on the public data set could not match that success against the private data set.
One guess is that “there were probably subtle differences between the videos Facebook created for the competition and genuine deepfakes that the (contestants’) algorithms couldn’t handle,” according to the article.
The article also notes that no winning algorithms used common digital forensic methods in analyzing clips. Those methods include such basic techniques as looking for metadata and other indications that an image was, indeed, created by a camera.
Apparently, it is not known whether entrants dismissed those techniques as not worth including, or whether the entrants, who are among the best in machine learning, simply do not know about such basic tools.
There is a temptation to brush off concern about the challenge's vague outcomes as typical of the early days of a new software revolution.
But the quarry — highly realistic images and videos of synthetic people — is hardly older than the hunting tools. What is more, political forces in the U.S. and around the world seem to be working continuously to discredit all forms of information and knowledge.