Microsoft teases easy video deepfake tool, declines to release it

Generative AI video engine brings potential applications and risks
Microsoft is the latest tech giant to tease an AI product so good at producing deepfake humans that it poses a threat to real ones. In a striking demonstration of how quickly generative AI is advancing, VASA-1 can generate “hyper-realistic talking face video” from nothing but a single static image, an audio clip and a text script. A research paper from Microsoft says VASA-1 produces “lip movements that are exquisitely synchronized with the audio,” plus “a large spectrum of facial nuances and natural head motions that contribute to the perception of authenticity and liveliness.”

Dozens of accompanying video samples illustrate this capability, applied to both real humans and artificial faces (in one particularly jarring instance, Da Vinci’s Mona Lisa convincingly raps a verse by Anne Hathaway). Other demos showcase the AI’s ability to make faces sing, speak in different languages, and otherwise handle photo and audio inputs from outside the training set. Many of the videos are so realistic that most casual viewers would never think to question their authenticity.

If released to the public, VASA-1 would give just about anyone the ability to create deepfake videos from a single photo and a minimal amount of audio input. Microsoft appears to be well aware of this. Its release says its research “focuses on generating visual affective skills for virtual AI avatars, aiming for positive applications. It is not intended to create content that is used to mislead or deceive. However,” Microsoft concedes, “like other related content generation techniques, it could still potentially be misused for impersonating humans.”

“Given such context, we have no plans to release an online demo, API, product, additional implementation details, or any related offerings until we are certain that the technology will be used responsibly and in accordance with proper regulations.”

Microsoft’s caution belies enthusiasm for generative AI’s potential

As some observers have pointed out, creating a powerful video cloning tool and saying it should not be used to create deepfakes is a bit like inventing dynamite and saying it could be misused to blow things up. But Microsoft's announcement of VASA-1 and its capabilities is hardly apologetic in tone. Its own language makes clear how the company weighs AI’s risks against its benefits: “while acknowledging the possibility of misuse, it’s imperative to recognize the substantial positive potential of our technique,” it says. “The benefits – such as enhancing educational equity, improving accessibility for individuals with communication challenges, offering companionship or therapeutic support to those in need, among many others – underscore the importance of our research and other related explorations.”

Kevin Surace, chair of biometric authentication firm Token, agrees – to a point. “The implications for personalizing emails and other business mass communication is fabulous,” he says in an article in The Register. “Even animating older pictures as well. To some extent this is just fun and to another it has solid business applications we will all use in the coming months and years.”

Yet for the biometrics industry and its associated regulatory circles, the technology and the speed at which it is evolving also pose serious questions about the reliability of existing systems. Deepfakes generated using VASA-1 and other AI spoofing tools could be used to trick facial recognition systems.

One of VASA-1’s major advances is its ability to create faces with “appealing visual affective skills.” Visual affective skills (VAS) are what let us perceive and interpret emotions through visual stimuli, such as facial expressions and body language. For VASA-1, those skills are reversed to describe a fake video avatar’s ability to evoke emotion in a viewer. Per Microsoft, “the core innovations include a diffusion-based holistic facial dynamics and head movement generation model that works in a face latent space, and the development of such an expressive and disentangled face latent space using videos.”

In other words, the model starts from noise and iteratively refines it into detail, treating the motion of the whole face and head as a single unit rather than as disparate elements. It does this in a learned latent space whose expressive attributes are disentangled, meaning individual qualities of the motion can be represented and controlled independently.
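The diffusion idea at the heart of that description can be illustrated with a toy sketch: start from random noise in a small latent vector and repeatedly nudge it toward a conditioned prediction. Everything below (the latent size, the step count, the stand-in “denoiser” and audio feature) is an illustrative assumption, not Microsoft’s actual VASA-1 model.

```python
import math
import random

# Toy sketch of diffusion-style denoising in a latent space.
# All names and shapes here are hypothetical stand-ins.

random.seed(0)
LATENT_DIM = 8   # hypothetical size of the face-and-head motion latent
STEPS = 50       # number of denoising steps

def denoise_step(z, target):
    # One step: move the noisy latent a small way toward the model's
    # prediction (here a fixed stand-in vector derived from "audio").
    return [zi + (ti - zi) / STEPS for zi, ti in zip(z, target)]

# Stand-ins for an audio-conditioned prediction and an initial noise latent.
target = [math.tanh(random.gauss(0, 1)) for _ in range(LATENT_DIM)]
z = [random.gauss(0, 1) for _ in range(LATENT_DIM)]

dist0 = math.dist(z, target)
for _ in range(STEPS):
    z = denoise_step(z, target)

# After denoising, z encodes one frame of holistic head-and-face motion;
# a separate decoder (not shown) would render it to pixels.
print(math.dist(z, target) < dist0)  # the latent has moved toward the prediction
```

In a real system the fixed `target` would be replaced by a learned network's per-step prediction, conditioned on the input photo and audio clip, and the loop would run once per video frame.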

Regulating generative AI models could be very difficult

Writing for The Register, Thomas Claburn says VASA-1 is the kind of threat that has governments scrambling to enact regulations. “These AI-generated videos, in which people can be convincingly animated to speak scripted words in a cloned voice, are just the sort of thing the U.S. Federal Trade Commission warned about last month, after previously proposing a rule to prevent AI technology from being used for impersonation fraud,” writes Claburn.

For his part, Surace believes that, despite the wave of AI-focused laws popping up around the globe, regulatory measures may end up being merely decorative.

“Microsoft and others have held back for now until they work out the privacy and usage issues,” he says. “How will anyone regulate who uses this for the right reasons? Because of the open source nature of the space, regulating it will be impossible in any case.”
