Microsoft teases easy video deepfake tool, declines to release it

Generative AI video engine brings potential applications and risks

Microsoft is the latest tech giant to tease an AI product so good at producing deepfake humans that it poses a threat to real ones. In a striking demonstration of how quickly generative AI is advancing, VASA-1 can generate “hyper-realistic talking face video” from nothing but a single static image, an audio clip and a text script. A research paper from Microsoft says VASA-1 produces “lip movements that are exquisitely synchronized with the audio,” plus “a large spectrum of facial nuances and natural head motions that contribute to the perception of authenticity and liveliness.”

Dozens of accompanying video samples illustrate this capability, applied to both real humans and artificial faces (in one particularly jarring instance, Da Vinci’s Mona Lisa convincingly raps a verse by Anne Hathaway). Other demos showcase the AI’s ability to make faces sing, speak in different languages, and otherwise handle photo and audio inputs from outside the training set. Many of the videos are so realistic that most casual viewers would never think to question their authenticity.

If released to the public, VASA-1 would give just about anyone the ability to create deepfake videos with a single photo and a minimal amount of audio input. Microsoft acknowledges the risk. Its release says the research “focuses on generating visual affective skills for virtual AI avatars, aiming for positive applications. It is not intended to create content that is used to mislead or deceive. However,” Microsoft concedes, “like other related content generation techniques, it could still potentially be misused for impersonating humans.”

“Given such context, we have no plans to release an online demo, API, product, additional implementation details, or any related offerings until we are certain that the technology will be used responsibly and in accordance with proper regulations.”

Microsoft’s caution masks enthusiasm for generative AI’s potential

As pointed out by some observers, creating a powerful video cloning tool and saying it should not be used to create deepfakes is a bit like inventing dynamite and saying it could be misused to blow things up. Microsoft’s intention in announcing VASA-1 and outlining its capabilities is surely not to apologize. Its own language makes clear how the company weighs AI’s risks against its benefits: “while acknowledging the possibility of misuse, it’s imperative to recognize the substantial positive potential of our technique,” it says. “The benefits – such as enhancing educational equity, improving accessibility for individuals with communication challenges, offering companionship or therapeutic support to those in need, among many others – underscore the importance of our research and other related explorations.”

Kevin Surace, chair of biometric authentication firm Token, agrees – to a point. “The implications for personalizing emails and other business mass communication is fabulous,” he says in an article in The Register. “Even animating older pictures as well. To some extent this is just fun and to another it has solid business applications we will all use in the coming months and years.”

Yet for the biometrics industry and its associated regulatory circles, the technology and the speed at which it is evolving also pose serious questions about the reliability of existing systems. Deepfakes generated using VASA-1 and other AI spoofing tools could be used to trick facial recognition systems.

One of VASA-1’s major advances is its ability to create faces with “appealing visual affective skills.” Visual affective skills (VAS) are what let us perceive and interpret emotions through visual stimuli, such as facial expressions and body language. For VASA-1, the term is turned around to describe a fake video avatar’s ability to evoke emotion in a viewer. Per Microsoft, “the core innovations include a diffusion-based holistic facial dynamics and head movement generation model that works in a face latent space, and the development of such an expressive and disentangled face latent space using videos.”

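In other words, the model generates motion by starting from random noise and refining it step by step, which is the diffusion process, and it treats the movement of the whole face and head as a single unit rather than as disparate elements. It does this in a latent space, a compact learned representation of the face in which separate factors such as identity and motion are kept apart, rather than working directly on pixels.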
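For readers who want a concrete picture, the toy Python sketch below illustrates the two ideas in Microsoft's description: refining a motion vector step by step from random noise, and keeping the identity taken from the photo separate from the motion driven by the audio. It is not Microsoft's code or method. Every name in it (toy_denoiser, sample_motion_latent, compose_frame) is hypothetical, and the "denoiser" is a trivial stand-in for what would be a trained neural network in a real system.

```python
import random


def toy_denoiser(latent, audio_features):
    # Hypothetical stand-in for a trained network. It simply "predicts" that
    # the clean motion latent equals the audio features; a real model would
    # learn this mapping from video data.
    return list(audio_features)


def sample_motion_latent(audio_features, steps=50, step_size=0.2, seed=0):
    # Start from pure Gaussian noise, then nudge the latent a little closer
    # to the denoiser's estimate on every step.
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in audio_features]
    for _ in range(steps):
        estimate = toy_denoiser(latent, audio_features)
        latent = [x + step_size * (e - x) for x, e in zip(latent, estimate)]
    return latent


def compose_frame(identity_latent, motion_latent):
    # "Disentangled" here means identity and motion live in separate slots,
    # so one can change while the other stays fixed.
    return {"identity": tuple(identity_latent), "motion": tuple(motion_latent)}


# Assumed inputs: one identity vector (from the single photo) and one small
# feature vector per audio window. The values are made up for illustration.
identity = [0.3, -1.2, 0.8]
audio_clip = [[0.1, 0.5], [0.4, -0.2], [0.9, 0.0]]

frames = [
    compose_frame(identity, sample_motion_latent(features, seed=i))
    for i, features in enumerate(audio_clip)
]
```

Each entry in frames carries the same identity but a different motion vector, which is the sense in which a single photo can be animated by changing only the audio-driven part of the representation.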

Regulating generative AI models could be very difficult

Writing for The Register, Thomas Claburn says VASA-1 is the kind of threat that has governments scrambling to enact regulations. “These AI-generated videos, in which people can be convincingly animated to speak scripted words in a cloned voice, are just the sort of thing the U.S. Federal Trade Commission warned about last month, after previously proposing a rule to prevent AI technology from being used for impersonation fraud,” writes Claburn.

For his part, Surace believes that, despite the wave of AI-focused laws popping up around the globe, regulatory measures may end up being merely decorative.

“Microsoft and others have held back for now until they work out the privacy and usage issues,” he says. “How will anyone regulate who uses this for the right reasons? Because of the open source nature of the space, regulating it will be impossible in any case.”
