ByteDance releases new generative AI model OmniHuman

Chinese tech company ByteDance has developed a generative AI framework, called OmniHuman-1, that can create highly realistic videos of a human based on a single image and a motion signal such as audio or video.

ByteDance’s researchers demonstrated the technology by generating several realistic human videos, including of Albert Einstein and Nvidia CEO Jensen Huang. The videos show humans talking and singing in challenging body positions, including gesturing with their hands, and in different aspect ratios such as portrait, half-body and full-body. The system can also animate cartoons.

The company behind TikTok says the framework beats existing technologies, which still struggle to scale beyond animating faces or upper bodies, limiting their potential in real applications. OmniHuman outperforms existing methods because it can generate extremely realistic human videos from weak signal inputs, especially audio, according to a research paper published by the company.

“In OmniHuman, we introduce a multimodality motion conditioning mixed training strategy, allowing the model to benefit from data scaling up of mixed conditioning,” the researchers write. “This overcomes the issue that previous end-to-end approaches faced due to the scarcity of high-quality data.”

The researchers relied on more than 18,000 hours of human-related data for training the framework, allowing it to learn from text, audio, and body movements. This resulted in more natural-looking human videos.

“Our key insight is that incorporating multiple conditioning signals, such as text, audio and pose, during training can significantly reduce data wastage,” says the paper.

The system initially handles each input type independently, condensing movement details from text descriptions, reference images, audio signals and movement data into a compact format. It then progressively enhances this data into realistic video output, refining motion generation by comparing its results with real videos.
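The pipeline described above, in which each modality is encoded independently into a compact representation and then progressively refined into video, resembles a conditional iterative-refinement setup. A minimal sketch of that idea follows; every function and the linear "encoders" here are illustrative assumptions for exposition, not ByteDance's actual OmniHuman code:

```python
import numpy as np

# Hypothetical sketch of the multi-condition pipeline described above.
# The stand-in encoders and the refinement loop are assumptions,
# not ByteDance's actual implementation.

def encode_condition(signal: np.ndarray, dim: int = 64) -> np.ndarray:
    """Condense a raw conditioning signal (text/audio/pose features)
    into a fixed-size latent vector via a stand-in linear projection."""
    rng = np.random.default_rng(0)  # fixed weights for the sketch
    proj = rng.standard_normal((dim, signal.size))
    return proj @ signal.ravel()

def refine(condition: np.ndarray, steps: int = 10) -> np.ndarray:
    """Start from noise and progressively pull the video latent toward
    the conditioning latent, mimicking iterative refinement."""
    rng = np.random.default_rng(1)
    video = rng.standard_normal(condition.shape)  # start from noise
    for t in range(steps):
        alpha = (t + 1) / steps  # move further toward the target each step
        video = (1 - alpha) * video + alpha * condition
    return video

# Each modality is handled independently, then fused into one compact condition.
text = np.ones(32)   # placeholder text features
audio = np.ones(48)  # placeholder audio features
pose = np.ones(16)   # placeholder pose features
condition = np.mean([encode_condition(s) for s in (text, audio, pose)], axis=0)

out = refine(condition)
```

In a real diffusion-style model the refinement step would be a learned denoiser compared against real videos during training, as the paper describes; the linear interpolation here only illustrates the progressive-enhancement structure.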

ByteDance has been investing in AI video generation, rivaling firms such as Meta, Microsoft and Google DeepMind. In January, the company released an upgrade to its AI model Doubao, claiming it outperforms OpenAI’s o1 on the AIME benchmark test.
