OpenAI says its voice cloning tool is too effective for public release

In the original story of the genie in a bottle from One Thousand and One Nights, the genie threatens to kill the fisherman who freed him – a tale that seems to be resonating with OpenAI, as it continues to pursue advanced voice cloning and synthetic audio and video tools that it says come with major risks.

In a blog post, the company says results of testing show that its Voice Engine is so good at deepfake voice cloning and synthetic audio that it will almost certainly be misused on wide release, prompting the ChatGPT maker to hold back on setting the product loose until it establishes stronger rules and guidelines for deployment.

Developed in 2022, Voice Engine is an update on tech already used in OpenAI's text-to-speech API and the conversation mode of ChatGPT. The blog says Voice Engine "uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker. It is notable that a small model with a single 15-second sample can create emotive and realistic voices." The company has not disclosed the source of the emotionally rich data used to train Voice Engine, but told TechCrunch that the model "was trained on a mix of licensed and publicly available data."

Beginning, perhaps, to understand the full-scale implications of a free, easily accessible tool that can recreate the realistic voice of anyone from whom it has a 15-second sample, the company says it is now “taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse.”

“We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities,” says the blog post. “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

According to a report from Ars Technica, the current terms and conditions for companies testing Voice Engine prohibit the impersonation of an individual or organization "without consent or legal right." They mandate clear disclosure of the use of AI to clone voices, and informed consent from anyone whose voice is being cloned. In addition, OpenAI embeds watermarks to make it easier to identify audio produced with Voice Engine.

Nonetheless, the company makes clear its belief that stopping the generative AI speed train is not an option, and that it is up to society to change with the times. “We hope this preview of Voice Engine both underscores its potential and also motivates the need to bolster societal resilience against the challenges brought by ever more convincing generative models,” says its post. To start, it suggests phasing out voice authentication as a means of ID verification for banking and other sensitive use cases, increasing public education on AI, “exploring policies to protect the use of individuals’ voices in AI” and accelerating the development of liveness detection, watermarking and other tools to distinguish real voices from synthetic cloned audio.

It is worth noting that OpenAI has rung this particular bell before, issuing similar warnings that its facial recognition capabilities and its text-to-video tool Sora are so effective they stand to transform the world.

