Baidu researchers compare voice cloning methods

 

Scientists with Baidu Research’s Deep Voice project have published a new study on the relative merits of “speaker adaptation” and “speaker encoding” as voice cloning methods.

“Neural Voice Cloning with a Few Samples” (PDF) suggests that the different strengths of the two methods make each one appropriate for certain applications.

In speaker adaptation, a multi-speaker generative model is fine-tuned by applying backpropagation-based optimization to several cloning samples. This method represents a speaker with fewer parameters, with the trade-offs of longer cloning time and lower audio quality.
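As a rough illustration of the fine-tuning idea, the toy sketch below freezes a small linear "model" and fits only a speaker embedding by gradient descent on a handful of samples. This is not Baidu's implementation: the linear map, the two-dimensional embedding, the noise level and every name in the code are assumptions chosen to keep the example self-contained.

```python
import random


def predict(weights, embedding):
    """Toy stand-in for a frozen multi-speaker generative model: a fixed linear map."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in weights]


def adapt_embedding(weights, samples, dim, steps=500, lr=0.05):
    """Fit only the speaker embedding by gradient descent on mean squared error."""
    embedding = [0.0] * dim
    for _ in range(steps):
        output = predict(weights, embedding)
        grad = [0.0] * dim
        for target in samples:
            for i, row in enumerate(weights):
                err = output[i] - target[i]
                for j in range(dim):
                    # Gradient of the squared error with respect to embedding[j].
                    grad[j] += 2.0 * err * row[j] / len(samples)
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding


rng = random.Random(0)
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical frozen model parameters
TRUE_EMBEDDING = [0.5, -1.0]  # hypothetical identity of the unseen speaker

# A few noisy "cloning samples" produced by the unseen speaker.
SAMPLES = [
    [y + rng.gauss(0.0, 0.01) for y in predict(WEIGHTS, TRUE_EMBEDDING)]
    for _ in range(5)
]

ADAPTED = adapt_embedding(WEIGHTS, SAMPLES, dim=2)
print(ADAPTED)  # should land close to TRUE_EMBEDDING
```

The iterative loop is what makes cloning slow relative to a single forward pass, while the only per-speaker state stored is the short embedding vector.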

Speaker encoding, in which a separate model is trained to directly infer a new speaker embedding, involves retrieving speaker identity information from each audio sample with “time-and-frequency-domain processing blocks.” This enables fast cloning with few parameters, which the researchers say makes it favorable for low-resource deployments.
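By contrast, an encoder produces the embedding in one forward pass with no per-speaker optimization. The sketch below is a deliberately simplified assumption, not the architecture from the paper: it averages each clip's feature frames over time, applies a fixed projection standing in for pre-trained weights, and averages the results across clips. The function name, the projection values and the feature layout are all hypothetical.

```python
def encode_speaker(clips, projection):
    """Infer one speaker embedding from several clips in a single forward pass."""
    per_clip = []
    for frames in clips:  # frames: list of feature vectors over time
        n_features = len(frames[0])
        # Pool over the time axis so clips of any length give a fixed-size vector.
        pooled = [sum(f[k] for f in frames) / len(frames) for k in range(n_features)]
        # Project the pooled features into the embedding space.
        per_clip.append([sum(w * x for w, x in zip(row, pooled)) for row in projection])
    dim = len(projection)
    # Combine the per-clip embeddings with a plain average.
    return [sum(e[d] for e in per_clip) / len(per_clip) for d in range(dim)]


PROJECTION = [[1.0, 0.0, -1.0], [0.0, 2.0, 0.0]]  # hypothetical pre-trained weights
CLIPS = [
    [[1.0, 0.5, 0.0], [3.0, 0.5, 0.0]],  # clip 1: two frames, three features each
    [[2.0, 1.5, 1.0]],                   # clip 2: one frame
]

EMBEDDING = encode_speaker(CLIPS, PROJECTION)
print(EMBEDDING)  # [1.5, 2.0]
```

Because nothing is optimized at cloning time, the cost is a fixed amount of arithmetic per clip, which is the property that suits constrained devices.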

The researchers expect voice cloning to be used for personalizing human-machine interactions. With voice authentication applications increasing in number and scale, voice cloning could also force those applications to supplement voice recognition with other methods and modalities, such as behavioral biometrics.
