
Baidu researchers compare voice cloning methods


Scientists with Baidu Research’s Deep Voice project have published a new study on the relative merits of “speaker adaptation” and “speaker encoding” as voice cloning methods.

“Neural Voice Cloning with a Few Samples” (PDF) suggests that the different strengths of the two methods make each one appropriate for certain applications.

In speaker adaptation, a multi-speaker generative model is fine-tuned by applying backpropagation-based optimization to several cloning samples. This method enables speaker representation with fewer parameters, with the trade-offs of longer cloning time and lower audio quality.
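As a rough illustration of the optimization idea, the toy sketch below fits a speaker embedding to a handful of cloning samples by gradient descent. It is not the researchers' code: the frozen linear map standing in for the multi-speaker model, the names `synthesize` and `adapt_embedding`, and all numeric values are assumptions made for the example, and only the embedding is updated here.

```python
# Toy illustration only: a small frozen linear map stands in for the
# multi-speaker generative model, and only the new speaker's embedding
# is optimized. All names and values are hypothetical.

def synthesize(weights, embedding):
    """Map a speaker embedding to a feature vector with a frozen linear model."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in weights]

def adapt_embedding(weights, samples, dim, steps=500, lr=0.05):
    """Fit a speaker embedding to cloning samples by gradient descent on squared error."""
    embedding = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for target in samples:
            output = synthesize(weights, embedding)
            for i, row in enumerate(weights):
                err = output[i] - target[i]
                for j in range(dim):
                    # Gradient of the squared error, averaged over samples.
                    grad[j] += 2.0 * err * row[j] / len(samples)
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding

FROZEN_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
TRUE_EMBEDDING = [0.5, -0.25]  # the "unknown" speaker we try to recover
cloning_samples = [synthesize(FROZEN_WEIGHTS, TRUE_EMBEDDING) for _ in range(3)]
learned = adapt_embedding(FROZEN_WEIGHTS, cloning_samples, dim=2)
```

The many optimization steps in the loop are what make this kind of approach slow to clone a voice, while the speaker itself is represented by only a small vector.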

Speaker encoding, in which a separate model is trained to directly infer a new speaker embedding, involves retrieving speaker identity information from each audio sample with “time-and-frequency-domain processing blocks.” This enables fast cloning time with a low number of parameters, which the researchers say makes it favorable for low-resource deployments.
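A minimal sketch of the encoding idea, under loudly stated assumptions, is shown below: a separately prepared encoder maps each audio sample to an embedding in a single pass, with no per-speaker optimization loop. The mean pooling and fixed linear projection are simple stand-ins, not the “time-and-frequency-domain processing blocks” the researchers describe, and `encode_speaker`, `ENCODER_WEIGHTS`, and the feature values are hypothetical.

```python
# Toy illustration only: the "encoder" is a fixed linear projection applied
# to time-averaged features. In practice it would be a trained model.

def mean_pool(vectors):
    """Average a list of equal-length vectors element by element."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def encode_speaker(encoder_weights, audio_samples):
    """Infer one speaker embedding from several samples in a single forward pass."""
    per_sample = []
    for frames in audio_samples:
        pooled = mean_pool(frames)  # summarize the sample over time
        per_sample.append(
            [sum(w * x for w, x in zip(row, pooled)) for row in encoder_weights]
        )
    return mean_pool(per_sample)  # combine the per-sample embeddings

ENCODER_WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]  # hypothetical "trained" weights
samples = [
    [[1.0, 2.0, 4.0], [3.0, 2.0, 0.0]],  # sample 1: two frames, three features each
    [[2.0, 4.0, 0.0]],                   # sample 2: one frame
]
embedding = encode_speaker(ENCODER_WEIGHTS, samples)
```

Because cloning here is a single function call rather than an optimization run, the cost per new speaker is small, which is the property that suits low-resource deployments.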

The researchers expect voice cloning to be used for personalizing human-machine interactions. With voice authentication applications increasing in number and scale, it could also force those applications to use other methods and modalities, such as behavioral biometrics, to supplement voice recognition.



