
Deepgram’s Nova-3 voice AI for enterprise enables linguistic twiddling

Automated STT transcription model gets persnickety with specialized language
San Francisco’s Deepgram has announced the launch of its new speech-to-text (STT) real-time transcription model, Nova-3, which a release calls “the industry’s first voice AI model to enable self-serve customization, allowing users to fine-tune the model for specialized domains without requiring deep expertise in machine learning.”

In allowing for easy, user-friendly customization, says the firm, Nova-3’s speech recognition “pushes the boundaries of AI-driven transcription, offering unmatched accuracy in challenging audio environments while offering flexible, self-service customization to tailor results for industry-specific needs.”

Nova-3 improves on its predecessor in accuracy and in performance under the adverse acoustic conditions found in real-world settings such as air traffic control, drive-thrus and call centers. With what Deepgram calls “domain-specific precision,” Nova-3 leverages an “advanced latent space architecture to encode complex speech patterns into a highly efficient representation.”

In practice, that means transcription should stay accurate even in noisy environments, because Nova-3’s machine learning algorithm compresses and distills speech data more efficiently.

The model’s linguistic chops include real-time multilingual support and recognition of specialized terminology in fields like medical and legal transcription. It also provides enhanced contextual information and analysis. Its data handling has been refined with more precise treatment of numerical values and real-time redaction of sensitive information for compliance and data privacy.

And with Keyterm Prompting, developers can improve accuracy by optimizing up to 100 key terms, making deployment more efficient and cost-effective.
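As a rough illustration of how a developer might pass key terms with a transcription request, the Python sketch below builds a request URL with one query parameter per term and enforces the 100-term ceiling mentioned above. The endpoint URL, the `model` and `keyterm` parameter names, and the helper `build_listen_url` are assumptions made for this example, not details given in the release; check them against Deepgram’s own API documentation before relying on them.

```python
from urllib.parse import urlencode

# Limit quoted in the article: up to 100 key terms.
MAX_KEYTERMS = 100


def build_listen_url(keyterms,
                     base_url="https://api.deepgram.com/v1/listen",  # assumed endpoint
                     model="nova-3"):                                 # assumed model id
    """Return a request URL carrying one `keyterm` parameter per term.

    Hypothetical helper: the parameter names are assumptions for illustration.
    """
    # Drop empty entries and surrounding whitespace before counting.
    terms = [t.strip() for t in keyterms if t.strip()]
    if len(terms) > MAX_KEYTERMS:
        raise ValueError(
            f"{len(terms)} key terms supplied; the limit is {MAX_KEYTERMS}"
        )
    # A list of pairs lets the same parameter name repeat in the query string.
    params = [("model", model)] + [("keyterm", t) for t in terms]
    return f"{base_url}?{urlencode(params)}"
```

A call such as `build_listen_url(["tretinoin", "air traffic control"])` yields a URL with two `keyterm` parameters; the audio and an API key would then be sent with whatever HTTP client the application already uses.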

“Nova-3 represents a significant leap forward, extending the frontier of real-time accuracy while once again bending the cost curve – two critical components for enterprise speech-to-speech use cases,” says Deepgram CEO Scott Stephenson. “By integrating advanced architectural enhancements and extensive training across diverse datasets, we’ve developed a model that not only meets but exceeds the evolving needs of our clients across various industries.”

Beyond STT, Deepgram’s platform offers text-to-speech (TTS) and full speech-to-speech (STS) capabilities through a suite of cloud or self-hosted APIs. Per the release, its high-performance runtime includes “powerful automation and data capabilities – such as synthetic data generation and model curation – along with model hot-swapping and robust integrations, empowering developers to efficiently build and scale voice-enabled applications.”

Deepgram backs up its PR with benchmarking bona fides for transcription accuracy. “Nova-3 outperforms competitors in both batch and streaming use cases, with consistently lower Word Error Rates (WER) that drive superior performance in real-world audio environments, including multilingual scenarios,” says the release.
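For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below computes it with a standard word-level edit distance. It is a minimal illustration of the general metric, not Deepgram’s benchmarking code, and the lowercasing and whitespace splitting are simplifying assumptions; published benchmarks typically apply more careful text normalization.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    # Simplifying assumption: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Scoring the transcript “turn left heading two seven” against the reference “turn left heading two seven zero” counts one deletion over six reference words, a WER of about 0.17. Because insertions are counted too, WER can exceed 1.0 for a transcript that is much longer than the reference.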

Nova-3’s multilingual feature, which is designed to allow firms to scale globally, outperforms OpenAI’s Whisper in tests across seven languages.
