New generation of Phonexia DNN voice biometric engine offers major performance improvements
Phonexia has announced the launch of the fourth generation of its voice biometric software, which cuts the time required for identification by nearly a third while scoring accuracy above 99 percent in testing against the NIST SRE dataset.
The company says its Deep Embeddings for Speaker Identification, now available in production, is the first commercially available voice biometrics engine based exclusively on deep neural networks (DNNs). The fourth generation combines DNNs with more robust speaker models to deliver major performance improvements over the third generation. That previous generation, launched a year ago, scored an Equal Error Rate (EER) of 1.24 percent against the NIST SRE dataset, which Phonexia says placed it among the most accurate voice biometric engines on the market. The new version has been measured at an EER of 0.96 percent. On a client dataset, the third generation's EER was 5.61 percent, while the new version's was 2.35 percent.
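For readers unfamiliar with the metric, the EER is the error rate at the decision threshold where false acceptances (impostors let through) and false rejections (genuine speakers turned away) occur equally often. The following is a minimal sketch of how it can be estimated from a set of trial scores. It is illustrative only: the function name and the toy scores are made up for this example, and it is not Phonexia's evaluation code or the NIST SRE scoring protocol.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER from similarity scores (higher = more likely same speaker).

    Returns (eer, threshold) at the candidate threshold where the false
    acceptance rate and false rejection rate are closest to each other.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False rejection: a genuine trial scores below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates as the EER estimate.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical toy scores, not taken from any real evaluation.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

A lower EER means the genuine and impostor score distributions overlap less, which is why the drop from 1.24 to 0.96 percent is reported as an accuracy gain.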
Additionally, the time necessary for identification has been reduced from 10 to 7 seconds, while enrollment time has been reduced from 35 to 20 seconds.
The new technology also performs better when identification takes place over a different channel than the one used for enrollment, as well as in cross-language scenarios, the company says. Processing speed has improved from 5 times faster than real time to 20 times faster than real time, even as RAM consumption has fallen from 2.07 GB to 0.08 GB.
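To put the reported figures on a common scale, the short calculation below converts each before-and-after pair quoted in this article into a relative reduction. It is only arithmetic on the published numbers; the helper name `relative_reduction` and the dictionary labels are chosen here for illustration.

```python
def relative_reduction(before, after):
    """Fraction by which a value dropped, relative to its starting point."""
    return (before - after) / before


# (third generation, fourth generation) pairs as quoted in the article.
reported = {
    "EER on NIST SRE (percent)": (1.24, 0.96),
    "EER on client dataset (percent)": (5.61, 2.35),
    "identification time (seconds)": (10, 7),
    "enrollment time (seconds)": (35, 20),
    "RAM consumption (GB)": (2.07, 0.08),
}

reductions = {name: relative_reduction(b, a) for name, (b, a) in reported.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1%} lower")

# Processing speed is quoted as a multiple of real time, so compare as a ratio.
speedup = 20 / 5
print(f"processing speed: {speedup:.0f}x the previous generation")
```

By this measure, the identification time falls by 30 percent and the memory footprint by roughly 96 percent, while throughput is four times that of the previous generation.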
Phonexia plans to market Deep Embeddings for use in criminal investigations, financial services, virtual personal assistants, smart homes, IoT, automotive and industrial applications, and embedded devices.
Pindrop forecasts that the share of businesses using voice technology to interact with customers will reach 85 percent this year.