KSV granted patent on voice biometrics in various recording conditions by USPTO
A patent filing from Kaizen Secure Voiz (KSV) has revealed the development of a system that makes voice biometrics models tolerant of differing audio capture conditions, with the aim of improving performance.
The U.S. Patent and Trademark Office (USPTO) has published KSV’s filing for intellectual property protection for ‘Speaker recognition using domain independent embedding’, which describes augmenting raw speech signals with acoustic representations, and then determining “a plurality of Mel frequency cepstral coefficients for each of the plurality of augmented speech signals.” Domain-dependent transformations would then be used to generate acoustic vectors from the MFCCs, which are stacked and processed by a neural network for biometric speaker recognition.
Mel frequency cepstral coefficients, or MFCCs, can be understood roughly as representations of features extracted from audio signals.
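For readers who want a concrete picture of the front end described above, the following is a minimal sketch, not KSV's implementation. It augments one signal with additive noise, computes MFCCs for each copy, and stacks the results. The noise-based augmentation, the synthetic tone standing in for speech, and every function name and parameter value are assumptions made for illustration. The domain-dependent transformations and the neural network from the filing are not reproduced.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters spaced evenly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            bank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            bank[i - 1, k] = (right - k) / max(right - center, 1)
    return bank

def dct_ii(x, n_out):
    # Unnormalized type-II DCT along the last axis, keeping n_out coefficients.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # Frame the signal, window it, and take the power spectrum of each frame.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Mel filterbank energies -> log -> DCT gives the cepstral coefficients.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_coeffs)

def augment(signal, rng, snrs_db=(20.0, 10.0)):
    # Hypothetical augmentation: the clean signal plus noisy copies at given SNRs.
    out = [signal]
    power = np.mean(signal ** 2)
    for snr in snrs_db:
        noise_std = np.sqrt(power / (10.0 ** (snr / 10.0)))
        out.append(signal + rng.normal(0.0, noise_std, size=signal.shape))
    return out

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = np.sin(2 * np.pi * 220.0 * t)  # one second of a synthetic tone, not real speech

# One MFCC matrix per augmented copy, stacked into a single array.
features = np.stack([mfcc(s) for s in augment(speech, rng)])
print(features.shape)  # (copies, frames, coefficients)
```

In a real system the stacked array would feed the later stages the filing describes. Here it only shows how a single recording becomes several MFCC matrices under different simulated capture conditions.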
The invention is intended to address the mismatch between the conditions under which speech recognition systems are trained and those encountered in biometric testing or production.
Domain adaptation and feature normalization are the two established approaches to dealing with varying conditions of voice biometric data collection, but KSV’s proposal is based on “Domain Independent Embeddings,” or DIEs, which the inventors say are a superior form of feature transformation to the existing known art of i-vector or x-vector embeddings. DIEs are obtained from a multi-speaker biometric training dataset used to train the model.
The patent is the second for KSV.
KSV, which has offices in New Jersey, U.S. and Chennai, India, was recently honored for its frictionless customer authentication with voice biometrics at the DX Awards 2021.