Microsoft’s Project Oxford unveils API for speaker recognition
Microsoft recently added voice biometrics to its online suite of software development tools.
The software giant has released an API that lets programmers apply speaker recognition algorithms to recognize a person’s voice in audio streams. In effect, the tool allows developers of mobile and PC apps to add voice recognition capabilities for both speaker verification and speaker identification.
The speaker verification component of the tool can automatically verify and authenticate users by their voice. It is closely tied to authentication scenarios and is often associated with a passphrase. The tool uses a text-dependent approach, which means speakers must choose a specific passphrase and use it during both the enrollment and verification phases.
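To make the text-dependent flow concrete, here is a minimal Python sketch of enrollment followed by verification. It is not Microsoft’s API or algorithm: the function names `enroll` and `verify`, the plain feature vectors, the cosine-similarity comparison and the 0.8 threshold are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def enroll(passphrase, samples):
    """Hypothetical enrollment: store the passphrase and an averaged voice profile."""
    count = len(samples)
    profile = [sum(column) / count for column in zip(*samples)]
    return {"passphrase": passphrase.strip().lower(), "profile": profile}


def verify(enrollment, spoken_text, features, threshold=0.8):
    """Text-dependent check: the phrase must match AND the voice must be close enough."""
    if spoken_text.strip().lower() != enrollment["passphrase"]:
        return False
    return cosine(enrollment["profile"], features) >= threshold


# Made-up feature vectors standing in for real audio features (assumption).
user = enroll("my voice is my passport", [[1.0, 0.0, 0.2], [0.9, 0.1, 0.2]])
accepted = verify(user, "My voice is my passport", [1.0, 0.05, 0.2])
```

In this sketch a request is rejected either when the spoken phrase differs from the enrolled one or when the voice features fall below the similarity threshold, which mirrors the requirement that the same passphrase be used in both phases.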
The speaker identification component of the tool can automatically identify the person speaking from a given group of prospective speakers. The input audio is compared against the provided group of speakers, and if a match is found, the speaker’s identity is returned. It is text-independent, which means that there are no restrictions on what the speaker says during the enrollment and recognition phases.
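The one-against-many comparison described above can be sketched in a few lines of Python. Again, this is an illustration under stated assumptions rather than the service’s actual method: the `identify` function, the dictionary of candidate profiles, the cosine-similarity score and the 0.8 threshold are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(candidates, features, threshold=0.8):
    """Return the best-matching speaker ID, or None if nobody scores above the threshold."""
    best_id, best_score = None, threshold
    for speaker_id, profile in candidates.items():
        score = cosine(profile, features)
        if score >= best_score:
            best_id, best_score = speaker_id, score
    return best_id


# Made-up enrolled profiles for a group of prospective speakers (assumption).
group = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
match = identify(group, [0.9, 0.1, 0.0])
```

Because the comparison uses only voice features and never the spoken words, the sketch is text-independent in the same sense as the identification component: no passphrase is stored or checked.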
This “speaker recognition” API is part of Microsoft’s cloud-based “machine learning platform” called Project Oxford, which also features speech processing and face recognition tools, along with other tools that process video, language and even emotion through face recognition. Some of the APIs include the same technology used in popular Microsoft software products and services.
“Our goal with speaker recognition is to help developers build intelligent authentication mechanisms capable of balancing between convenience and fraud,” stated Ryan Galgon, Senior Program Manager, Microsoft Technology and Research, in a technical blog post published by Microsoft.
He notes that “Project Oxford is just one instance of a broader class of work the company is pursuing around artificial intelligence, along with a vision for more personal computing experiences and enhanced productivity, aided by systems that increasingly can see, hear, speak, understand and even begin to reason.”
Project Oxford debuted last May at Build 2015, a developers’ conference held in San Francisco. The project is designed to simplify the process of bringing complex and expensive technologies to mobile applications. To that end, it supports the development and distribution of apps for iOS and Android as well as various versions of Windows. The project’s APIs are initially being offered free in beta so that developers can evaluate the quality of the tools.