Microsoft’s Project Oxford offers facial, image and speech-recognition APIs
Earlier this week, Microsoft quietly released a handful of new machine-learning APIs in beta under its Project Oxford program, and its How-Old.net demo for the service went viral a day later, according to a report by TechCrunch.
How-Old.net lets anyone upload a photo of a face, after which the system automatically estimates the age of the person in the photo.
As TechCrunch explains, the website “works reasonably well” with a “fair number of mistakes” and uses some of the new developer services offered under Project Oxford.
The new APIs enable developers to integrate face detection and recognition capabilities into their apps. The service also attempts to estimate the age of the person in a photo and returns that information to developers.
Multiple divisions within Microsoft collaborated to develop Oxford and the age-detection project, according to Ryan Galgon, a senior program manager on the Oxford project.
Additionally, the face API detects faces in images, verifies whether two faces are, in fact, the same individual, and finds similar-looking faces.
Project Oxford also includes speech recognition capabilities, which will soon be able to help developers better understand their users’ intent. The project also features a vision API for automatically categorizing images and creating smart image crops that always put the subject in the center of the cropped image.
Microsoft will also add a fourth API, offered as a public beta, that enables developers to integrate custom language understanding capabilities into their applications.
The Speech API features speech-recognition services for speech-to-text conversion, a text-to-speech service that converts text into audio, and intent recognition that tries to understand the speaker’s intent, which is driven by the project’s Language Understanding Intelligent Service.
The image API enables developers to categorize images in order to filter out adult content, apply tags automatically, or organize images into clusters.
The API also offers optical character recognition, and it lets developers crop images automatically by determining the important parts of an image and keeping them in the center of the photo as it is cropped.
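The idea behind subject-centered cropping can be illustrated with a short sketch. This is not Microsoft's implementation or the Project Oxford API. The function name `centered_crop` and its parameters are hypothetical, and the subject's bounding box is assumed to come from some earlier detection step.

```python
def centered_crop(image_w, image_h, subject_box, crop_w, crop_h):
    """Return (left, top, right, bottom) for a crop_w x crop_h window.

    Hypothetical helper: keeps the subject's center as close to the
    crop's center as the image bounds allow.
    subject_box is (left, top, right, bottom) in pixels.
    """
    if crop_w > image_w or crop_h > image_h:
        raise ValueError("crop must fit inside the image")

    s_left, s_top, s_right, s_bottom = subject_box
    center_x = (s_left + s_right) / 2
    center_y = (s_top + s_bottom) / 2

    # Ideal top-left corner puts the subject center at the crop center.
    left = round(center_x - crop_w / 2)
    top = round(center_y - crop_h / 2)

    # Clamp so the crop window never leaves the image.
    left = max(0, min(left, image_w - crop_w))
    top = max(0, min(top, image_h - crop_h))

    return (left, top, left + crop_w, top + crop_h)


# A 1000x800 image with the subject detected at (600, 200)-(800, 400).
print(centered_crop(1000, 800, (600, 200, 800, 400), 400, 400))
# → (500, 100, 900, 500)
```

When the subject sits near an edge, the window is clamped to the image, so the subject ends up as close to the center as the frame permits rather than exactly centered.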
The service is currently free to use but Microsoft will eventually charge users for access, although the timeframe for this is still unclear.
Microsoft is currently offering a demo version that enables anyone to try out the service.