YITU develops highly accurate Mandarin speech recognition system
YITU Technology has developed a Mandarin speech recognition system it says provides an unprecedented degree of accuracy, leveraging its artificial intelligence expertise to achieve a character error rate (CER) of 3.71 percent, or 20 percent better than the industry’s previous best result.
The company says CER is equivalent to word error rate in speech recognition technology for English.
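For readers unfamiliar with the metric, CER is conventionally defined as the character-level edit distance between a reference transcript and the system's output, divided by the number of characters in the reference. The announcement does not say how YITU scored its system, so the following is only a minimal sketch of that conventional definition. The function names `edit_distance` and `cer` and the sample sentences are illustrative assumptions, not part of YITU's tooling.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    # prev[j] is the distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate, assuming the conventional definition."""
    # Assumption: spaces are ignored, since Mandarin text is not space-delimited
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one wrong character out of six
    print(f"{cer('今天天气很好', '今天天汽很好'):.2%}")
```

Under this definition, a CER of 3.71 percent corresponds to roughly 37 character errors for every 1,000 characters in the reference transcripts.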
YITU developed innovations in data collection and labeling, along with tools such as training systems and algorithm models, to achieve the high accuracy rate. The technology was tested with the world’s largest dataset of Mandarin speech, AISHELL-2, and scored top marks for accuracy, speed, and transcription ability in multiple scenarios combining English and Chinese, according to the announcement. The company’s researchers trained and refined the AI system’s capabilities with examples of common scenarios including phone calls, audio programs, and accented speech.
“Performances of currently available technologies on the market are mixed and can only fulfill a few basic features,” says YITU Chief Innovation Officer Lu Hao. “YITU’s goal in starting its own speech recognition research was to address challenges in this promising industry.”
YITU notes that speech recognition technology is widely regarded as the next crucial frontier for AI companies, because speech is expected to become the first or primary mode of human-machine interaction as machines play an increasing role in people’s lives. The company plans to expand its offerings with speech technology, and to continue investing in speech recognition research. The global market for speech recognition is forecast to reach $6.9 billion by 2025.
The facial recognition technology YITU is best known for currently sits atop the leaderboard in NIST’s ongoing Face Recognition Vendor Test (FRVT).