Scanning for emotions coming out of the AI shadows
Emotion recognition might be the most concerning biometrics-based tool for some, but that is not stopping research into it.
The amount of R&D going into the automated scanning for emotions can only be inferred because much of it is done by companies and government surveillance agencies, which do not reliably report their efforts.
For the most part, AI vendors mention emotion recognition among other capabilities built into biometric recognition product suites. Agencies sometimes tip their hands with requests for proposals.
But this month, the technology was the focus of a New York Times feature story, two published research papers and analysis by a trade publication.
The Times did a roundup of related stats (20 percent of U.S. adults suffered mental illness in 2020) and research on AI voice analysis.
In the piece, Harvard Medical School assistant professor Kate Bentley goes on the record saying voice analysis is useful for mental health care workers. It can hear hints of illness that even a trained practitioner could miss.
“There’s a lot of excitement,” Bentley is quoted saying, about using AI to pick up objective clues.
Not as impressed as the vendors no doubt would have liked, the writer judges voice analysis for diagnosing (or even just indicating) affective disorders as “promising but unproven.”
Going deeper, a large team of researchers in Poland have written about their Emognition dataset, created to train algorithms capable of continuous affect assessments of patients with affective disorders and children suspected of experiencing autism spectrum disorder.
Scientists from Wroclaw University of Science and Technology and Adam Mickiewicz University see more uses for the specially trained algorithms, too.
Broad use could have a measurable impact on the mental health of large populations, if for no other reason than that more people could theoretically learn why they feel unwell and act on it. Human-computer interaction could become easier, too.
Perhaps less welcome for some might be Emognition-trained algorithms that aid in content and shopping recommendations.
The dataset contains upper-body and physiological-signal recordings of 43 participants who “watched validated emotionally arousing film clips” chosen to elicit nine emotions: sadness, fear, anger, disgust, liking, enthusiasm, surprise, awe and amusement.
The subjects were strapped with three wearables to record physiological responses, and a camera recorded their upper bodies. They completed two self-reports after each clip as well.
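As a rough illustration of what one such trial might look like in code, here is a minimal sketch of an Emognition-style record. The class and field names are hypothetical, not the dataset's actual schema; only the nine emotions, the 43 participants, the three wearables, the upper-body camera and the two self-reports come from the description above.

```python
from dataclasses import dataclass

# The nine target emotions named in the Emognition paper.
EMOTIONS = ["sadness", "fear", "anger", "disgust", "liking",
            "enthusiasm", "surprise", "awe", "amusement"]

@dataclass
class Trial:
    """Hypothetical record for one participant watching one film clip."""
    participant_id: int              # one of the 43 participants
    clip_emotion: str                # emotion the validated clip targets
    wearable_signals: dict           # e.g. three time series, one per wearable
    video_path: str                  # upper-body camera recording
    self_reports: tuple              # the two questionnaires after each clip

    def __post_init__(self):
        if self.clip_emotion not in EMOTIONS:
            raise ValueError(f"unknown target emotion: {self.clip_emotion}")
```

A training pipeline would iterate over records like this, aligning the wearable time series with the video frames and using the self-reports as labels.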
A second research project started with the premise that emotional intelligence is an important development for digital assistants such as Apple’s Siri and Amazon’s Alexa.
The work was done by scientists at the Japan Advanced Institute of Science and Technology and the Institute for Scientific and Industrial Research at the University of Osaka, according to reporting by publisher Israel’s Homeland Security (also referred to as iHLS).
They combined multimodal sentiment analysis with physiological-signal analysis to get at biological signs of emotion that people can sometimes keep off their faces. Researchers used the Hazumi1911 dataset, which combines speech recognition, voice color sensors, facial expressions, posture recognition and physiological response recognition.
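One common way to combine modalities like these is late fusion: score each modality separately, then average the scores. The sketch below assumes that approach and illustrative modality names; the article does not specify how the Hazumi1911 work actually fuses its signals.

```python
# Hypothetical late-fusion sketch: weighted average of per-modality
# emotion scores. Modality names and weights are illustrative only.
def fuse(scores_by_modality, weights=None):
    """Average emotion-probability dicts across modalities."""
    weights = weights or {m: 1.0 for m in scores_by_modality}
    total = sum(weights[m] for m in scores_by_modality)
    fused = {}
    for modality, scores in scores_by_modality.items():
        w = weights[modality] / total
        for emotion, p in scores.items():
            fused[emotion] = fused.get(emotion, 0.0) + w * p
    return fused

fused = fuse({
    "speech": {"positive": 0.9, "negative": 0.1},  # what the person says
    "physio": {"positive": 0.3, "negative": 0.7},  # what the body signals
})
# With equal weights, a cheerful voice over a stressed body averages out,
# which is the point: hidden feelings still move the final score.
```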
More emotion-linked information fed into human-machine interactions, the article contends, can strengthen connections between the two sides.
It can also make people better public speakers, according to reporting by tech publisher Unite.AI.
One of the problems plaguing people who speak before groups, particularly over video connections, apparently is the paucity of feedback from audiences.
Past attempts to provide that feedback have included electroencephalography and systems that monitor heart rates. Knowing an audience is engaged increases a speaker’s confidence; the lack of that knowledge is what shows up as empty filler words sprinkled into presentations.
More recent is research from the University of Tokyo and Carnegie Mellon University that integrates with Zoom and other videoconferencing services, according to Unite.AI.
In experiments, the scientists deployed gaze and pose estimation software and webcams. The system tracked a would-be audience member’s nods and eye movements. That information can be fed live to a speaker who, presumably, could change something to regain the room’s attention.
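A feedback loop of that kind might reduce the tracker's output to a single number the speaker can glance at. The sketch below is a hypothetical simplification, assuming per-frame gaze flags and a nod count from the gaze and pose estimators; the actual system's scoring is not described in the article.

```python
# Hypothetical engagement score from gaze/pose tracking output.
# The 0-to-1 scale and the per-nod bonus are illustrative choices.
def engagement_score(gaze_on_screen, nods, nod_weight=0.1):
    """Fraction of frames where the viewer's gaze is on screen,
    plus a small bonus per detected nod, capped at 1.0."""
    if not gaze_on_screen:
        return 0.0
    gaze_ratio = sum(gaze_on_screen) / len(gaze_on_screen)
    return min(1.0, gaze_ratio + nod_weight * nods)
```

Fed to the speaker live, a falling score would be the cue to change something and regain the room's attention.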