Voice biometrics software outperforms humans in courtroom setting
Forensic voice-comparison software based on speaker recognition algorithms has outperformed all listeners in a test conducted by Aston University researchers.
The findings of the biometrics research, contained in a paper titled ‘Speaker identification in courtroom contexts — Part I’, were recently published in the journal Forensic Science International.
The team behind the paper pitted 226 listeners against a forensic voice comparison system, the goal being to decide whether the speakers in two recordings were the same person. The first recording was a telephone call with background office noise, and the second was a recording of a police officer interviewing a suspect in an echoey room with background ventilation noise.
Some listeners were familiar with the language and accent spoken in the recordings, some were familiar with the language but were less familiar with the accent and others were less familiar with the language.
The research also reflects different courtroom contexts. In one, subjects made judgments based only on listening to the recordings; in another, they made judgments after listening to the recordings and considering the likelihood-ratio value output by the forensic voice comparison system.
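For readers unfamiliar with the likelihood-ratio framework used in forensic voice comparison, the idea is that the system reports how much more probable the voice evidence is if the two recordings come from the same speaker than if they come from different speakers; a trier of fact can then combine that value with their prior odds via Bayes' rule. The following minimal sketch (an illustration of the general framework, not the study's actual system) shows that combination:

```python
# Illustrative sketch of the likelihood-ratio framework (not the
# study's actual software). The system reports a likelihood ratio:
# how much more probable the voice evidence is under the
# same-speaker hypothesis than under the different-speaker one.

def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Combine prior odds with a forensic likelihood ratio (Bayes' rule in odds form)."""
    return prior_odds * likelihood_ratio

# For example, even prior odds (1.0) combined with a reported
# likelihood ratio of 50 in favour of same-speaker gives posterior
# odds of 50 to 1 that the recordings share a speaker.
print(posterior_odds(1.0, 50.0))  # 50.0
```

The key point is that the system does not itself decide identity; it quantifies the strength of the evidence, leaving the final decision to the court.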
According to the scientists, the software performed better than the listeners.
“Past experiences where we have successfully recognized familiar speakers, such as family members or friends, can lead us to believe that we are better at identifying unfamiliar voices than we really are,” says contributing author Kristy Martire, with the School of Psychology at the University of New South Wales.
“This study shows that whatever ability a listener may have in recognizing familiar speakers, their ability to identify unfamiliar speakers is unlikely to be better than a forensic voice comparison system.”
This is important because expert testimony is only admissible in common law if it will assist the trier of fact in making a decision that they would not be able to make unaided.
“A few years ago, when I was testifying in a court case, I was asked by a lawyer why the judge couldn’t just listen to the recordings and make a decision. Wouldn’t the judge do better than the forensic voice comparison system that I had used?” says corresponding author Geoffrey Stewart Morrison, director of the Forensic Data Science Laboratory at Aston University.
Morrison says he expected the algorithm to best some subjects, “but I was surprised when it actually performed better than all of them. I’m happy that we now have such a clear answer to the question asked by the lawyer.”
A judge or jury’s own speaker identification would thus be less accurate than a forensic scientist’s voice biometrics system, which argues for admitting the system’s results in court.
“The unequivocal scientific finding is that identification of unfamiliar speakers by listeners is unexpectedly difficult and much more error-prone than judges and others have appreciated,” warns contributing author Gary Edmond, a professor at the School of Law at the University of New South Wales.
“We should not encourage or enable non-experts, including judges and jurors, to engage in unduly error-prone speaker identification. Instead, we should seek the services of real experts: specialist forensic scientists who employ empirically validated and demonstrably reliable forensic voice comparison systems.”
The research team behind the paper included forensic data scientists, as well as legal scholars, experimental psychologists and phoneticians based in the United Kingdom, Australia and Chile.