Microsoft contractor reveals Skype human annotation of user conversations

Microsoft contractors working on Skype’s translation function have been discovered to be listening to recordings of user conversations, in the latest artificial intelligence system annotation scandal, after Vice Motherboard obtained audio recordings from an anonymous Microsoft contractor.
Apple, Google, and Amazon recently changed their policies for human review of user conversations in response to the growing controversy over AI training practices.
Skype does note on its website that audio of phone calls translated for users may be analyzed to improve the service, though it does not explicitly note this analysis may be done by humans. Motherboard also reports that interactions with virtual assistant Cortana are reviewed by Microsoft contractors.
The source, who is under a non-disclosure agreement, was granted anonymity to speak more candidly about internal Microsoft practices. While the samples obtained by Motherboard are typically between about five and ten seconds long, the source says some are longer. The contractor also described hearing personal conversations and search requests, as well as personal information like user addresses, and expressed surprise that the data is not more carefully controlled.
“The fact that I can even share some of this with you shows how lax things are in terms of protecting user data,” the contractor who provided the files told Motherboard.
The company says steps are taken to remove personal information, and that the samples are only available to contractors through a secure online portal.
Privacy Matters activist Pat Walshe told Motherboard that he does not believe the Skype Translator FAQ, which describes the review process, is transparent, and that the whole area requires regulatory review.
“Microsoft collects voice data to provide and improve voice-enabled services like search, voice commands, dictation or translation services. We strive to be transparent about our collection and use of voice data to ensure customers can make informed choices about when and how their voice data is used. Microsoft gets customers’ permission before collecting and using their voice data,” the company told Motherboard in a statement.
“We also put in place several procedures designed to prioritize users’ privacy before sharing this data with our vendors, including de-identifying data, requiring non-disclosure agreements with vendors and their employees, and requiring that vendors meet the high privacy standards set out in European law. We continue to review the way we handle voice data to ensure we make options as clear as possible to customers and provide strong privacy protections.”
With AI systems and natural language processing gaining in popularity, people are going to have to choose whether they want accurate systems enough to put up with these kinds of annotation practices.