MIT researchers developing automated tool to ‘de-bias’ AI training data
A team at MIT’s Computer Science & Artificial Intelligence Lab (CSAIL) is developing an algorithm that can automatically ‘de-bias’ training data by resampling it to make it more balanced, according to an announcement.
The algorithm can identify and minimize hidden biases by learning both a specific task, such as face detection, and the underlying structure of the training data. In testing, it decreased ‘categorical bias’ by 60 percent compared to state-of-the-art facial detection models, according to the announcement. The tests used the Gender Shades facial image data set developed last year by MIT Media Lab researchers. The approach can also be used in other AI applications.
A significant difference between this solution and many existing approaches is that it is entirely automated and does not require human input to define the bias being targeted. This could make it particularly useful for larger data sets.
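To illustrate the general idea of automated resampling, the sketch below is not the CSAIL team’s implementation but a minimal example under stated assumptions. It takes for granted that each training example already has a learned low-dimensional “latent” representation (here faked with synthetic 2-D points), estimates how crowded each region of that space is with simple histograms, and gives examples in sparse regions a higher chance of being drawn. The function name `resampling_weights` and the parameters `bins` and `alpha` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def resampling_weights(latents, bins=10, alpha=0.01):
    """Return sampling probabilities that favor examples in sparse latent regions.

    latents: array of shape (n_examples, n_latent_dims), assumed to come from
             some model that has learned the structure of the data.
    bins:    number of histogram bins per latent dimension (assumed value).
    alpha:   smoothing term; larger values make the resampling gentler.
    """
    latents = np.asarray(latents, dtype=float)
    n_examples, n_dims = latents.shape
    probs = np.zeros(n_examples)
    for j in range(n_dims):
        # Estimate how densely populated each bin of this dimension is.
        counts, edges = np.histogram(latents[:, j], bins=bins)
        density = counts / counts.sum()
        # Map each example to its bin (indices 0 .. bins-1).
        idx = np.digitize(latents[:, j], edges[1:-1])
        # Rare bins get large weights, common bins get small ones.
        w = 1.0 / (density[idx] + alpha)
        # Keep the strongest signal across dimensions for each example.
        probs = np.maximum(probs, w / w.sum())
    return probs / probs.sum()

# Synthetic stand-in for learned latents: 950 "majority" and 50 "minority" points.
rng = np.random.default_rng(0)
majority = rng.normal(0.0, 0.5, size=(950, 2))
minority = rng.normal(4.0, 0.5, size=(50, 2))
latents = np.vstack([majority, minority])

probs = resampling_weights(latents)

# Draw a rebalanced training batch without anyone labeling which group is which.
batch = rng.choice(len(latents), size=1000, p=probs)
minority_share = np.mean(batch >= 950)
```

No step in the sketch names the under-represented group: the weights come only from how the data are distributed. In this toy setup, `minority_share` can be compared with the original 5 percent share to see how far the resampling shifts the balance.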
“Facial classification in particular is a technology that’s often seen as ‘solved,’ even as it’s become clear that the datasets being used often aren’t properly vetted,” says PhD student Alexander Amini. “Rectifying these issues is especially important as we start to see these kinds of algorithms being used in security, law enforcement and other domains.”
Amini and PhD student Ava Soleimany co-wrote the paper describing the system with graduate student Wilko Schwarting and MIT professors Sangeeta Bhatia and Daniela Rus. Amini was also co-lead author on a related paper presented this week at the Conference on Artificial Intelligence, Ethics and Society (AIES).
Separately, IBM has launched a data set designed to yield insights into the science behind bias, such as what constitutes sufficient diversity.