Google’s TensorFlow gets a new test for training data leaks
Google late last month debuted experimental tests for its TensorFlow Privacy library, designed to measure the degree to which machine learning models leak identifiable personal information from their training data sets, such as those used for biometric facial recognition.
The test module enables developers to “assess the privacy properties of their classification models,” according to Google. The tool works by mounting what is known as a membership inference attack against the model.
Obvious applications for the technique include facial recognition and health care.
This amounts to a second try for TensorFlow Privacy, which was introduced last year to address the “emerging topic” of privacy in machine learning, Google said.
When it debuted, the open-source software library enabled the training of models with differential privacy.
A Tech Xplore article explaining differential privacy described it as a way to publicly share the aggregate patterns of groups within a data set while withholding information that could link back to the individuals in those groups. Those links are protected by adding noise to the data set, obscuring individual data records.
That noise can also degrade model performance, however.
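The noise-addition idea can be illustrated with the classic Laplace mechanism, the textbook way to answer a counting query with differential privacy. This is a minimal sketch of the general concept, not TensorFlow Privacy's implementation; the function names (`laplace_noise`, `private_count`) are made up for illustration.

```python
import math
import random

def laplace_noise(scale):
    # Sample from a Laplace(0, scale) distribution via inverse CDF.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon):
    # A counting query has sensitivity 1: adding or removing one
    # individual changes the count by at most 1, so Laplace noise
    # with scale 1/epsilon gives epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller values of `epsilon` mean stronger privacy but more noise, which is exactly the privacy/accuracy trade-off the article describes.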
A membership inference attack is a cost-effective way to “establish a score revealing how vulnerable a model is to leaked information,” according to Google.
A membership inference attack predicts whether a given data record was used to train a model. If an attacker can make that prediction accurately, “they will likely succeed” in determining whether a record was part of the training set, according to Google, and from there other protections on that information can be compromised.
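The intuition behind scoring such an attack can be sketched as follows. Overfit models tend to assign lower loss to examples they were trained on, so a simple attacker ranks records by loss and is scored by how well that ranking separates training members from held-out non-members (an AUC-style score). This is an illustrative sketch of the general technique, not the TensorFlow Privacy API; `attack_auc` is a hypothetical name.

```python
def attack_auc(train_losses, test_losses):
    # Nonparametric AUC: the probability that a randomly chosen
    # training example has lower loss (i.e., looks "more member-like")
    # than a randomly chosen held-out example.
    wins = ties = 0
    for lt in train_losses:
        for lh in test_losses:
            if lt < lh:
                wins += 1
            elif lt == lh:
                ties += 1
    total = len(train_losses) * len(test_losses)
    return (wins + 0.5 * ties) / total
```

A score near 0.5 means the attacker can do no better than guessing (little leakage); a score near 1.0 means training membership is easy to infer, flagging the model as vulnerable.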
The vulnerability score the test produces can help developers identify model architectures that incorporate privacy design principles and best protect sensitive data.