Amazon develops technology for removing identifying information from medical images
Amazon has developed a way to de-identify medical images, applying the company’s machine learning service Rekognition to detect and extract text included in images, to help healthcare professionals meet their HIPAA requirements.
AWS Senior Healthcare Solutions Architect James Wiggins says in a blog post that medical images often contain Protected Health Information (PHI), which must be removed to comply with regulatory requirements. Historically, however, removing this information has required images to be manually reviewed and edited, making it time-consuming and expensive to de-identify large datasets.
According to Wiggins, combining Amazon Rekognition, which extracts text from images, with the Natural Language Processing (NLP) service Amazon Comprehend Medical, which identifies PHI, and some Python code allows private information to be redacted quickly and inexpensively. The blog post provides an example of a de-identification system architecture, which uses the Jupyter Notebooks feature of end-to-end machine learning platform Amazon SageMaker to produce redacted images.
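The flow Wiggins describes can be sketched roughly as follows. This is a minimal illustration, not the code from the AWS blog post: the helper names (to_pixel_box, boxes_to_redact, redaction_boxes_for_image), the line-level matching strategy, and the 0.5 confidence threshold are all assumptions made here. The only AWS calls used are boto3's Rekognition detect_text and Comprehend Medical detect_phi, and they are confined to the last function so the rest runs without AWS access.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a Rekognition ratio-based BoundingBox to (left, top, right, bottom) pixels."""
    left = round(bounding_box["Left"] * image_width)
    top = round(bounding_box["Top"] * image_height)
    right = round((bounding_box["Left"] + bounding_box["Width"]) * image_width)
    bottom = round((bounding_box["Top"] + bounding_box["Height"]) * image_height)
    return (left, top, right, bottom)


def boxes_to_redact(text_detections, find_phi, image_width, image_height, min_score=0.5):
    """Return pixel boxes for every detected text line that contains PHI.

    text_detections: items shaped like Rekognition's "TextDetections" entries.
    find_phi: hypothetical callable taking a string and returning entities
              shaped like Comprehend Medical's "Entities" entries.
    min_score: assumed confidence cut-off, not a value from the source.
    """
    boxes = []
    for detection in text_detections:
        # Rekognition reports both LINE and WORD items; use lines to avoid duplicates.
        if detection.get("Type") != "LINE":
            continue
        entities = find_phi(detection["DetectedText"])
        if any(entity.get("Score", 0.0) >= min_score for entity in entities):
            box = detection["Geometry"]["BoundingBox"]
            boxes.append(to_pixel_box(box, image_width, image_height))
    return boxes


def redaction_boxes_for_image(image_bytes, image_width, image_height):
    """Wire the helpers above to the two AWS services (needs boto3 and credentials)."""
    import boto3  # imported lazily so the pure helpers work without AWS access

    rekognition = boto3.client("rekognition")
    comprehend_medical = boto3.client("comprehendmedical")

    detections = rekognition.detect_text(Image={"Bytes": image_bytes})["TextDetections"]

    def find_phi(text):
        return comprehend_medical.detect_phi(Text=text)["Entities"]

    return boxes_to_redact(detections, find_phi, image_width, image_height)
```

The returned boxes could then be painted over with any imaging library to produce the redacted image. Redacting a whole line when any part of it is flagged is a deliberately conservative choice in this sketch; a finer-grained version could map entity offsets back to individual words.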
Redacting personal information from data in storage may be an emerging market, with the Secure Redact platform launching earlier this month to enable organizations to redact face biometric data to comply with regulations such as GDPR and the California Consumer Privacy Act.