Inside Angle
From 3M Health Information Systems
AI talk: Digital mask
Digital mask
I saw a fascinating new research study published this week. Its theme is how to protect patients’ facial images and video when using digital health technologies – electronic health records (EHRs) and telemedicine. The work is a joint effort between researchers from a number of Chinese universities and the University of Cambridge in the UK.
Deidentification of the clinical record, where personally identifiable health information is removed, is an established practice. HIPAA regulations stipulate in great detail how one can deidentify such records, with a focus on dates, medical record numbers, addresses, etc. They also state that full-face images and any biometric data need to be removed. The question the researchers in this study address is how to protect patient privacy while still allowing clinical diagnosis and treatment using facial images and video.
Their solution is to build a 3D reconstruction of the facial image using deep learning techniques. The reconstruction obfuscates key features of the face well enough to fool a face recognition system, while preserving the characteristics important to the diagnosis and treatment of various eye conditions. These conditions include esotropia (a misalignment of the eyes), ptosis (drooping of the eyelids), poor fixation, frequent blinking, etc. To validate the technology, they ran some studies with it in hospitals in the UK and were able to show it works well. The technology also has an application in telemedicine settings, where a digital mask obscures the patient’s actual face. I am reminded of interviews on CBS “60 Minutes” where they sometimes employ disguise and voice morphing to protect witnesses.
Digital masking is a new frontier in digital deidentification and, if proven effective, will promote the sharing of more content for research without compromising patient privacy.
Unsupervised AI to read X-rays
The latest issue of MIT Technology Review carried a blog about a new way to train models to detect diseases from X-rays. The research reported was a joint endeavor between Stanford and Harvard Medical School. The core idea behind this work is that one can leverage the radiology reports associated with X-rays to train a deep learning model. The approach obviates the need for a labeled dataset, where one has to manually associate curated labels with each image (an expensive process). The approach is not new and resembles the CLIP effort from OpenAI. In CLIP, an acronym for contrastive language-image pre-training, each image is associated with a caption such as “photo of a dog” or “photo of a cat” and the model is trained on a dataset of such pairs. Image-caption pairs of this kind are widely available, and the trained model can predict the classification of other images it is prompted with.
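For readers who like to see the mechanics, here is a minimal sketch of the contrastive idea behind CLIP-style training, written in plain Python. It is an illustration under my own assumptions, not the code from either research group: the function names, the temperature value of 0.07 and the tiny two-dimensional embeddings are all made up for the example. The loss is small when each image embedding sits closest to its own caption embedding and large when the pairs are mixed up.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_entropy(logits, target):
    """Negative log-softmax of the target entry (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch where pair k is (image k, text k).

    The temperature value is an assumption chosen for illustration.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j] is the scaled similarity between image i and text j.
    sim = [[dot(i, t) / temperature for t in txts] for i in imgs]
    # Each image should pick out its own text among all texts in the batch...
    loss_i2t = sum(cross_entropy(sim[k], k) for k in range(n)) / n
    # ...and each text should pick out its own image among all images.
    loss_t2i = sum(
        cross_entropy([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return (loss_i2t + loss_t2i) / 2

# Toy embeddings (hypothetical stand-ins for the outputs of real encoders).
image_embs = [[1.0, 0.1], [0.1, 1.0]]
matched_texts = [[0.9, 0.2], [0.2, 0.9]]
swapped_texts = [matched_texts[1], matched_texts[0]]

loss_matched = contrastive_loss(image_embs, matched_texts)
loss_swapped = contrastive_loss(image_embs, swapped_texts)
print(loss_matched < loss_swapped)
```

In real training, the embeddings would come from the image and text encoders, and minimizing this loss is what pulls matching pairs together.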
In a similar vein, each X-ray is paired with the impression section of its radiology report. The impression section carries a concise description of what is found on the image. During training, a vision transformer reads the X-ray image and converts it into a representation that captures the essence of the image, and this image representation is associated with a representation of the impression text obtained using a text transformer. Then, given an X-ray image and a prompt for a specific pathology, the model outputs whether the pathology is present in the image or not. Across a range of pathologies, the model did as well as trained radiologists. This is one of the first studies to show that one can use natural language reports to train a classifier for X-rays. The methodology can be readily extended to interpret other image modalities. It will be interesting to see where the trajectory of this effort leads.
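To make the prompting step concrete, here is a small sketch of how a trained model of this kind might answer "is this pathology present?" for a single X-ray. I am assuming one common recipe, which is to compare the image representation against a positive prompt and a negative prompt and apply a softmax to the two similarities; I am not quoting the paper's code. The vectors, the prompt wording, the function names and the temperature are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pathology_probability(image_emb, pos_prompt_emb, neg_prompt_emb,
                          temperature=0.07):
    """Softmax over similarities to a positive and a negative prompt.

    Returns the share of probability assigned to the positive prompt.
    The temperature is an illustrative assumption.
    """
    pos = cosine(image_emb, pos_prompt_emb) / temperature
    neg = cosine(image_emb, neg_prompt_emb) / temperature
    m = max(pos, neg)  # subtract the max for numerical stability
    e_pos = math.exp(pos - m)
    e_neg = math.exp(neg - m)
    return e_pos / (e_pos + e_neg)

# Hypothetical embeddings; a real system would get these from its encoders.
xray_emb = [0.8, 0.6, 0.1]          # image representation of one X-ray
prompt_present = [0.9, 0.5, 0.0]    # e.g. text prompt "pneumonia"
prompt_absent = [0.1, 0.2, 0.95]    # e.g. text prompt "no pneumonia"

prob = pathology_probability(xray_emb, prompt_present, prompt_absent)
print(prob > 0.5)
```

Because the pathology is named only in the prompt, the same trained model can be asked about a different condition simply by changing the text, with no new labeled images.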
I am always looking for feedback and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.
“Juggy” Jagannathan, PhD, is an AI evangelist with four decades of experience in AI and computer science research.