AI Talk: Predicting breast cancer, regulating SaMD, predicting mutations

Feb. 5, 2021 / By V. “Juggy” Jagannathan, PhD


Predicting Breast Cancer

This week, MIT News featured a good discussion of new research published in Science Translational Medicine. Adam Yala, an MIT researcher, led this multi-year, multi-site study. The goal of the study? Predicting breast cancer risk, every year for the next five years, using data from mammograms. Of note is that the researchers carefully designed the study to include mammograms from a diverse set of races and ethnicities, striving to remove bias through extensive and robust representation of minority communities. The study also included cohorts from a Swedish hospital and a hospital in Taiwan.

The researchers used a deep learning algorithm, dubbed “Mirai,” that performed well in predicting cancer risk across the different populations. The four standard views of a mammogram are first reduced to vector representations using an image encoder. These are then combined into a single vector (encoding) using the currently popular transformer architecture. From this combined encoding, the model predicts the risk factors the patient may have relating to age, hormonal factors, genetics and breast density. The idea behind predicting risk factors from the image is to incorporate them into the final prediction of breast cancer risk: at test time, if the actual risk factors are known, those are used; if not, the imputed risk factors feed into the final prediction.
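To make that pipeline concrete, here is a minimal sketch of the general idea in PyTorch. This is not the actual Mirai implementation; the PyTorch framing, module names, dimensions and the tiny CNN backbone are all assumptions chosen purely for illustration.

```python
# Minimal sketch of the kind of architecture described above, NOT the actual
# Mirai implementation. All module names, sizes and the toy CNN are assumptions.
import torch
import torch.nn as nn

class RiskModelSketch(nn.Module):
    def __init__(self, embed_dim=256, num_risk_factors=4, horizon=5):
        super().__init__()
        # Image encoder: reduces each mammogram view to a vector representation.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=7, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )
        # Transformer combines the four view vectors into one exam-level encoding.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.aggregator = nn.TransformerEncoder(layer, num_layers=2)
        # Head that imputes risk factors (age, hormonal, genetic, density) from the image.
        self.risk_factor_head = nn.Linear(embed_dim, num_risk_factors)
        # Final head predicts cancer risk for each of the next `horizon` years.
        self.risk_head = nn.Linear(embed_dim + num_risk_factors, horizon)

    def forward(self, views, known_risk_factors=None):
        b, v = views.shape[:2]                        # (batch, num_views, 1, H, W)
        view_vecs = self.encoder(views.flatten(0, 1)).view(b, v, -1)
        exam_vec = self.aggregator(view_vecs).mean(dim=1)
        imputed = self.risk_factor_head(exam_vec)
        # Use the real risk factors when available, otherwise the imputed ones.
        factors = known_risk_factors if known_risk_factors is not None else imputed
        yearly_risk = torch.sigmoid(self.risk_head(torch.cat([exam_vec, factors], dim=-1)))
        return yearly_risk, imputed

# Example: a batch of 2 exams, each with 4 views, no known risk factors.
model = RiskModelSketch()
risk, imputed_factors = model(torch.randn(2, 4, 1, 128, 128))
print(risk.shape)  # torch.Size([2, 5]) -> one risk estimate per year, five years out
```

The separate risk-factor head is what allows the fallback described above: when the true risk factors are missing at test time, the model simply uses its own imputed values in the final prediction.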

This study can have a significant impact on the course of treatment—particularly to address racial disparities that currently exist, including that Black women are 43 percent more likely to die from breast cancer than white women. It also affords the opportunity to target treatment and additional imaging tests, such as MRI, to high-risk populations.

Regulating AI/ML-based Software as a Medical Device (SaMD)

The FDA, after a detailed analysis with feedback from the public, has issued its plan for regulating AI-based software as a medical device (SaMD). The full action plan can be seen here (4). The published five-pronged approach is interesting and quite reasonable. Here is the gist:

  1. Submissions for FDA approval must include a Predetermined Change Control Plan, i.e., how the device’s learned behavior is expected to change over time and how the vendor will ensure the safety and efficacy of the device as the underlying algorithm evolves.
  2. The FDA will attempt to harmonize and promote “Good Machine Learning Practice.” This is a good idea to ensure that proper training data sets are used, that bias is avoided, and so on.
  3. Promote labeling transparency to end users to ensure the proper usage of the device and interpretation of results.
  4. Support regulatory science efforts to ensure elimination of bias and robustness of the results (a more focused item than item 2 above).
  5. Monitor real world performance (RWP) to ensure continued correct operation of the device.

These are definitely steps in the right direction. Changing models and evolving solutions are relevant not only for AI-based SaMD, but also for other areas where FDA regulation is involved. For instance, Moderna is working on a booster shot for COVID-19 to address new, mutated strains of the virus.

Predicting Mutations

I saw a reference to interesting new research in AI in Healthcare (5) this week. A team of MIT researchers has developed an approach for predicting how COVID-19 will mutate (6). If researchers can predict ahead of time which variants are likely to occur, then vaccines can be fine-tuned in advance to better prepare for them.

I learned a new term reading through this work: “Viral escape,” which denotes viral mutations that dodge the resistance put forth by our antibodies—essentially escaping their grasp. How to solve this prediction problem is also interesting. Researchers use techniques borrowed from the field of natural language processing to predict meaningful mutations to gene sequence variants.

The idea goes like this: Say you have a sentence like “Australian dead in Bali.” Then the sentence mutates to “Aussie dead in Bali.” This mutation of the sentence is pretty close semantically to the original, but what if you mutate the original sentence to “Australian ballet in Bali”? Now the semantics are completely different even though the sentence is still grammatically correct. The researchers train bidirectional long short-term memory networks (BiLSTMs) as language models over viral protein sequences, which lets them score both how grammatical a mutated sequence remains (i.e., whether it still looks like a viable virus) and how far its semantics drift from the original. A constrained semantic change search (CSCS) then surfaces variants that stay grammatical yet are significantly different semantically, exactly the combination that makes a mutation a candidate for viral escape. Interesting work!
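To give a feel for how CSCS combines those two signals, here is a toy sketch in Python. The embed and token_prob functions are random stand-ins for what a trained BiLSTM language model would provide, and summing ranks is just one simple way to combine grammaticality with semantic change; none of this is the authors' actual code.

```python
# Toy sketch of the CSCS idea: rank single mutations that are both "grammatical"
# (plausible under a language model) and semantically far from the original.
import numpy as np

ALPHABET = list("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids

def embed(seq):
    """Stand-in for the language model's semantic embedding of a sequence."""
    rng = np.random.default_rng(abs(hash(seq)) % (2**32))
    return rng.normal(size=16)

def token_prob(seq, pos, aa):
    """Stand-in for the model's probability of amino acid `aa` at position `pos`."""
    rng = np.random.default_rng(abs(hash((seq[:pos], seq[pos + 1:], aa))) % (2**32))
    return rng.uniform()

def cscs_rank(wild_type):
    """Rank single-point mutations by grammaticality plus semantic change."""
    base = embed(wild_type)
    scored = []
    for pos, original_aa in enumerate(wild_type):
        for aa in ALPHABET:
            if aa == original_aa:
                continue
            mutant = wild_type[:pos] + aa + wild_type[pos + 1:]
            grammaticality = token_prob(wild_type, pos, aa)         # still a viable sequence?
            semantic_change = np.linalg.norm(embed(mutant) - base)  # how much does its "meaning" shift?
            scored.append((pos, aa, grammaticality, semantic_change))
    # Combine the two criteria by summing their ranks: a candidate escape mutation
    # should score high on BOTH grammaticality and semantic change.
    by_gram = {s: r for r, s in enumerate(sorted(scored, key=lambda s: s[2]))}
    by_sem = {s: r for r, s in enumerate(sorted(scored, key=lambda s: s[3]))}
    return sorted(scored, key=lambda s: by_gram[s] + by_sem[s], reverse=True)

# Example: top three candidate escape mutations for a toy peptide.
for pos, aa, g, s in cscs_rank("MFVFLVLLPLVSSQ")[:3]:
    print(f"position {pos}: mutate to {aa} (grammaticality={g:.2f}, semantic change={s:.2f})")
```

The key point is the ordering criterion: a mutation only rises to the top if it looks plausible to the language model and sits far from the original sequence in semantic space.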

I am always looking for feedback and if you would like me to cover a story, please let me know. “See something, say something!” Leave me a comment below or ask a question on my blogger profile page.

V. “Juggy” Jagannathan, PhD, is Director of Research for 3M M*Modal and is an AI Evangelist with four decades of experience in AI and Computer Science research.


Listen to Juggy Jagannathan talk AI on the ACDIS podcast.