AI Talk: HIPAA, privacy, online harms and ancient texts

April 12, 2019 / By V. “Juggy” Jagannathan, PhD

This week’s AI Talk…

HIPAA-compliant Alexa, Google

I saw an article in Geek Wire about Alexa becoming compliant with the Health Insurance Portability and Accountability Act (HIPAA) privacy rule. It indeed has, as discussed in this blog post on the Amazon Alexa developer website. Amazon is announcing six new Alexa skills built with help from healthcare providers, payers, pharmacy service organizations and others. These skills (within a specific organization) allow patients to, for instance, check the status of prescription orders, check on their wellness goals, report to providers their progress recovering from surgery, schedule an appointment at the nearest urgent care center, query their latest lab results and more! Though this only applies to specific entities currently, the writing is on the wall: what we are seeing is simply the tip of the iceberg.

On a similar note, Google this week released a beta version of its Cloud Healthcare API that is HIPAA compliant (partners must sign Business Associate Agreements (BAAs)). The API includes support for Fast Healthcare Interoperability Resources (FHIR) interfaces. Simply put, FHIR resources are packets of standardized information about a patient. The Google API also supports de-identification of data inside FHIR resources. One more sign the big tech firms are taking health care very seriously.
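To make the FHIR idea concrete, here is a minimal sketch in Python. This is my own illustration, not Google's actual API: it shows the shape of a FHIR "Patient" resource as a plain dictionary and a toy de-identification step that strips direct identifiers while keeping clinically useful fields. The field names mirror the FHIR Patient resource; the `deidentify_patient` function is a hypothetical stand-in for what a real de-identification service does.

```python
def deidentify_patient(resource):
    """Return a copy of a FHIR Patient resource with direct identifiers removed.

    Toy example only: real de-identification (e.g., HIPAA Safe Harbor)
    covers many more fields and nested structures.
    """
    identifying_fields = {"name", "identifier", "telecom", "address", "birthDate"}
    return {k: v for k, v in resource.items() if k not in identifying_fields}

# A minimal FHIR Patient resource, represented as a Python dict.
patient = {
    "resourceType": "Patient",
    "identifier": [{"system": "urn:mrn", "value": "12345"}],
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "birthDate": "1970-01-01",
    "gender": "female",
}

deidentified = deidentify_patient(patient)
print(deidentified)  # {'resourceType': 'Patient', 'gender': 'female'}
```

The point is simply that a FHIR resource is structured, machine-readable patient data, which is what makes both interoperability and automated de-identification tractable.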

Privacy concerns – Using AI in healthcare

I came across a blog about data privacy and ethics last week. The blog referred to a JAMA article in which a multi-institution collaboration of researchers concluded that it is possible to re-identify and compromise personal health information using de-identified step data collected by a variety of smart watches. That is an interesting claim, so I downloaded the paper to see what is going on. First, the assumptions: an accountable care organization (ACO) has access to all kinds of demographic and fitness data. It trains a model that predicts patient record numbers from step data, then releases the model. Why? Because it can! Technically the model (which is just a bunch of numbers) contains no protected health information (PHI). It then releases the de-identified step data, which has no PHI either. Now an employer, for instance, can combine the released model with the de-identified step data to recover patient record numbers with better than 90 percent accuracy. This is a troubling situation and underscores the need to consider what kind of data is being publicly released. But the issues raised go beyond machine-learned models to other potentially unregulated forms of privacy encroachment: Facebook monitoring posts for suicide prevention without user consent, and 23andMe collecting genome data for all kinds of purposes under ambiguous consents, are two examples. The European Union (EU) General Data Protection Regulation (GDPR) has some good privacy provisions, but it is not a panacea either. After two decades of HIPAA, our standard-bearer for all things privacy, perhaps the law needs some freshening up.
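The linkage attack is easy to sketch. The following is my own toy construction, not the JAMA study's method: the "released model" here is just nearest-neighbor matching over synthetic step counts, and the record numbers and data are invented. It shows why two releases that each contain no PHI can still re-identify someone when combined.

```python
# Synthetic training data the ACO holds: record number -> weekly step counts.
training_data = {
    "rec-001": [4200, 3900, 8100, 7600, 4400, 12000, 9800],
    "rec-002": [1500, 1600, 1400, 1700, 1500, 2000, 1800],
    "rec-003": [9000, 9500, 8800, 9100, 9300, 7000, 7200],
}

def released_model(steps):
    """The 'model' is just numbers: here, a nearest-neighbor lookup that
    returns the record number whose step pattern is closest."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training_data, key=lambda rec: distance(training_data[rec], steps))

# "De-identified" step data released separately, with no PHI attached.
deidentified_steps = [1520, 1580, 1450, 1690, 1510, 1990, 1790]

# An attacker holding both releases links the data back to a record number.
print(released_model(deidentified_steps))  # -> rec-002
```

Neither artifact is sensitive on its own; the privacy failure emerges only from their combination, which is exactly why release policies need to consider what an adversary can join together.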

UK “online harms” trial balloon

The UK Department for Digital, Culture, Media and Sport has released a policy white paper to address what it has dubbed “online harms”: a blueprint for tackling unsafe use of the internet, broadly defined to include hate speech, misinformation, terrorist propaganda, intimidation, violent content and more. The 102-page white paper lays the groundwork for identifying such content and for a regulatory model that can levy fines and even bar companies found negligent in their duties. It seems to put companies such as Facebook squarely in its crosshairs. It will be interesting to see how this model progresses.

What do ancient texts and mammograms have in common?

Last week, NPR had an intriguing story about Regina Barzilay, an MIT professor who has used machine learning (ML) to decipher ancient texts. Her research focus did an about-face, however, when she was diagnosed with breast cancer. She decided ML could be used to improve the state of reading and diagnosing cancer from mammograms. Collaborating with Connie Lehman, a Harvard radiologist, the research focused on using deep learning to solve one of the fundamental tasks in reading a mammogram: determining breast density. Though their research is not ready for prime time, the researchers are optimistic that in a few years these systems will read a mammogram as well as a radiologist! They are also working on several other tasks that can be addressed using ML. However, this is an industry that has seen its ups and downs. Radiologists have tried using computer-aided diagnosis (CAD) in the past to help with reading images, and later studies showed that it didn't help. Whatever solutions emerge from the mammogram deep learning research will have to go through rigorous clinical study before adoption takes off.


My colleague Dan Walker pointed me to the NPR story on mammograms. My friend and former colleague Chris Scott pointed me to the HIPAA-compliant Google healthcare API.

I am always looking for feedback, and if you would like me to cover a story, please let me know. “See something, say something!” Leave me a comment below or ask a question on my blogger profile page.

V. “Juggy” Jagannathan, PhD, is Vice President of Research for M*Modal, with four decades of experience in AI and Computer Science research.