Patient privacy and the use of health data for research

Jan. 20, 2020 / By Senthil Nachimuthu, MD, PhD

Privacy and analytics were two of the predominant themes at the American Medical Informatics Association (AMIA) 2019 Annual Symposium in Washington, D.C. last year. Unsurprisingly, many speakers and attendees discussed the media attention following the Wall Street Journal article about Project Nightingale (a partnership between Google and Ascension), published the week before the conference. Speakers also discussed the privacy issues surrounding mobile health applications and consumer-generated health data, which fall outside the boundaries of the Health Insurance Portability and Accountability Act of 1996 (HIPAA).

A study by Sunyaev et al. (JAMIA 2015) found that a majority of mobile health applications do not have privacy policies, and that understanding the policies that do exist requires college-senior-level reading ability, which is a far cry from the metaphorical fourth-grade reading level that many authors of such end user license agreements (EULAs) claim to target but seldom achieve. The Creative Commons license shows how a complex legal agreement can be distilled into a consumer-friendly summary, backed by a companion full version that meets legal requirements. Speaking from common sense (I am not a legal expert), mobile application creators miss a huge opportunity by not using a step-by-step, wizard-style interface that presents the salient points one sentence at a time when requesting both device permissions and data permissions.

As a researcher who analyzes healthcare data to develop predictive models, I would like these two themes of privacy and analytics to be addressed together, to identify ways in which patient privacy can be protected while simultaneously making data available for medical research. I understand why the AMIA Public Policy principles state these two important points together:

  • Data sharing among stakeholders is critical to: advance scientific discovery; improve benefit / risk assessments; conduct comparative effectiveness research; improve patient safety; and promote biomedical research rigor, transparency, and reproducibility.
  • Data sharing should preserve and protect patient and consumer privacy and autonomy.

I have also been following the ONC Notice of Proposed Rulemaking (NPRM) to Improve the Interoperability of Health Information from March 2019. The NPRM defines information blocking as a practice that is likely to interfere with, prevent, or materially discourage the access, exchange, or use of electronic health information (EHI), except as required by law or specified by the Secretary of Health and Human Services (HHS) as a reasonable and necessary activity. The NPRM calls for EHI to be made available via standard Application Programming Interfaces (APIs), and defines seven exceptions to the release of information, such as protecting patients from harm, protecting their privacy and security, and recovering reasonable costs. This contrasts with HIPAA, which specifies the three conditions that permit the release of protected health information (PHI).
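The standard APIs referenced in the NPRM are generally expected to be based on HL7 FHIR. As a rough sketch of what patient-authorized access could look like (the endpoint, token, and patient ID below are hypothetical, not from any real system), a third-party app might query a provider's FHIR server on the patient's behalf:

```python
# Hypothetical endpoint and token for illustration only. In practice, a
# patient authorizes a third-party app (e.g., via SMART on FHIR), and the
# app then queries the provider's FHIR server on the patient's behalf.
import json
import urllib.request

FHIR_BASE = "https://fhir.example-hospital.org/r4"
ACCESS_TOKEN = "patient-authorized-token"

def build_observation_url(patient_id):
    """Build a FHIR search URL for a patient's laboratory Observations."""
    return f"{FHIR_BASE}/Observation?patient={patient_id}&category=laboratory"

def fetch_observations(patient_id):
    """Fetch the matching Observations; returns a FHIR Bundle as a dict."""
    req = urllib.request.Request(
        build_observation_url(patient_id),
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Accept": "application/fhir+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The key point is architectural: the patient, not the provider, grants the token that scopes what the app may query.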

Furthermore, following on the heels of the European Union's General Data Protection Regulation of 2016 (GDPR), several U.S. states have started implementing their own privacy regulations. The California Consumer Privacy Act of 2018 (CCPA) went into effect on January 1, 2020. Maine and Nevada have already implemented their privacy regulations. Many other states, such as Hawaii, Illinois, Massachusetts, Minnesota, New Jersey, New York, Pennsylvania, Rhode Island and Washington, are working on their own. Considering that individuals often travel across state lines, and that health information crosses state lines even more frequently, the administrative burden of complying with multiple state regulations is daunting.

Based on the above commentary, here are my recommendations to protect patient privacy while promoting data sharing and medical research. Privacy protection and data sharing are two sides of the same coin, and medical research cannot happen without gaining and keeping patients' and clinicians' trust.

The U.S. government should facilitate privacy protections by leapfrogging the remaining 47 states and creating a federal privacy act that harmonizes the various aspects of privacy regulation and serves as the foundation on which each state can build. The EU's GDPR pioneered a similar model at the international level, so it is a good template for countries that want to catch up quickly at the intra-national level without reinventing the wheel.

The information blocking rule puts patients in control of their own data and permits them to share their data with third-party service providers by granting those providers permission to query a health information system on their behalf. Government regulation should be supplemented by organizational responsibility to uphold patients' trust and protect against inappropriate use or access by anyone. Strong regulatory disincentives should be the last line of defense, not the minimum that organizations do to protect patient privacy.

Explaining device and data permissions in a step-by-step, wizard-style, user-friendly manner, using simple sentences that leverage interactive mobile interfaces (rather than hiding 20-page EULA and privacy policy documents behind innocuous links), is a necessary first step. This is one area where even healthcare providers (who are often criticized, and made fun of, for using medical jargon with patients) do better than their technology counterparts.
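As a sketch of the idea (the permission names and wording below are invented for illustration), such a wizard presents one plain-language sentence per permission and records each grant individually instead of bundling everything behind a single "I agree" button:

```python
# A minimal sketch of a wizard-style permission flow: each permission is
# explained in one plain-language sentence and granted individually,
# rather than bundled into a single lengthy EULA. The permission list
# and wording are illustrative assumptions, not from a real application.
PERMISSIONS = [
    ("location", "We use your location to find nearby pharmacies."),
    ("step_count", "We read your step count to track activity goals."),
    ("heart_rate", "We read heart rate data to detect unusual patterns."),
]

def run_permission_wizard(ask=input):
    """Present each permission one sentence at a time; return what was granted."""
    granted = set()
    for name, explanation in PERMISSIONS:
        answer = ask(f"{explanation} Allow? [y/n] ")
        if answer.strip().lower().startswith("y"):
            granted.add(name)
    return granted
```

Because each grant is recorded separately, the same structure also supports later revoking one permission without disturbing the others.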

Effective use of clinical data for research requires reliable de-identification, even though de-identified data can be re-identified and are not truly anonymous. Many studies have shown that it is feasible to de-identify structured clinical data, but unstructured notes written by physicians and nurses are harder to de-identify. However, I have come across effective approaches to de-identifying free-text clinical notes as well, and I intend to study them closely for my own research. Stronger regulatory protections against re-identification and against linking to individually identifiable data (for purposes such as online advertising and marketing) will help promote the development of, use of, and trust in these technologies. They will also promote trust in aggregating, de-identifying and using large amounts of data for research. Such regulatory protections are essential when data aggregators link healthcare data from hospitals and insurance companies, social determinants of health data from personal data collectors outside health care, and consumer-generated health data from wearable and connected devices.
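To make the free-text challenge concrete, here is a minimal rule-based sketch that tags a few easily patterned HIPAA Safe Harbor identifiers. Production de-identification systems combine rules like these with dictionaries and statistical NLP models to catch names, addresses and rarer identifiers, and even then the output is not guaranteed to be anonymous:

```python
# A minimal rule-based sketch: replace a few easily patterned HIPAA Safe
# Harbor identifiers (phone numbers, SSNs, dates) with category tags.
# Names, addresses, and rarer identifiers need dictionary and NLP-based
# methods that this sketch does not attempt.
import re

PATTERNS = [
    ("PHONE", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
]

def scrub(note):
    """Replace each matched identifier with a [TAG] placeholder."""
    for tag, pattern in PATTERNS:
        note = pattern.sub(f"[{tag}]", note)
    return note
```

Keeping the category tag (rather than deleting the span outright) preserves some analytic value, e.g. that a date or phone number was present at that point in the note.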

Implementing the right to be forgotten and the right to deletion to the extent possible will help gain patients' and healthcare professionals' trust. However, once someone's data are de-identified and added to a research data set, it is extremely difficult or impossible to trace and delete them, and doing so would be detrimental to the repeatability and reproducibility of results, which affects the integrity of research. Health data also affect the privacy of a patient's blood relatives and even non-blood relatives, as can be seen with DNA sequences and biological samples, especially as new medical knowledge can be applied retroactively to past data and specimens. Privacy regulations and practices need to take these nuances and peculiarities of healthcare privacy into account and address them proactively.

In addition, when organizations enter into healthcare data sharing agreements, transparency about the data sharing and data protection terms, while retaining the confidentiality of the underlying business agreement, will help gain the trust of the patients whose data are being exchanged and used, and of the doctors and nurses who collect these data while honoring their oaths to protect their patients' privacy. Such transparency would allay fears and doubts like those raised in the media coverage of Project Nightingale.

The GDPR requires that consumers opt in to data collection and use, whereas the CCPA requires that consumers be able to opt out. However, the notice of privacy practices that I receive from my healthcare provider organization does not include an easy way to opt out of specific uses of my healthcare data, and I presume this is the case all over the U.S. today. Many researchers opine that giving patients this choice will reduce the data available for research. Instead, how can we as healthcare, research, technology and legal professionals uphold patient privacy standards so that patients willingly opt in to allow their data to be used for research, similar to how they allow medical and nursing students to interview and examine them at a teaching hospital? I welcome your thoughts about protecting patient privacy for healthcare research – please comment below or feel free to contact me directly.

Senthil K. Nachimuthu, MD, PhD, FAMIA, is the Director of Health Informatics, Data and Analytics with 3M Health Information Systems.