Release of De-Identified Health Data Poses Elevated Risk
Healthcare organizations continue to face challenges in de-identifying health data for secondary uses, meaning purposes other than direct patient care such as data analysis, research, safety measurement, public health, payment, and provider certification. Consequently, organizations may be releasing data that carries a serious risk of re-identification, according to a new survey.
The survey of 271 professionals who handle protected health information, conducted by de-identification technology vendor Privacy Analytics in collaboration with the Electronic Health Information Laboratory, finds that while 62 percent of respondents indicated that their organizations currently release data for secondary purposes, more than two out of three of these organizations lack confidence in their ability to share data safely in order to protect individual privacy. Nonetheless, more than half (56 percent) are planning on increasing the volume of data they share in the next 12 months.
“The question is what is acceptable risk and how do you manage it,” says Khaled El Emam, CEO of Privacy Analytics, who points out that nearly half of survey respondents cited patient re-identification as a key challenge, with concern greatest among those already sharing data. “We’ve seen some very large and complex data sets. And, to de-identify that, you really need some sophisticated techniques. There are good practices for de-identification and there are poor practices for de-identification.”
In the latter category, he reveals that more than 75 percent of those surveyed indicated that they are using one or more approaches that can result in unknown data privacy compliance and increased risk, such as data-sharing agreements, data masking and Safe Harbor methodology.
Under the HIPAA privacy rule, there are two ways to de-identify data. The first is the “Safe Harbor” approach, which permits a covered entity to consider data to be de-identified if it removes 18 types of identifiers and has no actual knowledge that the remaining information could be used to identify an individual, either alone or in combination with other information.
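As a rough illustration of the Safe Harbor idea, the sketch below drops identifier fields from a single record. The field names are hypothetical and cover only a handful of the 18 identifier types; a real implementation would need to map every field in a data set to the full list and would also have to address the "actual knowledge" condition, which code alone cannot establish.

```python
# Hypothetical field names standing in for a few of the 18 Safe Harbor
# identifier types (names, addresses, phone numbers, and so on).
# This set is an assumption for illustration, not the complete list.
IDENTIFIER_FIELDS = {
    "name",
    "street_address",
    "phone",
    "email",
    "ssn",
    "medical_record_number",
}


def strip_identifier_fields(record: dict) -> dict:
    """Return a copy of the record without the listed identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in IDENTIFIER_FIELDS
    }


example_record = {
    "name": "Jane Doe",
    "phone": "555-0100",
    "diagnosis": "asthma",
    "state": "MA",
}
stripped = strip_identifier_fields(example_record)
```

Fields that are not on the list, such as a diagnosis, pass through unchanged, which is one reason removing listed fields does not by itself say how identifiable the remaining data is.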
The second is the statistical approach, which permits covered entities to disclose health information in any form provided that a qualified statistical or scientific expert concludes, through the use of accepted analytic techniques, that the risk that the information could be used, alone or in combination with other reasonably available information, to identify the subject is very small.
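To make the notion of measurable re-identification risk more concrete, here is a minimal sketch of one commonly used measure: group records by their quasi-identifiers (attributes such as birth year or partial ZIP code that could be linked to outside data) and take one divided by the size of the smallest group. This is an assumption chosen for illustration, not the method the privacy rule prescribes or the one Privacy Analytics uses, and the field names are hypothetical. What counts as "very small" is left to the expert's judgment and is not defined here.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Worst-case risk: 1 divided by the size of the smallest group of
    records sharing the same quasi-identifier values."""
    if not records:
        raise ValueError("records must not be empty")
    # Build one key per record from its quasi-identifier values.
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    class_sizes = Counter(keys)
    return 1.0 / min(class_sizes.values())


# Hypothetical records: the third is unique on these attributes.
sample_records = [
    {"birth_year": 1980, "zip3": "021", "sex": "F"},
    {"birth_year": 1980, "zip3": "021", "sex": "F"},
    {"birth_year": 1975, "zip3": "021", "sex": "M"},
]
risk = max_reidentification_risk(sample_records, ["birth_year", "zip3", "sex"])
```

In this sample the third record is the only one with its combination of values, so the worst-case risk is 1.0; dropping or generalizing a quasi-identifier enlarges the groups and lowers the figure, at some cost to the detail left in the data.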
“Although Safe Harbor is recommended by regulators, it represents a minimum standard for de-identification that can leave data vulnerable to a breach,” according to Privacy Analytics, which “operationalizes” the statistical approach through software automation. At the same time, the vendor argues, data de-identified through masking can lose its value as the identifying factors are removed. Privacy Analytics makes the case that, when sharing data for secondary use, it is critical to balance privacy compliance with data utility.
Complicating matters is the fact that there is currently no universal standard for the de-identification of protected health information. However, earlier this year, HITRUST—a healthcare industry stakeholder coalition to improve cybersecurity—created a de-identification framework, offering guidance, standards and controls to better understand the processes of de-identifying data.
According to Privacy Analytics, the HITRUST framework has “incorporated and refined current best practices and regulations so that health organizations have access to essential information regarding information security.”
The vendor’s survey reveals the challenges that healthcare organizations face in implementing good practices for de-identification: low staff knowledge on managing data safely (27 percent), low staff knowledge of data sharing practices and tools (25 percent), cost concerns (24 percent), and lack of organizational policies (23 percent).