Six keys to helping docs wade through data lakes
Just a few short years ago, doctors prescribing treatments were limited to information culled from their training and experience, available published research and drug sales representatives.
By contrast, today they have near-instant access to a veritable “lake” of unstructured information (and disinformation) from studies, trials, medical histories, applications and even social media, much of it delivered continuously in real time.
While many have heralded this information smorgasbord as a “revolution,” the real change is in the potential life-saving and time-saving value this aggregate data can provide. A significant number of practitioners, however, actually feel inundated by the sheer size of the lake, and they end up spending a lot of time trying to derive meaningful guidance from it. They often find themselves foundering, not floating, in this lake of data.
Wrong assumptions or incorrect conclusions about patients can be dangerous, and they point to a more systemic issue: healthcare providers have no way of normalizing the data about an individual (for example, gender, age, ethnicity, history, fitness sensor data and more) on which they base their conclusions. The vast quantities of data now available are persuasive by their sheer weight alone when, ironically, they should be suspect for that very reason.
Life sciences and healthcare organizations are making huge investments to gather data from a variety of sources—the Internet of Things, social media, clinical trials, electronic healthcare records and directly from patients themselves. Nevertheless, as with many advances, these investments typically do not pay off until they deliver faster, better, more informed treatments. In other words, value will not be realized until data is turned into outcomes.
The solution to having too much data is not to create less data. What’s needed is a platform that provides better analysis and application of the data. Six capabilities such a platform must offer are outlined below.
Normalizing disparate devices
When clinical scientists conduct a study, they start with data definitions and models based on long-standing standards and follow these models assiduously. However, adherence to such models across the many different devices used to collect the data remains an issue. Few organizations have considered this and said, “We should conform to what has gone before.” For example, smart watches provide data in one format, Fitbits in another, and so on.
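As a rough illustration of what device normalization involves, the sketch below maps two device payloads onto one common record. Both payload formats, their field names and the HeartRateReading type are hypothetical, invented for this example; real devices and clinical data standards will differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class HeartRateReading:
    """Common record that every device payload is mapped onto (hypothetical)."""
    patient_id: str
    taken_at: datetime  # always timezone-aware UTC
    bpm: int
    source: str


def from_watch(payload: dict) -> HeartRateReading:
    # Assumed smart-watch format: epoch seconds under "ts", rate under "hr".
    return HeartRateReading(
        patient_id=payload["user"],
        taken_at=datetime.fromtimestamp(payload["ts"], tz=timezone.utc),
        bpm=int(payload["hr"]),
        source="watch",
    )


def from_band(payload: dict) -> HeartRateReading:
    # Assumed fitness-band format: ISO 8601 string with a UTC offset under
    # "time", and a fractional rate under "heartRate".
    return HeartRateReading(
        patient_id=payload["patientId"],
        taken_at=datetime.fromisoformat(payload["time"]).astimezone(timezone.utc),
        bpm=round(float(payload["heartRate"])),
        source="band",
    )
```

Once every source is converted at the boundary like this, downstream analysis only ever sees one shape of record, whichever device produced it.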
Normalizing disparate data
Social media data is an example of complex and enigmatic data. How can seemingly unrelated postings and comments be understood and correlated? It is very difficult to discern the source, legitimacy and reliability of information on social media. Before the technology platform can deliver reliable data, it needs to resolve discrepancies and uncertainties so that incoming data can be properly mapped to existing data.
Breaking down data source barriers
An acquaintance was caring for an individual whose heart rate and oxygen levels were being continuously monitored by two separate devices. The patient was at risk if either value reached dangerous levels, so the caretaker needed instant notification if either device detected a trend toward danger. However, the devices could neither communicate with each other nor provide compatible data sets, leaving the caretaker unable to judge reliably whether the patient was at risk.
Regardless of the different device designs, the data structures they use, and the various ways data is labeled, the technology platform must present the data in a way that meets the needs of the caretaker.
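The caretaker scenario can be sketched in a few lines once both devices feed a shared record type. Everything here is an assumption made for illustration: the Reading type, the metric names, the straight-line projection, and above all the numeric limits, which are placeholders and not clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    metric: str    # "heart_rate" or "spo2" (hypothetical metric names)
    minute: float  # minutes since monitoring started
    value: float


# Illustrative limits only; not clinical guidance.
LIMITS = {"heart_rate": (50.0, 120.0), "spo2": (92.0, 100.0)}


def projected(series: list[Reading], horizon: float) -> float:
    """Extend the line through the first and last reading by `horizon` minutes."""
    first, last = series[0], series[-1]
    if last.minute == first.minute:
        return last.value
    slope = (last.value - first.value) / (last.minute - first.minute)
    return last.value + slope * horizon


def alerts(readings: list[Reading], horizon: float = 5.0) -> list[str]:
    """Return one message per metric that is, or is heading, out of range."""
    messages = []
    for metric, (low, high) in LIMITS.items():
        series = sorted(
            (r for r in readings if r.metric == metric),
            key=lambda r: r.minute,
        )
        if not series:
            continue
        now, soon = series[-1].value, projected(series, horizon)
        if not low <= now <= high:
            messages.append(f"{metric}: out of range now ({now:g})")
        elif not low <= soon <= high:
            messages.append(f"{metric}: trending out of range ({soon:g} projected)")
    return messages
```

The point of the design is that the alerting logic never needs to know which device produced a reading; it works on the merged stream and reports on both vital signs in one place.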
Complying with privacy regulations and protocols
The shift to the online world has made it increasingly difficult to protect personally identifiable information. Search tools, identity theft, facial recognition, biometrics: all of these make it increasingly likely that information will be exposed, even if accidentally. Technology platforms must be secured so that, even while machines are learning, they cannot “recognize” and single out specific individuals. All data, including protected health information (PHI), must remain protected.
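One small building block toward this goal is pseudonymization: stripping direct identifiers from a record and replacing the patient ID with a keyed hash before the data reaches any learning system. The sketch below uses Python's standard hmac and hashlib modules; the field names and the pseudonymize helper are hypothetical, and this step on its own should not be assumed to satisfy any particular privacy regulation.

```python
import hashlib
import hmac

# Hypothetical names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and replace the patient id with a keyed hash."""
    token = hmac.new(
        secret_key,
        record["patient_id"].encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    # The same patient always maps to the same token under the same key,
    # so records can still be linked without exposing the original id.
    cleaned["patient_token"] = token
    return cleaned
```

Because the hash is keyed, someone who obtains the data but not the key cannot rebuild the tokens simply by hashing a list of known patient IDs.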
Producing predictive analytics
It’s not enough for this technology platform to draw conclusions simply based on the assembled data. To truly deliver on the promise of the data explosion, it must also be able to produce useful predictions. For example, given one patient’s treatment, which other patients share similar characteristics that might make them candidates for the same treatment? Which treatments are best for a particular patient, based on a specific data set? How does the patient community perceive quality of life under certain treatments?
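The first of these questions, finding patients who resemble a given patient, can be illustrated with a simple nearest-neighbor search. This is only a sketch under stated assumptions: the article names no method, the similar_patients helper and feature names are hypothetical, and a real system would need validated clinical features rather than a raw distance over a few numbers.

```python
import math


def similar_patients(target: dict, others: dict, features: list, k: int = 2) -> list:
    """Return the ids of the k patients closest to `target` on `features`."""
    # Scale each feature by its observed range so that no single unit
    # (years, kg/m^2, and so on) dominates the distance.
    spans = {}
    for f in features:
        values = [p[f] for p in others.values()] + [target[f]]
        spans[f] = (max(values) - min(values)) or 1.0

    def distance(patient: dict) -> float:
        return math.sqrt(
            sum(((patient[f] - target[f]) / spans[f]) ** 2 for f in features)
        )

    return sorted(others, key=lambda pid: distance(others[pid]))[:k]
```

Given a patient who responded well to a treatment, the returned IDs would be a starting list of candidates for review, not a recommendation in themselves.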
Accommodating users of different skill levels
Pharmaceutical industry clinicians and data managers are supremely skilled in constructing and employing rigorous data models. Ironically, though, they are not as well versed in incorporating Facebook conversations or Twitter threads into their studies. Social media analysts are skilled in converting the “noise” of social media into usable intelligence, but few are likely to have a deep knowledge of data models or datasets. These two groups have vastly different styles of working and use different tools and applications. The former are analytical, dealing in concrete facts and strict guidelines. The latter are more free-form, taking unstructured data and trying to code it while simultaneously developing the rules that will guide their profession. The technology platform must therefore meet the distinct needs of both groups, building on the skills they have while filling in for those they lack.
The deepening, widening data lake offers value in direct proportion to the wisdom that goes into understanding the myriad data it contains, how to normalize it, and how to provide the right tools and skill sets for accessing it. When this is accomplished, users will truly reap the rewards of the data explosion, with all parties able not only to make wise decisions with desirable results but also to avoid the unnecessary or redundant experimentation that causes costly delays.