UNC Health Care looks to big data to reduce readmissions

The use of predictive modeling, machine learning and other analytics methods helps the organization improve its medical risk scoring.


UNC Health Care is a sprawling enterprise with huge amounts of data flowing across six hospitals, including an academic medical center, a cancer center and a neurosciences hospital.

But as it starts pushing into the realm of big data, the Chapel Hill, N.C.-based health system is just as focused on corralling information in its core line-of-business systems and setting up a foundation of skills and technology to perform advanced analytics.

One of its first big data projects, focused on refining its risk scoring for 30-day hospital readmissions, has validated its years-long effort to build an infrastructure to utilize predictive modeling, machine learning and other sophisticated methods across the enterprise, according to Jason Burke, system vice president and chief analytics officer at UNC Health Care’s Enterprise Analytics and Data Sciences division.

Hospital readmissions have been a chronic problem industrywide; the federal government has in recent years refused to cover the costs of readmissions for a growing number of conditions, believing that many of those readmissions are avoidable and are the result of breakdowns in care management and other clinical care shortcomings. As a result, health systems are focusing on reducing 30-day readmissions by better identifying which patients are at risk to be readmitted and ensuring they are communicating frequently with those patients, as well as delivering follow-up care after they’re discharged from the hospital.

“We have a few big data projects, but before you try to handle large amounts of diverse outside data and streaming data—how people now define ‘big data’—you have to deal with the data that’s in your wheelhouse,” Burke says. “The industry is moving toward an environment where genomics and other types of precision medicine are going to be a competitive requirement. When we looked at what’s coming, we decided that it’s critical that we focus on implementing a learning health systems model.”


The learning health systems model, first defined by the Institute of Medicine, calls for alignment of medical research, informatics and incentives at a health system to focus on continuous improvement and innovation. The overarching goal is to use analytics and clinical research to constantly find ways to increase the quality of clinical care, and embed those improvements in systemwide best practices. “That’s not going to happen organically—you have to align a whole bunch of people and processes and technologies,” Burke adds.

The big data readmissions initiative was another step down the path toward having a learning systems model in place. UNC Health Care combined internal EHR information with ZIP Code and person-level socioeconomic data to significantly improve its ability to identify current patients at high risk of being readmitted. Armed with that information, the health system is working on ways to embed that intelligence into its workflows to keep a close eye on high-risk patients and decrease the chances that health problems will occur.

UNC Health Care’s analytics effort is powered by an advanced systems architecture.

The health system decided a few years ago to standardize its electronic health records platform by adopting an enterprise EHR from Epic. Since then, it has adopted a number of Epic modules, including registration and scheduling, health information management, hospital billing and an Epic research application, but it has also created a separate enterprise data warehouse environment, called the Carolinas Data Warehouse, that brings together data from its EHR and other core systems such as laboratory, pharmacy and registration.

The warehouse runs on the Netezza platform from IBM. That foundation is structured on IBM’s Unified Data Model, a healthcare-specific blueprint for data warehouse design, business terminology and analysis templates. IBM’s InfoSphere DataStage is used for ETL (extract, transform and load) from the health system’s core software as well as for external data feeds, such as claims data.

Sitting atop the IBM foundation is a SAP Business Objects environment that UNC Health Care analysts use for business intelligence and reporting. Beyond that core foundation, UNC Health Care is using a slew of analytics tools to mine its information. “As an academic medical center, we have a lot of researchers getting into the data, so there are a number of cases where SAS and R tools are being used, and we’re starting to see an increasing amount of Python,” Burke says. SAS, R and Python are packages of tools and programming languages that enable users to build statistical models and algorithms.


“You have to think about overall strategy when developing the architecture,” Burke says. “While there are a lot of solutions being offered by vendors of transaction-based systems, you’re not going to be operating in that type of a homogeneous data environment, where all the data is formatted and standardized according to one vendor’s data modeling specifications. You don’t want to have any constraints on what types of data you can model in your data warehouse.”

UNC Health Care’s readmissions initiative utilized predictive analytics to see whether it could improve its scoring methodology to identify which patients were most at risk of returning to the hospital within 30 days of discharge. With the help of Forecast Health, a Durham, N.C.-based predictive analytics firm, the health system utilized EHR data and ZIP code and person-level socioeconomic data to develop individual risk scores. The risk scores are compiled from numerous variables run through the predictive algorithms to calculate numerical grades, which indicate low, moderate or high risks for readmission.
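The general shape of that scoring step can be sketched in Python. This is a hypothetical illustration only: the variable names, weights and risk thresholds below are invented for the example and do not come from UNC Health Care or Forecast Health.

```python
import math

# Invented weights for illustration -- not the actual model.
WEIGHTS = {
    "prior_admissions_12mo": 0.45,
    "num_chronic_conditions": 0.30,
    "length_of_stay_days": 0.08,
    "lives_alone": 0.45,             # person-level socioeconomic signal
    "area_deprivation_index": 0.02,  # ZIP-code-level socioeconomic signal
}
INTERCEPT = -3.0

def readmission_risk(patient: dict):
    """Logistic-regression-style score, bucketed into low/moderate/high."""
    z = INTERCEPT + sum(WEIGHTS[k] * patient.get(k, 0) for k in WEIGHTS)
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of 30-day readmission
    if p < 0.10:
        grade = "low"
    elif p < 0.30:
        grade = "moderate"
    else:
        grade = "high"
    return p, grade

p, grade = readmission_risk({
    "prior_admissions_12mo": 3,
    "num_chronic_conditions": 4,
    "length_of_stay_days": 6,
    "lives_alone": 1,
    "area_deprivation_index": 70,
})
print(f"risk={p:.2f} grade={grade}")
```

In production, a model like this would be trained on historical discharge data rather than hand-weighted, and the bucket thresholds would be tuned to the care team's capacity for follow-up outreach.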

The first question the health system wanted to answer, Burke says, is whether using advanced analytic methods, in this case predictive analytics, could improve its scoring. More than 30 models were tested using machine learning and other techniques. “The answer was a resounding yes, we found that those analytic methods provided a dramatic improvement in our ability to predict risk,” Burke says.

The predictive models developed during the effort were more than 30 percent more accurate at identifying patients at risk who were later readmitted to the hospital.

UNC Health Care used the C-Statistic as its standard measure of the predictive accuracy of its logistic regression models. The C-Statistic gives the probability that a randomly selected patient who experienced an event, such as a readmission, had a higher risk score than a randomly selected patient who did not experience the event.

The accuracy of its models approached 0.9 on the C-Statistic (a value of 1.0 indicates perfect prediction). “You really can’t get much higher than that, which was an essential validation of the infrastructure we put in place,” Burke says. “It may seem obvious that advanced methodologies would yield better results, but many hospitals have used standard methodologies like the LACE index for years, and it’s important that you provide evidence to stakeholders that advanced analytics can be clinically effective.”
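For context, the LACE index Burke mentions is a simple additive score (Length of stay, Acuity of admission, Comorbidity, Emergency department visits) rather than a fitted model. A sketch of the commonly published point values follows; verify them against a clinical reference before any real use, as this is included only to show how much simpler such a baseline is than a machine-learned model.

```python
def lace_score(los_days: int, acute_admission: bool,
               charlson_index: int, ed_visits_6mo: int) -> int:
    """LACE index, 0-19; point values as commonly published (illustrative)."""
    # L: length of stay
    if los_days < 1:
        l = 0
    elif los_days <= 3:
        l = los_days        # 1, 2 or 3 points
    elif los_days <= 6:
        l = 4
    elif los_days <= 13:
        l = 5
    else:
        l = 7
    a = 3 if acute_admission else 0          # A: acute/emergent admission
    c = charlson_index if charlson_index <= 3 else 5  # C: Charlson comorbidity
    e = min(ed_visits_6mo, 4)                # E: ED visits in prior 6 months
    return l + a + c + e

# Scores of 10 or more are often treated as high risk.
print(lace_score(los_days=5, acute_admission=True,
                 charlson_index=2, ed_visits_6mo=3))
```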

The second question the health system wanted answered was whether utilizing the ZIP code and person-level socioeconomic data would improve the accuracy of its predictive models. The answer to that question, Burke says, was mixed. Including the sociodemographic data didn’t provide much of a lift in terms of performance of the models, but that’s in large part because the models were so accurate using only EHR data. “We didn’t find that sociodemographic data improved the models, but it did reinforce our understanding of the relationship between the longitudinal clinical data and sociodemographic information. As I tell students when I’m teaching here, there is a correlation between buying milk and buying cereal.”

However, while the sociodemographic data didn’t necessarily improve the accuracy of the risk models, it did make the analyses much more actionable, a critical advantage to care providers. “It’s one thing to know that a patient is at high risk for readmission, it’s another thing to know that patient lives in a food desert, or that they live alone or are under a great deal of financial stress. That type of information gives physicians and care managers an idea of what conversations to have with each individual.”

The big data effort was recently completed, and UNC Health Care is now in the process of devising ways to incorporate the findings into clinical and operational workflows. “Once you have the analytics in place, the trick is devising ways to plug the intelligence into the EHR, which is ideally the way you want caregivers to see it,” Burke says. “But we also have to determine what data we want to embed into the workflows—we want to present more than just a score, we want that actionable data available too.”

But getting to the point where it could utilize advanced analytics took a few years and a significant buildup of staff and technology, Burke says. The effort relies on a set of “pillars” UNC Health Care has committed to for creating an advanced analytics environment, including:

1. Analytical product solutions development: Instead of one-off projects, the health system focuses on engineering reusable data and analytics assets that can be used as a single source of truth across the enterprise. “We want to put assets in place so it doesn’t matter what question you’re asking, you can use those assets to get an answer,” Burke says.

2. Data governance: The health system continues to work on data quality algorithms, master data management methodologies, and provisioning information owners and data governance teams to define key assets. “We’re trying to refine the octane of the data fuel we’re putting into our infrastructure,” Burke says.

3. Analytical consulting: Different business units will continue to rely on their own analysts, most of whom have no experience with inferential analytics, data mining, neural networks or machine learning, among other techniques. Burke’s division has a group of highly trained analysts tasked with providing those skills and/or training to analysts in those business units.

4. End-user enablement: UNC Health Care has launched an effort to lower barriers for end users. “Basically, we’re developing strategies to enable end users to get their analytics needs met without being trained on 30 different tools,” Burke says.
