At Johns Hopkins Medicine, big data and analytics are at the core of the organization’s goal to tailor medical treatments and procedures to individual patients.

Launched in 2012 as the Johns Hopkins Individualized Health Initiative, or Hopkins inHealth, the effort is a collaboration among Johns Hopkins University (which includes the medical school), the health system, and the Johns Hopkins Applied Physics Laboratory.

Johns Hopkins Medicine includes the 1,192-bed Johns Hopkins Hospital, the system’s flagship; five other hospitals; and 40 outpatient specialty and primary-care sites.

The goal of Hopkins inHealth is to discover new scientific measurements and models to predict the trajectory of diseases in current patients as well as how each patient’s unique genetic makeup is likely to respond to medical treatments and procedures. To fuel those discoveries, Johns Hopkins plans to mine data from myriad sources, including electronic health records, DNA sequences and digital images.

“At its core, big data is about massive amounts of electronic patient information that can be mined to yield tailored medical results,” explains Scott Zeger, director of Hopkins inHealth and a biostatistics professor at Hopkins Bloomberg School of Public Health. Based on those results, physicians—in collaboration with their patients—can develop high-quality and cost-effective treatment plans.

To move from vision to reality, Hopkins inHealth’s leaders are bringing together the university’s resources to maximize the potential of individual projects. Those resources include research funding of pilot projects; access to a range of hardware and software platforms on campus, including supercomputers, clinical cohort databases and measurement devices; and expertise in study design and data analysis.

The Johns Hopkins Hospital

In some cases, the initiative also seeks to connect researchers with external funds and expertise, such as those of commercial technology companies. One example is a partnership between Johns Hopkins and Microsoft to develop a solution, built on Microsoft’s Azure cloud platform, that collects, integrates and analyzes data from monitoring devices in the ICU. The goal is to help physicians spot potential problems in their patients’ medical care that could lead to injuries and complications.

Many other big data projects are underway to improve healthcare outcomes in areas such as radiation oncology, autoimmune diseases, interventional cardiology, cancer screening, prostate cancer and cystic fibrosis.

Although many of these projects are in the exploration stage, Oncospace, a SQL database and set of clinical support tools to improve treatment planning and medical outcomes in radiation oncology, is already being used in direct patient care.

Oncospace includes electronic medical data on cancers of the head and neck, prostate, pancreas and lung for 2,300-plus patients. Data for each type of cancer is stored in a separate cohort database on the same server, with each database using the same schema.
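The article does not publish Oncospace’s schema, but the one-schema, many-cohort-databases arrangement it describes can be sketched in a few lines. This is an illustrative sketch only; the table and column names below are assumptions, not Oncospace’s actual design.

```python
import sqlite3

# Hypothetical shared schema: every cohort database is created from the
# same definition, so the same queries and tools work against any cohort.
SCHEMA = """
CREATE TABLE patients (
    patient_id   INTEGER PRIMARY KEY,
    diagnosis    TEXT,
    plan_dose_gy REAL
);
"""

def create_cohort_db(path):
    """Create one cohort database initialized with the shared schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

# Two cohorts, separate databases, identical schema (in-memory for the demo).
head_neck = create_cohort_db(":memory:")
prostate = create_cohort_db(":memory:")

head_neck.execute("INSERT INTO patients VALUES (1, 'head-and-neck', 70.0)")
prostate.execute("INSERT INTO patients VALUES (1, 'prostate', 78.0)")

# Because the schema is identical, one query runs unchanged on every cohort.
for db in (head_neck, prostate):
    print(db.execute("SELECT diagnosis, plan_dose_gy FROM patients").fetchall())
```

The payoff of this design is that adding a new cancer type means adding a new database, not rewriting the analysis tools.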


While the number of patients in the database is relatively small today, Zeger expects Oncospace to grow in size, and its query processing to become more complex, as new types of data and “millions” of new patient cases are added, not only from Johns Hopkins but from other academic medical centers as well.

The first predictive model Johns Hopkins developed in radiation oncology helps customize radiation treatment plans for cancer patients.

“The goal of radiation therapy is to treat the cancerous tissues while sparing as much as possible the normal tissues that surround it,” says Todd McNutt, director of clinical informatics for radiation oncology and molecular radiation sciences at Johns Hopkins, and the lead researcher and developer on the Oncospace project.

In head-and-neck cancers, for example, physicians want to spare critical anatomy involved in swallowing and talking.

But achieving this goal is not easy because the shape of tumors and the physical space between cancerous tissues and vital organs vary with each patient, making radiation therapy planning a complicated and labor-intensive process.


That is why the first algorithm McNutt and co-researchers developed predicts the amount of radiation a vital organ will receive based on the spatial relationship between the vital organ and the cancerous tissue.

The predictions help physicians and medical physicists figure out “how good of a treatment plan we should be able to generate for that new patient,” McNutt explains.

A tool built in Python queries and processes the data to make predictions for current patients, McNutt says.
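The algorithm described above, predicting the dose a vital organ will receive from its spatial relationship to the tumor, can be illustrated with a toy model. This is a minimal sketch of the general idea, not the actual Oncospace model; the exponential falloff, the parameter values and the organ distances are all assumptions for illustration.

```python
import math

def predicted_dose(distance_cm, prescribed_dose_gy=70.0, falloff_cm=1.5):
    """Hypothetical model: dose to an organ-at-risk voxel decays with its
    distance to the cancerous tissue. In practice, such a curve would be
    fit from past patients' plans rather than assumed."""
    return prescribed_dose_gy * math.exp(-max(distance_cm, 0.0) / falloff_cm)

# Assumed distances (cm) from voxels of an organ at risk, such as a
# salivary gland, to the nearest tumor boundary.
voxel_distances = [0.5, 1.0, 2.0, 4.0]
doses = [predicted_dose(d) for d in voxel_distances]

# A summary like the mean predicted dose gives planners a benchmark for
# how much sparing of the organ should be achievable in a new plan.
mean_dose = sum(doses) / len(doses)
print(round(mean_dose, 1))
```

The key point the sketch captures is that voxels close to the tumor are predicted to receive nearly the prescribed dose while distant voxels can be spared, which is exactly the benchmark planners compare a candidate plan against.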

Results generated for each current patient are fed directly into a commercially available radiation therapy treatment-planning system, which imports images from CT scans as well as other information to calculate how to optimize the treatment plan for a given patient.

The information stored in Oncospace includes data on past patients’ radiation therapy plans, results, negative side effects and long-term outcomes, such as whether the cancer recurred. Oncospace also includes three-dimensional definitions of patients’ anatomy based on CT images and 3-D radiation dose distributions, which McNutt describes as very coarse images.

Patient data from MOSAIQ, an electronic medical records system for radiation oncology from Elekta, is pulled into Oncospace automatically through periodic updates. Data from Pinnacle, a Philips radiation treatment planning system used at Hopkins, is pushed into Oncospace manually.
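The two ingestion paths just described, a scheduled automatic pull and an explicit manual push, are a common integration pattern and can be sketched as follows. The function names, record shapes and merge-by-update behavior here are assumptions for illustration, not the actual Oncospace or MOSAIQ interfaces.

```python
# Hypothetical merged store: patient_id -> combined record.
oncospace = {}

def periodic_pull(records_source):
    """Automatic path: poll the records system on a schedule and merge
    whatever new entries it reports (sketch of the MOSAIQ-style flow)."""
    for patient_id, record in records_source.items():
        oncospace.setdefault(patient_id, {}).update(record)

def manual_push(patient_id, plan):
    """Manual path: a planner explicitly exports one treatment plan
    (sketch of the Pinnacle-style flow)."""
    oncospace.setdefault(patient_id, {}).update(plan)

# Records arrive automatically on a schedule...
periodic_pull({101: {"diagnosis": "head-and-neck"}})
# ...while a treatment plan is pushed by hand for the same patient.
manual_push(101, {"prescribed_dose_gy": 70.0})
print(oncospace[101])
```

Either way, both paths converge on one record per patient, which is what makes the downstream predictive queries possible.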

McNutt also developed web-based assessment forms to ensure that all pertinent patient data is collected from both clinicians, via MOSAIQ, and patients, via an iPad app.

The tool for clinicians, which is integrated into MOSAIQ, collects information on both negative side effects and the absence of side effects from radiation oncology, such as loss of swallowing function.

McNutt opted to build this clinician-facing tool rather than use natural language processing to extract unstructured data from physicians’ notes in the electronic health record system, because physicians don’t always report the absence of side effects in their notes. As McNutt explains, “As you can imagine, if you’re trying to build a knowledge base, you also want to be definitive when this particular patient did not have a problem.”

Patients contribute data, too, answering questions about their quality of life on an iPad app while sitting in either a waiting room or exam room.

All of this information is critical to the new algorithms that McNutt and his colleagues are developing to predict patients’ risk of negative side effects from radiation therapy, such as loss of swallowing or speech function or excessive weight loss.

In addition to developing new predictive models focused on medical outcomes, McNutt would like to expand the types of data stored in Oncospace to include information from sources such as patient biopsies, chemotherapy regimens, surgeries, or even genomic data on individual patients. “That kind of information has the ability to be predictive of outcome, and we don’t have good ties to all of that data,” McNutt says.

McNutt and his co-researchers also will be able to tap into cancer treatment data for many more patients through the Oncospace Consortium, which Johns Hopkins created to facilitate collaboration and data sharing among academic medical centers. Current consortium members include the University of Washington, the University of Toronto-Sunnybrook, and the University of Virginia, and McNutt plans to recruit others.

While the consortium’s infrastructure is still under development, the concept is that each university will manage its own database and set of analytic tools internally but share de-identified patient data with other members via a shared website, which also provides access to a shared source-code repository.

Plans also call for modeling the underlying data architecture on a “federated” approach to big data storage and processing originally developed at Johns Hopkins to process multiple terabytes of raw data extracted from digital images of the sky taken by telescopes in numerous geographic locations.

Using this approach, query processing for the Oncospace Consortium would be divided among the databases and analytics platforms at each university and the results integrated at a central server, Zeger says. “We can very rapidly make the calculations we need for a particular patient, and get it back to that patient while they are sitting there with their doctor, but the data will stay at the home institution,” Zeger says.
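The federated pattern Zeger describes, where each site answers a query locally and a central server integrates only the aggregates, can be sketched in a few lines. This is an illustrative sketch, not the consortium’s implementation; the site names, dose values and choice of a pooled-mean query are all assumptions.

```python
def local_aggregate(doses):
    """Run the query at the home institution; return only de-identified
    summary statistics, never the raw patient records."""
    return {"n": len(doses), "sum": sum(doses)}

# Hypothetical per-site data: each list stays at its home institution.
site_data = {
    "site_a": [62.1, 70.0, 54.0],
    "site_b": [66.0, 58.5],
    "site_c": [71.2, 63.3, 60.0],
}

# The central server integrates per-site aggregates into one answer,
# dividing the query work among the member databases.
results = [local_aggregate(doses) for doses in site_data.values()]
total_n = sum(r["n"] for r in results)
pooled_mean = sum(r["sum"] for r in results) / total_n
print(total_n, round(pooled_mean, 2))
```

Because only counts and sums cross institutional boundaries, the raw data stays at the home institution, which is the property Zeger emphasizes, while the combined result can still be computed quickly enough to use during a patient visit.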

And that is why Oncospace could serve as a model for other big data clinical applications supported by Hopkins inHealth, Zeger says.
