Vanderbilt helps researchers find linkages between genotype, EHR data

A Nashville, Tenn.-based healthcare provider is leveraging informatics, including natural language processing, to find linkages between de-identified genotype data and electronic health record data.


Launched in late 2016, Vanderbilt University Medical Center’s Phenotyping and PheWAS Core Services are available to researchers from VUMC and other institutions.

The services, offered under VUMC’s Center for Precision Medicine, include clinical phenotyping, or deriving phenotypes from EHR data, to uncover clinical knowledge, gene-disease relationships (genomics, genome-wide association studies and PheWAS) and gene-drug-outcome relationships (pharmacogenomics).

According to VUMC, the mission is to apply advanced informatics methods to “understand” unstructured—and sometimes inaccurate—biomedical text and electronic medical record data.

“EHR data can be messy, but it contains perhaps the single richest source of disease history, drug exposures and their response, and prognosis available for research,” says Josh Denny, MD, director of the Center for Precision Medicine and vice president for personalized medicine.

“Using advanced informatics through our core, we can help researchers go beyond what might be gleaned from demographics, billing codes and other structured data, to draw finer distinctions within the patient population based on unlabeled data, such as information in clinical notes,” adds Denny.

PheWAS, which stands for phenome-wide association study, consists of taking a genetic variant of interest and using custom algorithms to scan for associations with International Classification of Diseases codes appearing in the EHR.
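
As a rough illustration of that kind of scan, here is a minimal sketch in Python. It is not VUMC's method; the article does not describe the core's custom algorithms. The sketch assumes a simple design: for each billing code, patients are counted in a 2x2 table (carrier or non-carrier, code present or absent), a two-sided Fisher's exact test is applied, and a Bonferroni correction accounts for the number of codes tested. The function names `fisher_exact_two_sided` and `phewas_scan`, and the input layout, are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


def phewas_scan(carriers, patient_codes, alpha=0.05):
    """Test one variant against every billing code seen in the cohort.

    carriers: set of patient IDs that carry the variant (assumed input).
    patient_codes: dict of patient ID -> set of billing codes (assumed input).
    Returns (code, p_value, significant) tuples sorted by p-value.
    """
    all_codes = set().union(*patient_codes.values())
    if not all_codes:
        return []
    threshold = alpha / len(all_codes)  # Bonferroni correction (an assumption)
    results = []
    for code in all_codes:
        a = b = c = d = 0
        for pid, codes in patient_codes.items():
            carrier, has_code = pid in carriers, code in codes
            if carrier and has_code:
                a += 1
            elif carrier:
                b += 1
            elif has_code:
                c += 1
            else:
                d += 1
        p = fisher_exact_two_sided(a, b, c, d)
        results.append((code, p, p < threshold))
    return sorted(results, key=lambda r: (r[1], r[0]))
```

A production analysis would typically group raw codes into broader phenotypes and adjust for covariates such as age and sex, which this sketch omits.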

“The most common PheWAS approach is using billing codes,” observes Denny. “They’re imperfect, but they’re still very valuable.”

VUMC’s vast clinical repositories include medical records for about 2.8 million patients as well as DNA specimens from a subset of its EHR population—approximately 240,000 patients—with more than 90,000 of these specimens having been genotyped.

“Their ability to work with investigators to move from what may at first appear to be a simple clinical question, through the complexities of identifying a research cohort and defining the clinical outcome, is tremendously valuable,” says Sara Van Driest, MD, assistant professor of pediatrics and medicine.

“They have specific expertise in helping to translate clinical questions into electronic algorithms that make it possible to study large cohorts of patients,” adds Van Driest.
