EHR data mining identifies undiagnosed genetic diseases

Researchers applied phenotype risk scores to nearly 22,000 genotyped individuals, says Vanderbilt’s Josh Denny, MD.


A new electronic health record data mining technique developed by researchers at Vanderbilt University Medical Center has found that undiagnosed genetic diseases may be more prevalent in the general population than previously assumed.

Researchers mapped the clinical features of more than 1,200 Mendelian diseases into phenotypes captured from the EHR and summarized this evidence as phenotype risk scores to find patterns of symptoms that may be caused by an underlying genetic variant.

By applying these phenotype risk scores to nearly 22,000 genotyped individuals, they uncovered 18 associations between rare variants and phenotypes consistent with Mendelian diseases. In 16 patients, the rare genetic variants were associated with severe outcomes such as organ transplants.

Results of their study, published Friday in the journal Science, suggest that patients diagnosed with heart failure, stroke, infertility and kidney failure could actually be suffering from rare and undiagnosed genetic diseases.

The EHR data mining technique was developed by Josh Denny, MD, professor of biomedical informatics and medicine and director of the Center for Precision Medicine, and Lisa Bastarache, lead data scientist with VUMC’s Center for Precision Medicine, as well as a team of collaborators.

“You can look for clusters of diseases that could be organized by a gene,” says Denny, who notes that examining outcomes in EHRs can be valuable in deciding if a genetic variant might be associated with a disease.

It’s a new way of looking at the EHR, according to Denny. “We were surprised by what we found; fundamentally, we’re using the fact that the electronic health record has a lot of diseases and phenotypes tracked in it, and we can mine those and assess them in a rapid way.”


In particular, Denny contends that assessing health outcomes in EHRs is critical to finding undiagnosed underlying disease. “That’s where we get the sense of the phenotypes themselves,” he adds, noting that the resulting phenotype risk score is high for individuals who are a close match and low for individuals who lack key features of the disease.
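
To make the idea of a score that is high for a close match and low otherwise concrete, here is a minimal sketch in Python. It is not the researchers' code. It assumes that each disease feature is weighted by the log of its inverse prevalence, so that rarer features count for more; the article does not describe the weighting. The function name, the feature names and the prevalence figures are all hypothetical.

```python
import math


def phenotype_risk_score(patient_phenotypes, disease_features, prevalence):
    """Sum the weights of the disease features observed in one patient.

    Assumption (not stated in the article): each feature's weight is
    log(1 / prevalence), so rarer features contribute more to the score.
    """
    score = 0.0
    for feature in disease_features:
        if feature in patient_phenotypes:
            score += math.log(1.0 / prevalence[feature])
    return score


# Hypothetical prevalence of each feature among all patients in the EHR.
prevalence = {"kidney failure": 0.02, "hearing loss": 0.05, "hematuria": 0.04}

# Hypothetical clinical features of one Mendelian disease.
disease_features = ["kidney failure", "hearing loss", "hematuria"]

# Two hypothetical patients, each described by phenotypes found in the EHR.
close_match = {"kidney failure", "hearing loss", "hematuria", "hypertension"}
poor_match = {"hypertension", "hearing loss"}

high = phenotype_risk_score(close_match, disease_features, prevalence)
low = phenotype_risk_score(poor_match, disease_features, prevalence)
print(f"close match: {high:.2f}, poor match: {low:.2f}")
```

In this sketch, the patient with all three features scores higher than the patient with only one, and phenotypes unrelated to the disease, such as hypertension here, do not change the score.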

“Phenotype risk scoring can easily be applied in any electronic medical record system that is linked to DNA,” said Bastarache in a written statement. “Our work looked at only a small sample of the human genome, about 6,000 variants. The opportunity for additional discoveries using this method is huge.”

VUMC’s pioneering efforts in precision medicine include BioVU, one of the nation’s largest DNA databanks, with about 250,000 unique samples of human DNA linked to 2.5 million de-identified electronic health records, according to Denny.

“To do exactly what we did, you would definitely need the DNA component as well” as an EHR, Denny concludes. “In the clinical world, I would love to try to turn this into something that’s more like decision support.”

The authors of the study, which was funded by grants from the National Institutes of Health, say their genetic analysis may be able to assist clinicians in arriving at a diagnosis.

In the study, 14 percent of patients with genetic variants affecting the kidney had kidney transplants, and 10 percent with another variant required liver transplants.

“If you understood that they had a genetic cause, and you understood that before they got their transplant, for those conditions there are specific treatments,” adds Denny. “If we can figure out who these people are that have undiagnosed disease like that, there’s something that could be done.”
