Machine learning can understand text reports written by radiologists
Researchers from the Icahn School of Medicine at Mount Sinai have leveraged natural language processing algorithms to automatically identify clinical concepts in radiologist reports for computed tomography scans.
Using more than 96,000 radiologist reports associated with head CT scans performed at The Mount Sinai Hospital and Mount Sinai Queens, researchers trained the computer software to understand text reports written by radiologists, achieving an accuracy of 91 percent. The NLP algorithms were used to teach the computer clusters of phrases, including words such as phospholipid, heartburn and colonoscopy.
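As a rough illustration of how report text can be turned into numbers a model can learn from, the sketch below builds simple word-count ("bag-of-words") vectors. It is not the study's method, which the article does not describe in detail; the two sample reports and the helper names `tokenize` and `bag_of_words` are invented for this example.

```python
import re
from collections import Counter


def tokenize(report):
    """Lowercase a report and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", report.lower())


def bag_of_words(report, vocabulary):
    """Count how often each vocabulary word appears in the report."""
    counts = Counter(tokenize(report))
    return [counts[word] for word in vocabulary]


# Invented sample reports (assumption: not taken from the study's data).
reports = [
    "No acute intracranial hemorrhage.",
    "Acute subdural hemorrhage along the left convexity.",
]

# The vocabulary is every distinct word seen across the reports, sorted.
vocabulary = sorted({word for report in reports for word in tokenize(report)})

# Each report becomes a fixed-length vector of word counts.
vectors = [bag_of_words(report, vocabulary) for report in reports]
```

Vectors like these, paired with labels assigned by clinicians, are one common starting point for training a text classifier.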
Results of the study were published this week in the journal Radiology.
“The language used in radiology has a natural structure, which makes it amenable to machine learning,” says senior author Eric Oermann, MD, an instructor in the Department of Neurosurgery at the Icahn School of Medicine. “Machine learning models built upon massive radiological text datasets can facilitate the training of future artificial intelligence-based systems for analyzing radiological images.”
According to Oermann, the text in radiologist reports is “often a lot simpler” in syntax and lexicon than that found in English-language novels, news articles and even medical discharge summaries.
“The actual language of radiology as a field is markedly different from normal English, which makes it a more tractable problem with natural language processing,” contends Oermann. “Radiology, specifically, benefits from the fact that the language is highly standardized—especially in the modern EHR era with a lot of templates that are used.”
“The success of this approach benefits from the standardized language of these reports,” conclude the study’s authors.
However, Oermann acknowledges that “it’s a big jump” to go from an interpretation of a report to an interpretation of an image and then to a diagnosis—an as-yet unsolved problem. “We’re very much in the early stage of machine learning in healthcare.”
Other research has highlighted the challenges still facing machine learning in analyzing free text in radiology reports. Recently, study results published in the Journal of the American College of Radiology concluded that allowing radiologists to report their findings using free text rather than structured templates increases the variability of reports in language and length, making the reports harder to use and making it more difficult for machine learning to predict diagnoses from them. That finding suggests that structured templates for radiology reports could improve diagnostics, make results easier to understand, enhance billing and assist in population health.
Nonetheless, Oermann and his co-authors see the machine learning technique they studied as an important building block in the development of artificial intelligence that could interpret scans and diagnose conditions.
“The ultimate goal is to create algorithms that help doctors accurately diagnose patients,” says first author John Zech, a medical student at the Icahn School of Medicine. “Deep learning has many potential applications in radiology—triaging to identify studies that require immediate evaluation, flagging abnormal parts of cross-sectional imaging for further review, characterizing masses concerning for malignancy—and those applications will require many labeled training examples.”