NIH Clinical Center releases CT image dataset to researchers

The National Institutes of Health has released one of the largest publicly available datasets of CT images to the scientific community to improve detection accuracy for lesions.

The large-scale dataset, called DeepLesion, has more than 32,000 annotated lesions identified on CT images. That’s significantly larger than most publicly available medical image datasets, most of which have fewer than a thousand lesions.

In addition, most currently available medical image datasets of lesions cover only one type of lesion. The DeepLesion database, by contrast, contains different kinds of critical radiology findings from across the body, such as lung nodules, liver tumors and enlarged lymph nodes.

Before the release, the data from 4,400 unique patients was thoroughly anonymized, according to NIH. The images come from patients at the NIH Clinical Center, a research hospital, where radiologists measured and marked clinically meaningful findings with “electronic bookmarks.” Scientists used those bookmarks to develop the DeepLesion dataset.

The agency hopes that releasing the de-identified dataset will enable researchers to teach computers how to read and process extremely large numbers of complex images such as CT scans.

“The dataset released is large enough to train a deep neural network—it could enable the scientific community to create a large-scale universal lesion detector with one unified framework,” NIH officials believe.

A universal lesion detector would help radiologists find all types of lesions and “may open the possibility to serve as an initial screening tool and send its detection results to other specialist systems trained on certain types of lesions,” according to the agency. Other goals are to measure the sizes of all lesions more accurately and automatically, and to mine and study the relationships between different types of lesions.

Going forward, the NIH Clinical Center plans to collect more data for the DeepLesion dataset to improve the detection accuracy of systems trained on it.

“The universal lesion detecting capability will become more reliable once researchers are able to leverage 3-D and lesion type information,” according to the agency. “It may be possible to further extend DeepLesion to other image modalities such as MRI and combine data from multiple hospitals, as well.”
