AuntMinnie 2018: NIH releases massive database of chest x-rays

Editor's note: As part of the celebration of AuntMinnie.com's upcoming 25th anniversary, we're presenting 25 for 25 -- a series featuring our most popular content for each of the last 25 years. New articles will be published each Monday until our official anniversary at RSNA 2024. Our top article in 2018 covered an important early development that supported efforts to train AI algorithms.

With the goal of spurring research into medical applications of artificial intelligence, the U.S. National Institutes of Health (NIH) has released to the public a database of more than 100,000 chest x-rays and corresponding data.

The database is designed to allow researchers from across the U.S. and the world to access datasets that can be used to test artificial intelligence algorithms. Such algorithms can help teach computers how to detect and diagnose disease, the NIH said in a September 27 announcement.

The database was created from studies of 30,000 patients who were seen at the NIH Clinical Center in Bethesda, MD; the information has been anonymized to protect patient identities. A number of the images are of patients with advanced lung disease, according to the NIH. The NIH Clinical Center is the country's largest hospital devoted entirely to clinical research, and patients who are seen there voluntarily agree to participate in clinical trials.

The NIH database includes more than 100,000 chest x-rays. Image courtesy of the NIH.

The NIH noted that interpreting a chest x-ray is a complex reasoning problem that "often requires careful observation and knowledge of anatomical principles, physiology, and pathology." These factors can make it more difficult to develop a consistent and automated technique for interpreting chest x-rays while also considering all possible thoracic diseases.

By making the dataset publicly available, the NIH hopes that academic and research institutions will be able to develop computer algorithms for reading and processing an extremely large number of images, confirming the results that radiologists have found and potentially identifying other findings as well.

In particular, algorithms could detect subtle changes that might occur over multiple x-rays, benefit patients in developing countries who don't have access to radiologists to read their x-rays, or help researchers create a "virtual radiology resident" that can later be taught to read more complex images such as MRI and CT.

The database is described in a paper published earlier this year on the website of the Computer Vision Foundation.
