AI can find, classify pleural effusions on chest x-rays

May 17, 2018


An artificial intelligence (AI) algorithm has shown promising results for finding and classifying pleural effusion on chest x-ray studies, potentially helping to speed up interpretation times and triage priority cases.

The deep-learning algorithm yielded 95% sensitivity and 100% specificity for detecting the presence of pleural effusion on chest radiographs in a pilot study presented at the recent American Roentgen Ray Society (ARRS) meeting in Washington, DC. Dr. Rita Kanesa-thasan and Dr. Paras Lakhani from Thomas Jefferson University presented the results.

The algorithm also performed well in characterizing effusions by size, although the researchers believe there is still room for improvement in accurately classifying smaller effusions.

"As a resident, I think one key application [for the algorithm] could be its utility in helping triage multiple 'stat' reports," Kanesa-thasan told AuntMinnie.com.

Triage tool

Previously, co-author Lakhani reported that deep convolutional neural networks (CNNs) could accurately classify tuberculosis on chest radiography. Because chest radiographs are frequently encountered in a hospital setting, the researchers sought to explore the potential use of machine learning to help better triage these exams.

Prior studies have also investigated the utility of deep learning for detecting the presence or absence of common pathology on chest radiography, such as pleural effusion or cardiomegaly, Kanesa-thasan said.

"However, to focus the scope of our initial project, we chose to focus on a commonly encountered entity (pleural effusions) and to assess whether [deep CNNs] could not only detect the presence versus absence of an effusion but also take the next step in classifying the laterality and approximate size of the effusion -- such as a small left pleural effusion versus a large right pleural effusion," Kanesa-thasan said.

Heat maps represent areas of the radiograph that were activated by the deep-learning network and are used to determine the prediction class (large, moderate, or small pleural effusion). In addition to highlighting the meniscus of the effusion, the heat map also includes the lung parenchyma far above the meniscus. The network may be using the degree of lung aeration to differentiate these classes of effusion; the case shown is a moderate effusion. Image courtesy of Dr. Rita Kanesa-thasan and Dr. Paras Lakhani.

The researchers selected 486 radiographs from their institution's PACS archive after first ensuring they met general standards for image quality; they then verified the effusion size against the radiology report. Radiographs were included if the radiology report text had the following terms: "no pleural effusion," "small [right/left] pleural effusion," or "large [right/left] pleural effusion." Cases were excluded if the patient was younger than 18 years or if the lung apex or base(s) were not in view.

The radiographs were then deidentified, stored as JPEGs, divided vertically in half, and resized to 256 x 256 pixels to yield 972 images of a hemithorax. These included 323 control cases with no effusion, 240 small effusions (occupying less than one-third of the hemithorax), 182 moderate effusions (one-third to two-thirds of the hemithorax), and 227 large effusions (more than two-thirds of the hemithorax).
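The split-and-resize step can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: it assumes images are plain nested lists of grayscale values and uses nearest-neighbor sampling, and the function names and the toy radiograph are hypothetical.

```python
# Minimal sketch: split a frontal radiograph into two hemithorax images
# and resize each to 256 x 256 with nearest-neighbor sampling.
# Images are nested lists of grayscale values (a list of pixel rows).

def split_vertically(image):
    """Return (left_half, right_half) of the image as displayed."""
    mid = len(image[0]) // 2
    left = [row[:mid] for row in image]
    right = [row[mid:] for row in image]
    return left, right

def resize_nearest(image, out_h=256, out_w=256):
    """Nearest-neighbor resize; a real pipeline would likely use an imaging library."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def to_hemithoraces(image):
    """Split one radiograph and resize both halves."""
    return [resize_nearest(half) for half in split_vertically(image)]

# Hypothetical 512 x 600 radiograph: left half dark (0), right half bright (255).
radiograph = [[0] * 300 + [255] * 300 for _ in range(512)]
halves = to_hemithoraces(radiograph)
print(len(halves), len(halves[0]), len(halves[0][0]))  # 2 256 256

# 486 radiographs x 2 halves gives the 972 hemithorax images in the article.
print(486 * 2)  # 972
```

Splitting each radiograph doubles the number of training examples and lets each half carry its own size label.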

The images were also augmented 32-fold using rotation, Gaussian blurring, and edge and contrast enhancement prior to being used for training the deep-learning model to detect and classify effusions on the chest radiographs. Of the 972 images, 8% were randomly selected and set aside for testing the algorithm.
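The hold-out arithmetic can be illustrated with a short sketch. The test-set size of 80 is the figure reported for testing; the fixed random seed, the function name, and the assumption that augmentation was applied only to the training images are illustrative choices that the article does not confirm.

```python
import random

N_IMAGES = 972        # hemithorax images reported in the article
N_TEST = 80           # test images reported in the article (about 8%)
AUGMENT_FACTOR = 32   # "32-fold" augmentation

def holdout_split(n_images, n_test, seed=0):
    """Shuffle image indices and reserve n_test of them for testing."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    indices = list(range(n_images))
    rng.shuffle(indices)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = holdout_split(N_IMAGES, N_TEST)

# Assumption: augmentation is applied to training images only, and "32-fold"
# means each training image yields 32 training samples in total.
n_training_samples = len(train_idx) * AUGMENT_FACTOR

print(len(train_idx), len(test_idx))      # 892 80
print(round(100 * N_TEST / N_IMAGES, 1))  # 8.2
print(n_training_samples)                 # 28544
```

Setting the test images aside before augmenting would keep altered copies of a test image out of the training set.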

A pretrained CNN

The researchers used a GoogLeNet CNN in the Caffe deep-learning framework; the network had been pretrained on more than 1 million images from the ImageNet database before encountering the radiographs used in the study. In testing, the CNN correctly identified the presence of an effusion in 76 (95%) of the 80 images (confidence level = 94% ± 5%).

"Interestingly, all false negatives were originally 'small left effusions' that were misclassified by the [deep CNN] to be 'no left effusion,' " Kanesa-thasan said.

The researchers noted that there is still room for improvement in characterizing the size of effusions -- particularly small and moderate ones.

Algorithm performance by size of pleural effusion
Metric	Large effusions	Moderate effusions	Small effusions
Sensitivity	100%	92%	72%
Specificity	98%	99%	99%
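For readers unfamiliar with per-class figures like these, the sketch below shows one common way to compute them: treat one size class as "positive" and all other classes as "negative." The article does not say how the researchers computed their numbers, so the one-vs-rest approach is an assumption, and the labels in the example are invented for illustration and are not the study's data.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Sensitivity and specificity for one class against all other classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)  # share of true positives that were found
    specificity = tn / (tn + fp)  # share of true negatives that were cleared
    return sensitivity, specificity

# Hypothetical labels for illustration only; NOT the study's data.
y_true = ["none", "small", "small", "small", "small",
          "moderate", "moderate", "large", "large", "none"]
y_pred = ["none", "small", "small", "small", "none",
          "moderate", "small", "large", "large", "none"]

print(one_vs_rest_metrics(y_true, y_pred, "large"))  # (1.0, 1.0)

sens_small, spec_small = one_vs_rest_metrics(y_true, y_pred, "small")
print(sens_small)  # 0.75
```

In this toy example, one small effusion called "none" lowers sensitivity for the small class, while one moderate effusion called "small" lowers its specificity.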

"Additionally, we noticed that threshold cases -- between small and moderate or between moderate and large -- were more challenging for the CNN, although those would be tough for humans as well," she said.

Laterality issues

With regard to laterality, the CNN initially determined right- versus left-sided pathology correctly in only 71 (89%) of the effusions. Of the nine incorrect answers, seven were actually large right pleural effusions that were misclassified by the CNN as large left pleural effusions.

"Ultimately, we were able to achieve 100% accuracy for identification of laterality by discontinuing 'horizontal flipping,' a method we had initially used to augment our training dataset," she said.
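One plausible reading of why flipping hurt is that a mirrored image no longer matches its original side label. The toy sketch below illustrates that mismatch; it is not the researchers' pipeline. The function names and tiny "image" are hypothetical, and sides are labeled from the viewer's perspective for simplicity rather than by radiographic convention.

```python
OPPOSITE = {"left": "right", "right": "left"}

def flip_horizontal(image):
    """Mirror an image (nested list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def flip_with_label(image, side):
    """Mirror the image and swap the side label so the two stay consistent."""
    return flip_horizontal(image), OPPOSITE[side]

def brighter_side(image):
    """Toy stand-in for 'where is the effusion': the half with more brightness."""
    mid = len(image[0]) // 2
    left = sum(sum(row[:mid]) for row in image)
    right = sum(sum(row[mid:]) for row in image)
    return "left" if left > right else "right"

# Hypothetical 2 x 4 image: bright pixels (9) mark the effusion on the right.
image = [[0, 0, 0, 9],
         [0, 0, 9, 9]]

flipped = flip_horizontal(image)
print(brighter_side(image), brighter_side(flipped))  # right left

# Naive augmentation keeps the old label, which now contradicts the pixels.
naive_label = "right"
print(naive_label == brighter_side(flipped))  # False

# Label-aware augmentation keeps image and label consistent.
aware_image, aware_label = flip_with_label(image, "right")
print(aware_label == brighter_side(aware_image))  # True
```

Dropping horizontal flips, as the researchers did, avoids the problem entirely; swapping the label on each flip is an alternative that the article does not describe them using.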

In future work, the researchers would like to increase the sample size of radiographs, experiment with other deep-learning networks, and extend their statistical analysis to include area-under-the-curve data, Kanesa-thasan said.

"[We] would [also] like to incorporate additional findings that are commonly seen on chest radiographs (a limitation of our study is that it only focuses on effusions) and work toward combining this with other projects in the department," she said.