Using machine-learning models with CT images helps clinicians predict interstitial lung abnormalities (ILAs), researchers have reported.
A team led by Akinori Hata, MD, of Brigham and Women's Hospital in Boston developed and tested 12 machine-learning algorithms and, for the top performer, found an area under the receiver operating characteristic curve (AUC) for predicting ILAs of 0.87. The findings were published September 3 in Radiology.
"We successfully developed 12 interstitial lung abnormality (ILA) probability prediction models, with some demonstrating high area under the receiver operating characteristic curve values, up to 0.87," the group wrote.
ILAs have been linked to poor clinical outcomes, including progressive respiratory symptoms, physiologic decline, increased incidence of lung cancer, and higher mortality rates, and they may be a precursor to pulmonary fibrosis, the investigators explained. Typically they are identified through visual assessment of CT images based on Fleischner criteria, but visual assessment can be both time-consuming and plagued by reproducibility issues; an automated technique based on machine learning could mitigate these problems.
Hata's team conducted a study that included 1,382 chest CT scans from patients in the Boston Lung Cancer Study, collected between February 2004 and June 2017; 75% of the scans were performed with contrast. The researchers developed 12 automated ILA probability prediction algorithms that included section inference and case inference models. The section inference model indicated ILA probability for each CT section, and the case inference model generated case-level ILA probability; for the case inference model, the group tested three classifiers: support vector machine (SVM), random forest (RF), and convolutional neural network (CNN).
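To make the two-stage idea concrete, the sketch below shows one way per-section probabilities could be condensed into case-level inputs. It is an illustration only: the article does not describe how the case inference model consumes the section outputs, so the feature names, the 0.5 threshold, the `top_k_mean` helper, and the example values are all assumptions rather than the authors' method.

```python
from statistics import mean


def case_features(section_probs, threshold=0.5):
    """Summarize per-section ILA probabilities into case-level features.

    Hypothetical feature set; a case-level classifier (SVM, RF, or CNN)
    would take something like this as input.
    """
    if not section_probs:
        raise ValueError("a case needs at least one CT section")
    n_above = sum(p >= threshold for p in section_probs)
    return {
        "max_prob": max(section_probs),
        "mean_prob": mean(section_probs),
        "frac_above": n_above / len(section_probs),
    }


def top_k_mean(section_probs, k=3):
    """Stand-in case score: average of the k most suspicious sections."""
    top = sorted(section_probs, reverse=True)[:k]
    return mean(top)


# Made-up section probabilities for a single five-section case.
example = [0.05, 0.10, 0.80, 0.90, 0.20]
features = case_features(example)
score = top_k_mean(example, k=2)
```

A trained classifier would replace `top_k_mean` in practice; the helper is only there to show a case-level probability emerging from section-level ones.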
Hata and colleagues assessed the models using the area under the receiver operating characteristic curve (AUC), and evaluated indeterminate CT exams using two- and three-label methods. The visual assessment of ILAs found on CT imaging by two radiologists and a pulmonologist served as ground truth. The team divided the study exams into a training set (96 exams), a validation set (24 exams), and a test set (1,262 exams).
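For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way in plain Python; the labels and scores are invented for illustration and are not study data, and the function name `roc_auc` is ours.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = ILA per ground truth, 0 = no ILA (invented values).
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)  # 5 of 6 pairs ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for a demonstration; library implementations use a sort-based approach instead.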
The group reported the following breakdown of the 1,382 CT exams:
- 8% (104) of the CT exams indicated ILA.
- 36% (492) of the CT exams were categorized as ILA-indeterminate.
- 57% (786) of the CT exams were determined to be without ILA.
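The counts above can be checked against the cohort size with a few lines of arithmetic. The snippet uses only figures reported in the article; the variable names are ours.

```python
# Exam counts by category, as reported in the article.
counts = {"ILA": 104, "indeterminate": 492, "without ILA": 786}
total = sum(counts.values())

# Rounded percentage share of each category.
shares = {name: round(100 * n / total) for name, n in counts.items()}

# Training, validation, and test set sizes, as reported in the article.
split = {"training": 96, "validation": 24, "test": 1262}
split_total = sum(split.values())
```

The three counts sum to 1,382, matching the split total, and the rounded shares come out to 8%, 36%, and 57%. They add up to 101% only because each is rounded independently.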
Overall, the researchers found that the combination of the two-label method and the random forest classifier in the case inference model achieved the highest AUC, at 0.87.
Performance of machine-learning algorithms for predicting ILAs on chest CT imaging

| Case inference model | Measure | Support vector machine (SVM) | Random forest (RF) | Convolutional neural network (CNN) |
|---|---|---|---|---|
| Two-label | AUC | 0.86 | 0.87 | 0.86 |
| Two-label | Sensitivity | 61% | 61% | 57% |
| Two-label | Specificity | 89% | 90% | 92% |
| Three-label | AUC | 0.86 | 0.85 | 0.85 |
| Three-label | Sensitivity | 71% | 80% | 75% |
| Three-label | Specificity | 86% | 72% | 85% |
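Sensitivity and specificity, unlike AUC, depend on the probability cutoff used to call a case positive. The sketch below shows that trade-off on invented data; the article does not say how the study's operating points were chosen, so the thresholds, labels, and scores here are purely illustrative.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: three ILA cases (1) and four non-ILA cases (0).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

strict = sensitivity_specificity(labels, scores, threshold=0.5)
lenient = sensitivity_specificity(labels, scores, threshold=0.25)
```

Lowering the cutoff from 0.5 to 0.25 raises sensitivity from 2/3 to 1.0 and drops specificity from 0.75 to 0.5 in this toy case. The table shows the same kind of trade-off, with the three-label models reporting higher sensitivity and lower specificity than their two-label counterparts.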
Using machine-learning algorithms like those developed by Hata's group to identify ILAs on CT images could improve patient care, but more research is needed, wrote Marianna Zagurovskaya, MD, of Indiana University School of Medicine in Indianapolis, in an accompanying commentary.
"Hata et al present an interesting multitier automated approach to ILA probability scoring involving testing a variety of machine learning models," she noted. "However, the best self-learning and best-performing tool needs to be further validated in multicenter external cohorts with larger volumes of cases, including indeterminate cases."