A new approach to a two-stage, deep-learning AI model that detects adrenal glands and nodules on CT exams could minimize false-positive results, reduce misdiagnoses, and, when combined with human interpretation, decrease radiologists' workload by 10%, according to research published March 4 in Radiology.
Chang Ho Ahn, MD, from the Seoul National University College of Medicine, and Tae Woo Kim, from the Korea Advanced Institute of Science and Technology, both in South Korea, developed the model. Their findings are important because adrenal nodules are often incidentally detected at abdominal CT examinations that are not specifically focused on evaluating the adrenal glands.
"With low and high thresholds, about 90% of patients could be classified with high confidence as having adrenal nodules using the deep-learning model," Ahn and Kim's team wrote.
A total of 995 patients met the criteria for the study. These patients had undergone contrast-enhanced abdominal CT for the evaluation of adrenal disease or other clinical indications, and their complete clinical and imaging data were available. Patients were excluded for a history of major abdominal surgery, known malignancy, poor image quality, or lack of contrast enhancement.
Based on clinical guidelines, an adrenal nodule was defined as one with a longest diameter greater than 1 cm. Stage I of the model was designed to detect and locate the adrenal gland and/or adrenal nodules. Stage II then provided information such as the location, contour, and probability of an adrenal nodule, according to the authors, who differentiated their model's approach from a segmentation-first classification model.
Example CT scans in a 77-year-old man with a left adrenal nodule. First panel shows raw CT scan. Second panel shows the annotated ground truth, the definitive reference for adrenal nodules. Light blue is the area of normal adrenal gland, and darker blue is the area of an adrenal nodule. Third panel shows the bounding box, a rectangular border that encloses adrenal nodules, generated by the first stage of the deep-learning model. Fourth panel shows the output of the second stage of the deep-learning model. The predicted area of an adrenal nodule is marked in red. Images and caption courtesy of the RSNA.
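The detect-then-segment flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the bounding box is assumed to come from some stage I detector, the per-pixel threshold merely stands in for a stage II segmentation network, and all function names and values are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]
Box = Tuple[int, int, int, int]  # (row0, col0, row1, col1), end-exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut out the region that a (hypothetical) stage I detector proposed."""
    r0, c0, r1, c1 = box
    return [row[c0:c1] for row in image[r0:r1]]


def segment_crop(patch: Image, threshold: float) -> List[List[int]]:
    """Stand-in for stage II: label a pixel 1 if its score meets the threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in patch]


def paste_mask(shape: Tuple[int, int], box: Box,
               patch_mask: List[List[int]]) -> List[List[int]]:
    """Place the patch mask back into a full-size mask of zeros."""
    rows, cols = shape
    full = [[0] * cols for _ in range(rows)]
    r0, c0, _, _ = box
    for i, row in enumerate(patch_mask):
        for j, value in enumerate(row):
            full[r0 + i][c0 + j] = value
    return full


def two_stage(image: Image, box: Box, threshold: float = 0.5) -> List[List[int]]:
    """Segment only inside the detected box, then map back to image coordinates."""
    patch = crop(image, box)
    patch_mask = segment_crop(patch, threshold)
    return paste_mask((len(image), len(image[0])), box, patch_mask)


# Toy 4x4 "image" of per-pixel scores; the bright pixel at (3, 3) lies outside the box.
toy_image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.1, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.9],
]
toy_box = (1, 1, 3, 3)
toy_mask = two_stage(toy_image, toy_box)
```

In this sketch, pixels outside the bounding box are never labeled, which is the structural difference from a pipeline that segments the whole image first.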
Several datasets were employed, including a large external dataset reflecting real-world clinical practice, according to the authors.
Results included the following:
- Correct detection of the adrenal gland in 99.3% of patients for the right adrenal gland and in 96.1% of patients for the left.
- High performance in detecting adrenal nodules, with an area under the receiver operating characteristic curve (AUC) of 0.99 for the right adrenal gland and 0.93 for the left.
- Moderate performance in characterizing nodules as adenoma or nonadenoma, with an AUC of 0.82.
- High accuracy when applied to the external dataset, with AUCs of 0.98 and 0.97 for the right and left adrenal nodules, respectively.
- Triaging performance (the percentage of patients that deep learning could classify with confidence) ranged from 77% to 98%.
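The idea of triaging with a low and a high threshold can be sketched as follows. This is a minimal illustration, not the study's code: the threshold values, the example probabilities, and the function name are assumptions chosen for demonstration, and the article does not report the actual cutoffs.

```python
from typing import List, Tuple


def triage(probabilities: List[float], low: float,
           high: float) -> Tuple[List[str], float]:
    """Split cases into confident calls and cases left for radiologist review.

    A probability at or below `low` is a confident "negative", one at or
    above `high` is a confident "positive", and anything in between is
    routed to "review". Returns the labels and the fraction triaged
    (the share of cases that received a confident label).
    """
    if not 0.0 <= low < high <= 1.0:
        raise ValueError("expected 0 <= low < high <= 1")
    labels = []
    for p in probabilities:
        if p <= low:
            labels.append("negative")
        elif p >= high:
            labels.append("positive")
        else:
            labels.append("review")
    confident = sum(1 for label in labels if label != "review")
    fraction = confident / len(labels) if labels else 0.0
    return labels, fraction


# Hypothetical model outputs for five patients and hypothetical cutoffs.
example_probs = [0.02, 0.10, 0.50, 0.95, 0.99]
example_labels, example_fraction = triage(example_probs, low=0.1, high=0.9)
```

Moving the two thresholds farther apart sends more cases to review and lowers the triaged fraction, while moving them closer together does the opposite at the cost of less confident calls.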
"The sequential use of the detection and segmentation model is a unique characteristic of our deep learning model," Ahn and Kim's team wrote. Also, the separate training and validation of the deep-learning model for left and right adrenal nodules is a strength of the study, they added.
Limitations, they said, included the origin of the radiology reports used and the heterogeneity of the CT scans or protocols. Higher accuracy of the deep-learning algorithm compared with the radiology report does not imply superiority over human interpretation, noted Ahn and Kim's team, adding that noncontrast CT is often preferred for assessing CT attenuation of adrenal nodules.
The group's approach is unlike conventional models, according to Ashkan Malayeri, MD, and Baris Turkbey, MD, from the departments of radiology and imaging sciences and AI resource at the National Institutes of Health, in an accompanying editorial.
"The authors developed an innovative design to merge the radiologists’ high specificity with the AI algorithm’s high sensitivity to combine the strengths of both AI and radiologists," they wrote.
Ahn and Kim's team also addressed an obstacle in AI research -- lack of reproducibility across different scanners, acquisition parameters, and datasets -- by testing their model on a sizeable external test set of scans from all major CT manufacturers, Malayeri and Turkbey noted.
A future vision of AI in this context may entail not only detecting and segmenting nodules, but also analyzing them across multiple phases, extracting the radiomic data, and providing an objective predictability score on the malignant versus benign nature of the nodules, they wrote.