DENVER - An artificial intelligence (AI) algorithm performed as well as an academic neuroradiologist for identifying diseases in the cerebral hemispheres on brain MRI scans in a new study, presented Wednesday at the Society for Imaging Informatics in Medicine (SIIM) annual meeting.
A research team led by Dr. Andreas Rauschecker, PhD, of the University of California, San Francisco (UCSF) trained a deep-learning algorithm that can automatically identify lesions on brain MRI, quantify imaging and lesion characteristics, and then produce a differential diagnosis for 19 different diseases in the cerebral hemispheres.
In testing, the model outperformed radiology residents, general radiologists, and neuroradiology fellows. What's more, it achieved comparable overall statistical performance to academic neuroradiology attending physicians but higher accuracy for diagnosing rare diseases.
Rauschecker, a neuroradiology fellow at UCSF, presented work that he and his co-authors began during his residency at the University of Pennsylvania and continued during his last year at UCSF.
Three fundamental steps
To help radiologists deal with the increase in the volume and complexity of imaging studies, the researchers sought to develop an AI-based system that explicitly and computationally models the three fundamental steps that radiologists perform during image interpretation:
- Identification of image abnormalities
- Characterization of image findings
- Integration of imaging findings to create a differential diagnosis
"This type of sequential approach has distinct advantages over going directly from image to diagnosis," Rauschecker said.
For the first phase -- identification of image abnormalities -- the researchers trained a U-Net convolutional neural network (CNN) using 295 MRI scans acquired with a fluid-attenuated inversion recovery (FLAIR) protocol. Cases had varying pathology, scanners, and acquisition parameters, and the algorithm searched for 19 different diseases affecting the cerebral hemispheres of the brain. These diseases represent a variety of pathologies with overlapping appearances on MRI.
In an evaluation on 92 independent test cases, the CNN identified and segmented the abnormalities nearly as well as a radiologist performing manual segmentation. Predicted lesion volumes correlated highly with true lesion volumes, according to the researchers.
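As a rough illustration of how agreement between predicted and manually measured lesion volumes might be checked, the sketch below derives a volume from a binary mask and computes a Pearson correlation across cases. The function names, the voxel size, and every number are hypothetical, and the source does not say which correlation statistic the researchers used.

```python
from math import sqrt


def lesion_volume_ml(mask, voxel_volume_mm3=1.0):
    """Volume in millilitres of a binary lesion mask given as a flat list of 0/1 voxels."""
    return sum(mask) * voxel_volume_mm3 / 1000.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Assumes neither list is constant; a constant list would make this divide by zero.
    return cov / sqrt(var_x * var_y)


# Made-up per-case lesion volumes (ml), for illustration only.
predicted_volumes = [1.2, 5.0, 10.4, 22.1]
manual_volumes = [1.0, 5.5, 9.8, 24.0]
print(f"r = {pearson_r(predicted_volumes, manual_volumes):.3f}")
```

A value of r close to 1 would indicate that the predicted volumes track the manual measurements closely.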
Next, the team applied a combination of advanced image processing techniques to characterize image findings and produce computer-generated lesion descriptions. In the final step, the group utilized Bayesian inference to integrate these imaging findings to create a differential diagnosis. Expert neuroradiologists, taking into account various imaging features and five clinical features, set the conditional probabilities for each of the 19 diseases, Rauschecker explained. The algorithm then produced a differential diagnosis of three possible conditions, each with a probability percentage.
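The final step can be pictured with a small Bayesian sketch. The version below assumes a naive Bayes model, in which features are treated as conditionally independent given the disease; the source says only that Bayesian inference was used with expert-set conditional probabilities. The disease names, features, and probability values are placeholders, not figures from the study.

```python
def differential(priors, likelihoods, findings, top_k=3):
    """Rank diseases by posterior probability given the observed findings.

    priors:      {disease: P(disease)}
    likelihoods: {disease: {feature: {value: P(value | disease)}}}
    findings:    {feature: observed value}
    """
    scores = {}
    for disease, prior in priors.items():
        score = prior
        for feature, value in findings.items():
            # Naive Bayes assumption: multiply the per-feature likelihoods.
            score *= likelihoods[disease][feature][value]
        scores[disease] = score
    total = sum(scores.values())
    posteriors = {disease: score / total for disease, score in scores.items()}
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Placeholder inputs: four hypothetical diseases and two hypothetical features.
priors = {"disease_A": 0.25, "disease_B": 0.25, "disease_C": 0.25, "disease_D": 0.25}
likelihoods = {
    "disease_A": {"enhancement": {"yes": 0.8, "no": 0.2}, "size": {"large": 0.7, "small": 0.3}},
    "disease_B": {"enhancement": {"yes": 0.5, "no": 0.5}, "size": {"large": 0.5, "small": 0.5}},
    "disease_C": {"enhancement": {"yes": 0.1, "no": 0.9}, "size": {"large": 0.2, "small": 0.8}},
    "disease_D": {"enhancement": {"yes": 0.3, "no": 0.7}, "size": {"large": 0.6, "small": 0.4}},
}
findings = {"enhancement": "yes", "size": "large"}

for disease, probability in differential(priors, likelihoods, findings):
    print(f"{disease}: {probability:.1%}")
```

With these placeholder numbers the ranking is disease_A, disease_B, disease_D, which mirrors the three-item differential with probabilities described above.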
The researchers validated the performance of the algorithm on the 92 test cases, which included approximately five test cases per disease for the 19 different diseases. They also recruited four radiology residents, two general radiologists, two neuroradiology fellows, and two academic neuroradiology attendings to provide the top three differential diagnoses on the 92 cases; these results were then compared with those provided by the AI algorithm.
CNN accuracy in detecting 19 diseases of the cerebral hemispheres
Measure | Radiology residents | Community radiology attendings | Neuroradiology fellows | Academic neuroradiology attendings | AI algorithm |
Accurate differential diagnosis (correct diagnosis within top three) | 56% | 56% | 77% | 86% | 89% |
Area under the receiver operating characteristic curve | 0.731 | 0.718 | 0.849 | 0.904 | 0.920 |
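For readers unfamiliar with the "correct diagnosis within top three" measure, the sketch below shows one way it could be computed. The function name and the toy labels are hypothetical and are not taken from the study's data.

```python
def top_k_accuracy(true_labels, ranked_predictions, k=3):
    """Fraction of cases whose true diagnosis appears among the first k ranked guesses."""
    hits = sum(
        1
        for truth, ranked in zip(true_labels, ranked_predictions)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy example: four cases, each with a ranked differential of three diagnoses.
truth = ["a", "b", "c", "d"]
differentials = [
    ["a", "x", "y"],  # correct diagnosis ranked first
    ["x", "b", "y"],  # correct diagnosis ranked second
    ["x", "y", "z"],  # correct diagnosis missing
    ["x", "y", "d"],  # correct diagnosis ranked third
]
print(top_k_accuracy(truth, differentials))
```

Under this measure a reader or algorithm gets credit for a case whenever the true disease appears anywhere in its three-item differential.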
With the exception of the academic neuroradiology attendings, all differences in the accuracy of differential diagnosis with the AI algorithm were statistically significant (p < 0.01). The AI algorithm performed comparably, overall, to academic neuroradiology attendings but was more accurate (p < 0.05) in both moderately rare cases (89% versus 81%) and very rare cases (78% versus 66%), according to the researchers.
Rauschecker noted that users can access all of the features that the algorithm utilized to make its diagnosis.
"You can use those features in any way that you want, for example, to prepopulate radiology reports, etc.," he said.