AI-based model enhances diagnosis of brain MRI exams

Aug 6, 2019

An artificial intelligence (AI)-based model that mimics a radiologist's image interpretation process can provide automated diagnosis of brain MRI exams at a level comparable to an academic neuroradiologist, according to researchers from the Hospital of the University of Pennsylvania in Philadelphia.

The researchers, led by Dr. Jeff Rudie, PhD, developed and tested a proof-of-concept system that utilizes advanced image-processing techniques, deep learning, and Bayesian networks to generate a differential diagnosis for brain MRI studies.

"The system currently performs at a level of academic neuroradiologists for 35 common and rare diseases on clinical brain MRIs and augments radiology resident accuracy," said Rudie, who presented the research group's results during a scientific session at the recent Society for Imaging Informatics in Medicine (SIIM) annual meeting in Denver. Rudie performed the research while at the University of Pennsylvania; he is now a fellow at the University of California, San Francisco.

Modeling radiologist interpretations

With their approach to automated diagnosis, the researchers sought to model the fundamental steps a radiologist performs when interpreting images. They developed an image-processing pipeline consisting of 3D convolutional neural networks (CNNs), multiatlas tissue segmentation, and other image-processing methods for the tasks of identifying, localizing, and characterizing image abnormalities.

The automated system was trained and validated using 390 clinically validated brain MRI scans, a dataset that included 212 cases with a focus on 35 common and rare diagnostic entities involving deep gray matter. The multimodal brain MRI studies included T1-weighted, T2-fluid-attenuated inversion recovery (FLAIR), T1 postcontrast, and gradient-recalled echo (GRE) sequences, as well as diffusion-weighted imaging (DWI) with apparent diffusion coefficient (ADC) values. Of these 390 cases, 102 deep gray-matter cases -- including two to three examples of each disease -- were set aside for testing.

In comparison with ground-truth features extracted by attending radiologists, the automated image-processing pipeline yielded an overall accuracy of 89% for extracting 11 quantitative image features:

Signal features: T1 signal, FLAIR signal, susceptibility, enhancement, and restricted diffusion
Anatomic subregion features: caudate, putamen, pallidum, and thalamus
Spatial features: bilateral and symmetry
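
As a rough illustration of how extracted features can be scored against a radiologist-derived ground truth, the Python sketch below computes per-feature and overall agreement. It is not the researchers' code: the feature names, the present/absent encoding, and the choice to average agreement across the 11 features are all assumptions made for the example, and the two cases are invented.

```python
# Hypothetical names for the 11 features listed above (5 signal, 4 anatomic, 2 spatial).
FEATURES = [
    "t1_signal", "flair_signal", "susceptibility", "enhancement", "restricted_diffusion",
    "caudate", "putamen", "pallidum", "thalamus",
    "bilateral", "symmetry",
]


def feature_accuracy(extracted, ground_truth):
    """Return (overall, per_feature) agreement between two lists of cases.

    Each case is a dict mapping a feature name to True/False (an assumed encoding).
    Overall agreement is the mean of the per-feature agreement rates.
    """
    if len(extracted) != len(ground_truth) or not ground_truth:
        raise ValueError("need equal-length, non-empty lists of cases")
    per_feature = {}
    for name in FEATURES:
        matches = sum(1 for e, g in zip(extracted, ground_truth) if e[name] == g[name])
        per_feature[name] = matches / len(ground_truth)
    overall = sum(per_feature.values()) / len(FEATURES)
    return overall, per_feature


# Two invented cases: the "automated" output disagrees with the truth on one feature.
truth = [dict.fromkeys(FEATURES, True), dict.fromkeys(FEATURES, False)]
auto = [dict(case) for case in truth]
auto[1]["enhancement"] = True  # simulated extraction error

overall, per_feature = feature_accuracy(auto, truth)
print(f"overall agreement: {overall:.3f}")
print(f"enhancement agreement: {per_feature['enhancement']:.2f}")
```

Reporting the per-feature rates alongside the overall figure shows which kind of feature (signal, anatomic, or spatial) accounts for the disagreements.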

Generating differential diagnoses

The second step of their automated system makes use of Bayesian networks, which are probabilistic models of relationships between multiple variables. Their Bayesian network models the signal, spatial, and anatomic subregion features extracted from the image-processing pipeline, along with a patient's clinical information, to provide a differential diagnosis list with estimated probabilities of specific diseases. Expert neuroradiologists in consensus determined the conditional probabilities of the Bayesian network, Rudie said.
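
To make the idea of a probabilistic differential concrete, here is a minimal Python sketch that uses the simplest possible Bayesian network, a naive Bayes model in which each feature depends only on the disease. The article does not describe the structure of the researchers' network, so this is an assumed simplification; the disease names, priors, and conditional probabilities are invented placeholders, and clinical information is left out.

```python
# Hypothetical prior probability of each disease (placeholder values).
PRIORS = {"disease_a": 0.6, "disease_b": 0.3, "disease_c": 0.1}

# Hypothetical P(feature present | disease). In the study, experts set the
# conditional probabilities by consensus; these numbers are made up.
LIKELIHOODS = {
    "disease_a": {"restricted_diffusion": 0.1, "enhancement": 0.2, "bilateral": 0.5},
    "disease_b": {"restricted_diffusion": 0.5, "enhancement": 0.5, "bilateral": 0.5},
    "disease_c": {"restricted_diffusion": 0.9, "enhancement": 0.9, "bilateral": 0.9},
}


def differential(findings, top_n=3):
    """Rank diseases by posterior probability given binary findings.

    Applies Bayes' rule under the naive assumption that features are
    conditionally independent given the disease.
    """
    scores = {}
    for disease, prior in PRIORS.items():
        p = prior
        for feature, present in findings.items():
            p_present = LIKELIHOODS[disease][feature]
            p *= p_present if present else 1.0 - p_present
        scores[disease] = p
    total = sum(scores.values())
    posterior = {disease: score / total for disease, score in scores.items()}
    return sorted(posterior.items(), key=lambda item: item[1], reverse=True)[:top_n]


findings = {"restricted_diffusion": True, "enhancement": True, "bilateral": True}
ranked = differential(findings)
for disease, probability in ranked:
    print(f"{disease}: {probability:.3f}")
```

In this toy example the disease with the lowest prior ranks first, because its expert-assigned feature probabilities match the findings best. That is the mechanism by which encoded expert knowledge can stand in for a large number of training cases.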

The researchers then compared how often their automated system produced a correct differential diagnosis -- i.e., listed the correct diagnosis among its top three -- with the performance of four radiology residents, two general radiologists, two neuroradiology fellows, and two academic neuroradiology attendings.

Performance for producing a correct differential diagnosis
	General radiologists	Radiology residents	Neuroradiology fellows	Academic neuroradiology attendings	Automated system
Accuracy	53%	56%	72%	84%	85%
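
The accuracy measure in this comparison counts a case as correct when the true diagnosis appears among the top three listed. A short Python sketch of that metric follows; the function name and the example cases are hypothetical and are not taken from the study.

```python
def top_k_accuracy(differentials, truths, k=3):
    """Fraction of cases whose true diagnosis is among the first k ranked entries."""
    if len(differentials) != len(truths) or not truths:
        raise ValueError("need equal-length, non-empty inputs")
    hits = sum(1 for ranked, truth in zip(differentials, truths) if truth in ranked[:k])
    return hits / len(truths)


# Invented example: four cases, each with a ranked list of diagnoses.
differentials = [
    ["a", "b", "c", "d"],  # truth "d" is ranked fourth, so a miss at k=3
    ["b", "c", "a"],
    ["c", "a", "b"],
    ["d", "a", "b"],
]
truths = ["d", "a", "c", "d"]

top3 = top_k_accuracy(differentials, truths)       # correct differential diagnosis
top1 = top_k_accuracy(differentials, truths, k=1)  # exact correct diagnosis
print(top3, top1)
```

Setting k=1 gives the stricter "exact correct diagnosis" measure, which can diverge from top-three accuracy for the same reader.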

The differences in accuracy between the automated system and the general radiologists, radiology residents, and neuroradiology fellows were all statistically significant (p < 0.001). Delving further into the results, the researchers observed that the automated system's performance advantage was due largely to its results in rarer diseases.

"In particular, we noticed that radiology residents and general radiologists do the worst with these rare diseases, whereas the automated system still did quite well with the rare diseases," Rudie said.

Augmenting radiologist accuracy?

To determine if their automated approach could yield improvements in radiologist accuracy, the researchers had three other radiology residents use their institution's Adaptive Radiology Interpretation & Education System (ARIES) to review and interact with the results of the automated pipeline for half of the test cases. The residents were able to increase their differential diagnosis accuracy from a mean of 59% before using the automated system to 87% after.

"Looking at those cases, the residents were able to increase their performance almost to the level of the academics," Rudie said.

In preliminary results, the researchers also found that the automated system did not improve the academic neuroradiology attendings' performance in providing a correct differential diagnosis. They did, however, note a trend toward the attendings providing the exact correct diagnosis in a higher percentage of cases, Rudie said.

Rudie noted that with the team's two-step approach, the intermediate quantitative features initially extracted by the CNNs and image-processing techniques can be used by secondary models to support diagnosis, prognosis, and scientific discovery.

"Additionally, these intermediate features allow models to be more explainable, so less of a 'black box'; you can understand how the system might be making mistakes," Rudie said. "It also allows for interpretation of the full gamut of common and rare diseases. Because we're incorporating expert knowledge into the model, it doesn't require a large sample of the rare diseases."