Using an AI triage protocol for the detection of intracranial hemorrhage (ICH) did not improve radiologists' diagnostic performance or report turnaround times, according to a study published September 4 in the American Journal of Roentgenology.
The findings may be surprising in an era of intense enthusiasm for the use of AI in healthcare. A team led by Cody Savage, MD, of the University of Maryland in Baltimore noted that "radiologists alone outperformed AI alone for ICH detection. In addition, use of the AI algorithm did not improve radiologists’ diagnostic performance or report process times."
ICH is a major cause of morbidity and mortality around the world, and accurate identification of the condition translates to improved patient outcomes, the group explained. The standard of care for diagnosis is noncontrast CT of the head interpreted by a radiologist. But radiologists aren't always available to read these exams promptly, and the resulting delays in diagnosis can negatively affect patients.
To mitigate this problem, some health systems are using AI to triage and notify radiologists about positive ICH results on CT imaging. Previous studies that have evaluated AI algorithms for ICH detection on noncontrast CT have suggested the technology shows promise for this indication, but confirmation of its efficacy has remained elusive, Savage and colleagues noted.
The investigators evaluated the impact of an AI triage and notification system (Aidoc, Tel Aviv, Israel) for head noncontrast CT exams on radiologists' performance for ICH detection and on report turnaround times. The study included data from 7,371 patients who underwent 9,954 exams between May and June 2021 (phase I) and between September and December 2021 (phase II). Before commencing phase II, the radiology department began using the commercial AI triage system for ICH detection, which processed noncontrast CT exams and notified radiologists of positive results through a widget with a pop-up display.
Neuroradiologists or emergency radiologists interpreted all of the exams, without AI assistance in phase I (24 radiologists) and with it in phase II (25 radiologists). Six reviewing radiologists assessed all exams with discordance between the report and the AI output to establish a reference standard. The investigators then compared report turnaround times and five diagnostic performance metrics between the two phases.
In phase I, 19.8% of exams showed ICH, while in phase II, 21.9% showed the condition, the team reported. It also found the following:
Radiologist diagnostic performance, without and with the use of AI

| Measure | Without AI | With AI | p-value |
| --- | --- | --- | --- |
| Accuracy | 99.5% | 99.2% | > 0.01 |
| Sensitivity | 98.6% | 98.9% | > 0.01 |
| Specificity | 99.8% | 99.3% | > 0.004 |
| Positive predictive value | 99% | 99.7% | > 0.01 |
| Negative predictive value | 99.7% | 99.7% | > 0.01 |
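For readers unfamiliar with the five metrics in the table, all of them derive from the same confusion-matrix counts. A minimal sketch, using illustrative placeholder counts rather than the study's actual data:

```python
# Diagnostic performance metrics from confusion-matrix counts.
# tp/fp/tn/fn values below are illustrative, NOT data from the study.
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all exams read correctly
        "sensitivity": tp / (tp + fn),   # fraction of true ICH cases detected
        "specificity": tn / (tn + fp),   # fraction of non-ICH cases cleared
        "ppv": tp / (tp + fp),           # chance a positive read is truly ICH
        "npv": tn / (tn + fn),           # chance a negative read is truly clear
    }

metrics = diagnostic_metrics(tp=95, fp=5, tn=890, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```

Note that with a low ICH prevalence, accuracy and NPV stay high even when sensitivity slips, which is why studies like this one report all five metrics rather than accuracy alone.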
Mean report turnaround time for exams that indicated ICH was 147.1 minutes without the use of AI compared with 149.9 minutes with its use (p = 0.11).
The study results suggest that a thorough assessment of the benefits of AI in the radiology department is needed, according to the authors.
"The present findings underscore the importance of prospectively evaluating AI-human interaction in real-world clinical settings, where numerous unpredictable factors can influence outcomes in ways not captured by retrospective research designs," they concluded.