An artificial intelligence (AI) algorithm helped improve the detection of cerebral aneurysms on CT angiography (CTA) scans, but at the cost of a high false-positive rate, according to a study conducted in China and published online November 3 in Radiology.
In the retrospective study, a deep-learning model trained to detect cerebral aneurysms on 534 patient cases was associated with improved sensitivity for radiologists' detection of aneurysms in a similarly sized, in-house validation dataset.
Of the 649 aneurysms in the in-house validation dataset, the deep-learning algorithm detected 633 -- a sensitivity of 97.5%, albeit with 13.8 false-positive findings per case. The algorithm also picked up findings that had initially been missed: eight additional aneurysms that had not been detected before. In the external validation set, the performance of general radiologists improved significantly, by 0.01 in the area under the weighted alternative free-response receiver operating characteristic (wAFROC) curve.
"This improvement was found to be dependent on the level of experience," Dr. Xi Long, PhD, of the department of radiology at Tongji Medical College's Union Hospital in Wuhan, China, and colleagues reported.
The study was conducted to assess the value of AI in assisting with what can be a challenging task, especially for radiologists with less experience. Cerebral aneurysms cause 80% to 90% of nontraumatic subarachnoid hemorrhages, and the mortality rate is high, at up to 51%.
Because aneurysms are small and can be missed on a first review, double reading may be used to reduce the false-negative rate to 1%, and a supportive AI algorithm could help both physicians and patients, the authors wrote.
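To make these rates concrete, sensitivity is the share of true aneurysms that are detected, and the false-negative rate is its complement. The short Python sketch below applies those standard definitions to the in-house validation counts reported in the study (633 of 649 aneurysms detected). The function names are illustrative and are not taken from the paper.

```python
def sensitivity(detected: int, total: int) -> float:
    """Fraction of true lesions that were detected (true-positive rate)."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    return detected / total


def false_negative_rate(detected: int, total: int) -> float:
    """Fraction of true lesions that were missed (1 - sensitivity)."""
    return 1.0 - sensitivity(detected, total)


# Counts reported for the algorithm on the in-house validation dataset.
detected, total = 633, 649
missed = total - detected

sens = sensitivity(detected, total)
fnr = false_negative_rate(detected, total)

print(f"Missed aneurysms: {missed}")
print(f"Sensitivity: {sens:.1%}")
print(f"False-negative rate: {fnr:.1%}")
```

These figures describe the algorithm on its own; they say nothing about the false-positive burden, which is reported separately as findings per case.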
The study involved a retrospective review of images for 1,068 patients who had undergone CT angiography of the head or head and neck at two hospitals affiliated with the Huazhong University of Science and Technology in China. Half of the cases were used for training the algorithm and the other half for internal testing/validation.
Researchers also had an additional dataset of 400 CT angiograms for independent, external validation. The external validation set was read by four general radiologists with varying levels of experience (from one to seven years of experience in CT angiography of the head), who were blinded to case histories and reports.
The algorithm was associated with improvements in sensitivity (see table), with the less experienced radiologists benefiting the most from automated support. Specificity per case was lower with the algorithm. Reading time was slightly shorter with the algorithm, though the difference was not statistically significant.
Impact of AI on performance of radiologists for cerebral aneurysms
Parameter | Without AI algorithm | With AI algorithm | Difference | p-value |
Area under the wAFROC curve* | 0.60 | 0.61 | +0.01 | 0.02 |
Sensitivity per lesion (%) | 79.1 | 88.9 | +9.85 | < 0.001 |
Sensitivity per case (%) | 81.6 | 91.9 | +10.23 | < 0.001 |
Specificity per case (%) | 95.9 | 90.9 | -5.01 | 0.003 |
Reading time (seconds) | 53 | 49.4 | -3.62 | 0.25 |
*wAFROC = weighted alternative free-response receiver operating characteristic
Long et al noted that data on the use of AI for detection of cerebral aneurysms have been more limited for CT angiography than for MR angiography. The AI studies that have been conducted with CT angiography have also had limitations, such as analysis of a small number of aneurysms overall or exclusion of small, harder-to-detect aneurysms (smaller than 3 mm).
A closer look at false positives
The authors flagged several study limitations, including the retrospective design and false positives "in areas with bony structures, veins, and vascular bifurcations and curvatures and calcified plaques." They are working to improve the algorithm.
Given its large size, external validation, and inclusion of small aneurysms, the study represents an advance after many years of "unrealized hopes of rapid, reliable, and easy aneurysm detection," wrote Mayo Clinic specialists Dr. David Kallmes and Dr. Bradley Erickson, PhD, in an editorial about the data.
Among other issues, Kallmes and Erickson noted their concern about the high false-positive rate of 13.8 per case that was needed to reach the highest sensitivity reported in the study -- 97.5%.
"A more manageable number of false-positive findings of four to five per case dropped sensitivities to the low 90% range, not a major value added over the unaided reader," they noted.
The editorial authors would also have liked to see more information on how the tool performed in ruptured versus unruptured aneurysms, as well as the inclusion of a reference standard, such as digital subtraction angiography. Furthermore, they called for a greater understanding of how computer-aided detection tools should be integrated into practice.
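The trade-off the editorial describes, in which higher sensitivity comes with more false positives per case, arises whenever a detector's confidence threshold is lowered. The Python sketch below illustrates the general idea with a small, entirely hypothetical set of detections; the scores, counts, and the `froc_point` helper are assumptions for illustration and do not come from the study or its algorithm.

```python
from typing import List, Tuple


def froc_point(
    detections: List[Tuple[float, bool]],
    n_lesions: int,
    n_cases: int,
    threshold: float,
) -> Tuple[float, float]:
    """Return (sensitivity, false positives per case) at a score threshold.

    detections holds one (confidence score, is_true_lesion) pair for every
    candidate the detector outputs across all cases. Each true lesion is
    assumed to be matched by at most one candidate.
    """
    kept = [is_true for score, is_true in detections if score >= threshold]
    true_pos = sum(kept)
    false_pos = len(kept) - true_pos
    return true_pos / n_lesions, false_pos / n_cases


# Hypothetical toy data: 2 cases containing 4 true lesions in total.
toy_detections = [
    (0.95, True), (0.90, True), (0.80, False), (0.60, True),
    (0.40, False), (0.30, False), (0.20, True), (0.10, False),
]

for threshold in (0.5, 0.15):
    sens, fp_per_case = froc_point(toy_detections, 4, 2, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.0%}, "
          f"false positives per case={fp_per_case}")
```

In this toy example, lowering the threshold from 0.5 to 0.15 raises sensitivity from 75% to 100% but triples the false positives per case, which is the same kind of operating-point choice the editorial authors weigh.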