AI model's sensitivity for aneurysms comes at a price

Nov 2, 2020


An artificial intelligence (AI) algorithm helped improve the detection of cerebral aneurysms on CT angiography (CTA) scans, but at the cost of a high false-positive rate in a study conducted in China and reported online November 3 in Radiology.

In the retrospective study, a deep-learning model that had been trained in the detection of cerebral aneurysms using 534 patient cases was associated with an improvement in sensitivity for radiologists' detection of aneurysms in a similarly sized, in-house validation dataset.

Of the 649 aneurysms present in the in-house validation dataset, the deep-learning algorithm detected 633 -- a sensitivity of 97.5%, albeit at a rate of 13.8 false-positive findings per case. The algorithm also detected eight additional aneurysms that had initially been missed. In the external validation set, the performance of general radiologists improved significantly, by 0.01 in the area under the weighted alternative free-response receiver operating characteristic (wAFROC) curve.
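The headline sensitivity follows directly from the counts reported for the in-house validation dataset. As a minimal sketch of that arithmetic (the function and variable names are illustrative and do not come from the study):

```python
# Counts reported in the article for the in-house validation dataset.
aneurysms_present = 649
aneurysms_detected = 633


def sensitivity(true_positives: int, total_positives: int) -> float:
    """Return per-lesion sensitivity as a fraction between 0 and 1."""
    if total_positives <= 0:
        raise ValueError("total_positives must be positive")
    return true_positives / total_positives


sensitivity_pct = 100 * sensitivity(aneurysms_detected, aneurysms_present)
missed = aneurysms_present - aneurysms_detected

print(f"Sensitivity: {sensitivity_pct:.1f}%")
print(f"Aneurysms missed by the algorithm: {missed}")
```

Sensitivity says nothing about false alarms, which is why the figure of 13.8 false-positive findings per case has to be read alongside it.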

"This improvement was found to be dependent on the level of experience," Dr. Xi Long, PhD, of the department of radiology at Tongji Medical College's Union Hospital in Wuhan, China, and colleagues reported.

The study was conducted to assess the value of AI in assisting with what can be a challenging task, especially for radiologists with less experience. Cerebral aneurysms cause from 80% to 90% of nontraumatic subarachnoid hemorrhages, and the mortality rate is high at up to 51%.

Because cerebral aneurysms are small and can be missed on a first review, double reading may be used to reduce the false-negative rate to 1%, and a supportive AI algorithm could help both physicians and patients, they wrote.

The study involved a retrospective review of images for 1,068 patients who had undergone CT angiography of the head or head and neck at two hospitals affiliated with the Huazhong University of Science and Technology in China. Half of the cases were used for training the algorithm and the other half for internal testing/validation.

The researchers also used an additional dataset of 400 CT angiograms for independent, external validation. This set was read by four general radiologists with varying levels of experience (one to seven years in CT angiography of the head), who were blinded to case histories and reports.

The algorithm was associated with improvements in sensitivity (see table), with the less experienced radiologists benefiting the most from automated support. Specificity per case was lower with the algorithm. Reading time was also slightly shorter with the algorithm, though this difference was not statistically significant.

Impact of AI on performance of radiologists for cerebral aneurysms
Parameter	Without AI algorithm	With AI algorithm	Difference	p-value
Area under the wAFROC curve*	0.60	0.61	+0.01	0.02
Sensitivity per lesion (%)	79.1%	88.9%	+9.85	< 0.001
Sensitivity per case (%)	81.6%	91.9%	+10.23	< 0.001
Specificity per case (%)	95.9%	90.9%	-5.01	0.003
Reading time (seconds)	53	49.4	-3.62	0.25

*wAFROC = weighted alternative free-response receiver operating characteristic curve. Source: Long et al.
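The "Difference" column can be sanity-checked against the two columns beside it. The sketch below recomputes each difference from the rounded values in the table and compares it with the reported figure. The 0.1 tolerance is an assumption to allow for rounding, since the reported differences were presumably calculated from unrounded data.

```python
# Rows copied from the table above:
# (parameter, without AI, with AI, reported difference).
rows = [
    ("Area under the wAFROC curve", 0.60, 0.61, 0.01),
    ("Sensitivity per lesion (%)", 79.1, 88.9, 9.85),
    ("Sensitivity per case (%)", 81.6, 91.9, 10.23),
    ("Specificity per case (%)", 95.9, 90.9, -5.01),
    ("Reading time (seconds)", 53.0, 49.4, -3.62),
]

# Assumed tolerance: each displayed value may be off by up to 0.05 after rounding.
TOLERANCE = 0.1

recomputed = {}
consistent = {}
for name, without_ai, with_ai, reported in rows:
    diff = with_ai - without_ai
    recomputed[name] = diff
    # Small epsilon guards against floating-point noise at the boundary.
    consistent[name] = abs(diff - reported) <= TOLERANCE + 1e-9

for name, diff in recomputed.items():
    status = "ok" if consistent[name] else "check"
    print(f"{name}: recomputed {diff:+.2f} ({status})")
```

A negative difference means the value was lower with the algorithm, as is the case for both specificity per case and reading time.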

Long et al. noted that data on the use of AI for detecting cerebral aneurysms have been more limited for CT angiography than for MR angiography. The AI studies that have been conducted with CT angiography have also had limitations, such as analyzing a small number of aneurysms overall or excluding small, harder-to-detect aneurysms (smaller than 3 mm).

An AI algorithm was helpful for improving the detection of cerebral aneurysms on CT angiography images, though there were some limitations. In a 54-year-old woman, an aneurysm was missed, possibly because of its small size (3 mm) and location near the skull base. Image courtesy of Radiology.

A closer look at false positives

The authors also flagged study limitations, such as the retrospective design and false positives "in areas with bony structures, veins, and vascular bifurcations and curvatures and calcified plaques." They are working to improve the algorithm.

Given its large size, external validation, and inclusion of small aneurysms, the study represents an advance after many years of "unrealized hopes of rapid, reliable, and easy aneurysm detection," wrote Mayo Clinic specialists Dr. David Kallmes and Dr. Bradley Erickson, PhD, in an editorial about the data.

Among other issues, Kallmes and Erickson noted their concern about the high false-positive rate of 13.8 findings per case that was required to reach the highest sensitivity reported in the study -- 97.5%.

"A more manageable number of false-positive findings of four to five per case dropped sensitivities to the low 90% range, not a major value added over the unaided reader," they noted.

The editorial authors would also have liked to see more information on how the tool performed in ruptured versus unruptured aneurysms, as well as the inclusion of a reference standard, such as digital subtraction angiography. Furthermore, they called for a greater understanding of how computer-aided detection tools should be integrated into practice.