Quality improvement (QI) research presented at RSNA 2025 reinforced the need for independent, postdeployment evaluation of radiology AI updates introduced under the U.S. Food and Drug Administration's (FDA) Predetermined Change Control Plan (PCCP).
Research fellow Syed Muhammad Awais Bukhari, MD, from University Hospitals Cleveland Medical Center, explained that shared risk factors make it practical to combine low-dose CT (LDCT) for lung cancer screening and CAC scoring. When a CAC scoring algorithm was updated, radiologists noticed a performance shift.
Liz Carey
As part of a QI project, cardiothoracic radiologists at University Hospitals (UH) Cleveland Medical Center are using opportunistic coronary artery calcium (CAC) scoring to complement lung cancer screening.
In doing so, radiologists there were able to evaluate a CAC scoring and stent detection algorithm -- after deployment -- and in this case, begin the process of correcting course with the vendor.
UH research fellow Syed Muhammad Awais Bukhari, MD, explained that shared risk factors make it practical to combine low-dose CT (LDCT) for lung cancer screening and CAC scoring. However, when v1.3 of the CAC scoring algorithm was replaced by v1.4, radiologists found that performance shifted, leaving them preferring the earlier version.
"Version 1.4 systematically underestimated CAC scores compared with the previously cleared version 1.3 (p < 0.001)," Bukhari noted in his session. "Radiologists subjectively preferred version 1.3 in the majority of cases," he said, adding that they are not relying on the updated software because of performance issues that included a change in AI-determined CAC risk categories.
UH deployed the AI tool institutionally for opportunistic automated CAC scoring in LDCT scans obtained for lung cancer screening. Following the FDA's PCCP framework, the vendor updated the AI tool systemwide, promoting it for "enhanced diagnostic performance," "added functionality," and "enhanced stent detection."
Internal improvements with the updated version 1.4 included adding training data and additional model parameters, improving CAC segmentation, and enhancing stent detection.
The AI tool runs automatically in the background to incidentally detect coronary artery calcium.
Researchers studied the tool's performance with 110 patients (57% male, 53% female; mean age, 65). All LDCT scans were performed from November 1 to November 30, 2025, on multidetector CT scanners. Acquisition parameters included a tube voltage of 120 kV and a tube current-time product of 40-60 mAs, with noncontrast, nongated technique, according to Bukhari.
Each LDCT exam was analyzed using both versions of the FDA-cleared CAC scoring algorithm. The group found that CAC v1.4 demonstrated improved coronary artery stent detection, correctly identifying eight of the 10 stents known to be in the cohort. In contrast, version 1.3 identified none.
"This claim of the AI vendor was good," Bukhari said.
However, in patients without stents, the updated software produced a significantly lower mean CAC score than the earlier v1.3 (288.5 compared with 452.7), the group noted. For example, 18 cases shifted from moderate to minimal (1-99), and one shifted from severe to minimal, prompting concerns about cardiovascular risk misclassification.
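A paired comparison of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's actual analysis: the function names are hypothetical, the example scores are made up, and of the category cut points only "minimal (1-99)" comes from the presentation. The moderate (100-399) and severe (400 and above) bounds are assumed here.

```python
from collections import Counter
from statistics import mean


def cac_category(score: float) -> str:
    """Map a CAC score to a risk category.

    Only "minimal (1-99)" is stated in the article; the other
    cut points below are assumptions made for illustration.
    """
    if score < 1:
        return "none"
    if score < 100:
        return "minimal"
    if score < 400:  # assumed upper bound for "moderate"
        return "moderate"
    return "severe"


def compare_versions(old_scores, new_scores):
    """Summarize paired scores from two software versions on the same exams."""
    if len(old_scores) != len(new_scores):
        raise ValueError("scores must be paired exam by exam")
    shifts = Counter()
    for old, new in zip(old_scores, new_scores):
        before, after = cac_category(old), cac_category(new)
        if before != after:
            shifts[(before, after)] += 1
    return {
        "mean_old": mean(old_scores),
        "mean_new": mean(new_scores),
        "mean_paired_diff": mean(n - o for o, n in zip(old_scores, new_scores)),
        "shifts": dict(shifts),
    }


# Made-up scores for four exams, not data from the study.
old = [0.0, 150.0, 420.0, 80.0]
new = [0.0, 60.0, 95.0, 75.0]
result = compare_versions(old, new)
print(result["mean_old"], result["mean_new"], result["shifts"])
```

Counting category shifts alongside the mean difference matters because a modest change in the average score can still move individual patients across a risk threshold.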
Deploying AI tools using a platform approach makes it possible to assess performance, according to Bukhari. Reverting to the earlier software version has not been automatic, and the group is actively collaborating with the vendor on how to move forward with the tool, he noted.
Ultimately, Bukhari emphasized, "we need the radiologist in the loop ... we need to keep AI tools in check."