Radiology departments must weigh at least five factors when validating AI algorithms for clinical use, according to a review published May 21 in the Journal of the American College of Radiology.
Why? Because thorough validation of AI models is crucial for effective patient care, and as the U.S. Food and Drug Administration (FDA) continues to clear more and more algorithms for radiology, "successful and clinically meaningful deployment of AI will be contingent on the rigorous external validation of these models to select the most effective algorithm for a specific target population," wrote a team led by PhD candidate Ojas Ramwala of the University of Washington in Seattle.
AI has shown promise for streamlining radiologists' workflows, from distinguishing normal from abnormal mammograms or chest x-rays to helping predict cardiac disease risk. But the technology has its downsides, including limited generalizability and a vulnerability to biased training sets -- both of which can translate into healthcare disparities.
AI models must be evaluated for accuracy, generalizability, and impact on clinical outcomes before they are integrated into radiology workflows. But external validation can be tricky, as departments grapple with patient data and ethics concerns as well as the cost of evaluating AI algorithms.
Ramwala and colleagues listed five key factors to consider when validating candidate AI algorithms for the radiology department:
- Choose between onsite or cloud-based validation. "Healthcare organizations can validate the performance of AI algorithms either by installing the models on-site or by hosting them on a cloud infrastructure," the group wrote.
- Take patient privacy seriously. "Infrastructures for rigorous external validation of AI algorithms must be equipped to address security concerns associated with protected health information," Ramwala and colleagues noted.
- Collect data well. "Appropriate collection of high-quality imaging data is pivotal to ensuring a faithful pipeline for validating AI models," they wrote. "The data distribution must reflect the real-world target population, and the statistical plan and sampling techniques should account for an adequate sample size to look at AI performance in subpopulations." (One way to stratify performance by subpopulation is sketched in the code after this list.)
- Assess computational requirements. "Since deep-learning models have different parametric complexities, their computational requirements can vary," the group explained. "In the absence of sufficient computing power, AI models cannot execute their scoring processes."
- Create a scoring protocol. "To reliably validate AI models, comprehensive documentation explaining the implementation steps necessary for inferencing each radiology exam must be received, and the infrastructure should be equipped to systematically execute those programs," the authors wrote.
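The review stays at the level of requirements, but the data-collection and scoring points above lend themselves to a brief illustration. The Python sketch below shows one way a department might report a model's discrimination overall and per subpopulation on a locally collected validation set; the column names (`ai_score`, `cancer`, `site`), the 100-exam stratum threshold, and the bootstrap settings are illustrative assumptions, not details from the review.

```python
# A minimal sketch of a subgroup-aware external validation report, assuming
# a pandas DataFrame with one row per exam: the AI model's score, the
# ground-truth label, and a subgroup column. All names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def auc_with_ci(y_true, y_score, n_boot=1000, seed=0):
    """Point-estimate AUC plus a percentile bootstrap 95% CI."""
    rng = np.random.default_rng(seed)
    point = roc_auc_score(y_true, y_score)
    stats = []
    n = len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        # Skip resamples that lack both classes (AUC is undefined there).
        if len(np.unique(y_true[idx])) < 2:
            continue
        stats.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(stats, [2.5, 97.5])
    return point, lo, hi

def validate(df, score_col="ai_score", label_col="cancer", group_col="site"):
    """Report AUC overall and per subpopulation, flagging thin strata."""
    # Note: a subgroup containing only one class will raise an error;
    # a real harness would prespecify how to handle such strata.
    results = {"overall": auc_with_ci(df[label_col].to_numpy(),
                                      df[score_col].to_numpy())}
    for group, sub in df.groupby(group_col):
        if len(sub) < 100:  # illustrative minimum sample size per stratum
            print(f"warning: only {len(sub)} exams in subgroup {group!r}")
        results[group] = auc_with_ci(sub[label_col].to_numpy(),
                                     sub[score_col].to_numpy())
    for name, (auc, lo, hi) in results.items():
        print(f"{name}: AUC {auc:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

A real validation protocol would add calibration and clinical-outcome measures, a prespecified statistical plan, and the implementation documentation the authors describe, but the basic structure (score every exam, then stratify the results) is the same.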
The task of validating AI models can be complex, but it's necessary, according to the authors.
"Establishing dedicated mechanisms for safeguarding patient privacy … [and] managing imaging data collection, resource allocation, and algorithm inferencing are all major tasks required for developing infrastructures to comprehensively evaluate AI model performance," they concluded. "[We] hope that this overview will help institutions adopt high-performing AI algorithms into their radiology workflows and promote enhanced population-based outcomes."