As breast cancer screening imaging volumes rise, radiologists and imaging center administrators may be weighing how to justify AI and incorporate it into clinical workflows and information systems in a thoughtful way that does not put diagnostic performance at risk.
Computer-aided triage (CADt) shows promise for helping busy radiology practices identify and prioritize screening mammograms more likely to contain breast cancer. In the future, this type of software could even be used to screen out and report normal cases, avoiding the need for radiologist review.
But hurdles still remain.
AI landscape in mammography
Mohammad Elhakim, MD, PhD, at the Research and Innovation Unit of Radiology at Odense University Hospital in Denmark, explained that knowing the different intended uses of FDA-cleared AI devices is key for understanding the landscape of AI in mammography. There are five types:
- CADt for triage, where the AI flags suspicious cases for prioritized review
- CADe for detection, where the AI marks suspicious regions to aid in detection
- CADx for diagnosis, where the AI scores or categorizes to aid in diagnosis
- CADe/x for detection and diagnosis, where the AI marks and scores or categorizes to aid in detection and diagnosis
- CADa for autonomous use, where the AI automatically interprets the image without clinician input
These five CAD types vary according to their outputs and place within the clinical workflow, according to Harvard Medical School and University of Maryland researchers, who created a curated database of FDA-cleared AI devices for medical image interpretation.
Breast cancer ranks among the top two conditions with the most CAD products, and CADt, CADe, CADx, and CADe/x are all represented in the approved category. A new category -- CADd -- is also emerging to assist in the assessment of breast density.
"The use case of AI ruling out normals in place of the radiologist would require bypassing the human readers completely and would require a CADa approval, which none of the current solutions for mammography has received as far as I know," Elhakim told AuntMinnie.com via email.Â
"Worklist prioritization is considered as a CADt application, and that's the least problematic use case in screening mammography which is already implemented many places across the world," Elhakim said.
Setting thresholds
Connie Lehman, MD, PhD, a professor of radiology at Harvard Medical School and a breast imaging specialist at Mass General Brigham in Boston, said several groups across the U.S. are using CAD triage products to sort screening mammograms.
"We're just at the brink of realizing the potential benefits of these CAD triage tools in screening mammography," Lehman said in an interview with AuntMinnie.com. However, there is reason for debate.
"Just the idea that radiologists would read batches of screening mammograms differently with computer assistance than without has raised a lot of debate," Lehman explained. "If you set your threshold of the triage at a level that over time the radiologist says, 'I barely look at the low-risk cases,' is that safe? Do we know the right threshold to set that is safe for this scenario? We're going to have to study that rapidly and carefully. That will be critical for post-market implementation studies," she said.
"But do we see a future where we have mammograms read independently without human eyes on them? I think we have to say that the answer is yes," Lehman added. "Yes, of course we will have a point where there are screening mammograms that can be cleared by a computer without the need for a human to evaluate every mammogram."
The timing of such a shift is not clear, Lehman said, but making the transition will involve careful research and careful monitoring of performance and outcomes. Lehman pointed to domains where the transition has occurred -- such as primary screening of Pap smears and of images of the retina by AI software -- and the quality assurance programs to support those transitions. These AI-supported paradigm shifts made it feasible to reduce the human effort needed to process these medical images and thus supported increased global access to screening for cervical cancer and for diabetic retinopathy.
When it comes to CADt, there are a lot of ways to use the technology, Lehman said.
"The purpose of the triage software is to sort more complex or suspicious cases from the least suspicious cases,â Lehman said. âHow to use this to help the radiologist and patients can vary.â
For example, some radiology practices put the exams triaged with more suspicious areas at the top of their list to read first, so that they can review exams that might benefit from additional imaging before the patient leaves the center, Lehman explained. Others might read the more challenging cases when they are most fresh in the morning and then review the cases "cleared" as not having suspicious areas later in the day. Some centers may have cases flagged as most suspicious read by a second radiologist if the first radiologist doesn't see a reason to recall the patient, Lehman added.
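The first of these worklist strategies can be sketched in a few lines of Python. This is only an illustration, not any vendor's implementation: the `Exam` record, the 0-to-1 `ai_score` scale, and the `flag_threshold` value are hypothetical and are not taken from a specific CADt product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exam:
    accession: str   # hypothetical exam identifier
    ai_score: float  # hypothetical CADt suspicion score in [0, 1]


def prioritize(exams, flag_threshold=0.5):
    """Split a worklist into flagged and cleared exams.

    Flagged exams (score at or above the assumed threshold) are returned
    most suspicious first; cleared exams keep their original order.
    """
    flagged = sorted(
        (e for e in exams if e.ai_score >= flag_threshold),
        key=lambda e: e.ai_score,
        reverse=True,
    )
    cleared = [e for e in exams if e.ai_score < flag_threshold]
    return flagged, cleared


if __name__ == "__main__":
    worklist = [Exam("A", 0.2), Exam("B", 0.9), Exam("C", 0.6)]
    flagged, cleared = prioritize(worklist)
    print("read first:", [e.accession for e in flagged])
    print("read later:", [e.accession for e in cleared])
```

Where the threshold sits is exactly the safety question Lehman raises, so in practice it would need to be set and monitored through the kind of post-market studies she describes.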
Elhakim's study
Recently, a retrospective Danish study evaluated three different AI-integrated breast cancer screening scenarios and compared them with the standard approach in Europe of radiologist double reading with arbitration, or combined reading.
Led by Elhakim, this research emphasized that replacing one or both radiologists in double-read mammography screening with AI seemed feasible. However, there were important differences in accuracy and workload depending on where in the workflow the AI was deployed.
Elhakim's study included a large-scale cohort (249,402 screenings in 149,495 female individuals at a mean age of 59.3 years) from population-wide mammography screening over four years from August 4, 2014, to August 15, 2018. Three simulated AI-integrated screening programs were tested.
In scenario 1, labeled integrated AIfirst, the AI system replaced the first reader with a validated threshold matched at mean first reader specificity from the study sample (AIspec), corresponding to an AI score of 80.99%. Outcomes above the threshold were considered as recalls.
For scenario 2, labeled integrated AIsecond, the AI system replaced the second reader with the same threshold and approach to arbitrations as described for scenario 1.
In scenario 3, labeled integrated AItriage, AI became a stand-alone tool to triage screenings as low risk, moderate risk, and high risk. The AI system replaced both readers in the assessment of low-risk and high-risk screenings that were sent to no recall and recall, respectively, while the original combined reading was applied for moderate-risk cases.
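The decision rules behind the three scenarios can be sketched as follows. This is a simplified illustration rather than the study's code. The 80.99% recall threshold comes from the description above, but the `low_cut` and `high_cut` triage cutoffs are hypothetical (the article does not report them), and sending reader disagreements to an `arbitrate` callback is an assumption about how arbitration is triggered.

```python
AI_SPEC_THRESHOLD = 80.99  # AI score (%) reported for the AIspec threshold


def ai_recalls(ai_score):
    """AI outcome: scores above the threshold count as recalls."""
    return ai_score > AI_SPEC_THRESHOLD


def paired_read(ai_score, human_recall, arbitrate):
    """Scenarios 1 and 2: the AI stands in for one of the two readers.

    Assumption: if the AI and the remaining human reader agree, that
    decision is final; otherwise the case goes to arbitration.
    """
    ai_decision = ai_recalls(ai_score)
    if ai_decision == human_recall:
        return ai_decision
    return arbitrate()


def triage_read(ai_score, combined_reading, low_cut, high_cut):
    """Scenario 3: the AI decides low- and high-risk screenings alone.

    low_cut and high_cut are hypothetical cutoffs. Moderate-risk
    screenings fall back to the original combined reading, which is
    passed as a callable so no human read happens unless it is needed.
    """
    if ai_score < low_cut:
        return False  # low risk: no recall
    if ai_score > high_cut:
        return True   # high risk: recall
    return combined_reading()


if __name__ == "__main__":
    # Moderate-risk case: the human combined reading decides.
    print(triage_read(50.0, lambda: True, low_cut=10.0, high_cut=95.0))
```

Passing the human reads in as callables mirrors the workload argument: in the triage rule, a human reading is only requested for the moderate-risk band.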
Among the key findings, AI reduced the volume of screening reads by 49.7% in scenario 3 and by 48.8% in scenario 1, where it fully replaced the first reader, without reducing cancer detection accuracy. Scenario 2 reduced the volume of screening reads by 48.7% and the number of recalls by 2.2%, but at the cost of reduced sensitivity (-1.5%, p < 0.001).
"AI-integrated mammography screening was feasible across all scenarios in terms of workload reduction," wrote Elhakim and colleagues for a September 4 paper in Radiology.
Notably, the authors said that variable reader performance both before and after AI implementation should be considered when adapting workflows and before selecting an AI-integrated screening setup.
"A triage-based approach was found to have the best outcome across all performance metrics," the team said, although prospective trials are warranted to validate these scenarios.
Hurdles
"The main hurdle regarding using AI as a CADa to fully replace the human readers (for instance both readers as in Scenario 3 in our article, even in just some cases) would be the lack of regulatory approval first and foremost, and second the lacking prospective evidence," Elhakim told AuntMinnie.com.
"If regulatory approval is obtained for AI as CADa, then the evidence would follow (especially prospective), and we would see a change in guidelines, or at least in clinical practice, such as we have seen in Denmark and Sweden, where they have implemented AI as CADe/x without a change of guidelines (European, not local or national)," Elhakim explained.
In double-reading settings where AI is fully integrated into clinical screening practice, such as in Denmark, Sweden, and other countries, AI is used as no more than an autonomous reader that fully or partially replaces one of the two readers, Elhakim added.
Lehman said another major hurdle is reimbursement.
"A major barrier to [integrating] AI into clinical practice is the absence of specific CPT codes for AI-driven services," Lehman said. "For the older CAD products, those payments are now bundled into the interpretation of the mammogram. With the newer products, centers are struggling to adopt new technology that carries a cost and expense when there isn't added reimbursement for those tools. The same thing with CAD triage, without reimbursement, it's challenging to figure out how the technology becomes available to the full spectrum of patients who may benefit."