Using an artificial intelligence (AI) algorithm concurrently during interpretation can help radiologists to enhance their diagnostic performance on screening mammography studies, according to research published online November 4 in Radiology: Artificial Intelligence.
In a retrospective study involving 14 radiologists, a team of researchers led by Serena Pacilè, PhD, of AI software developer Therapixel found that concurrent use of the company's software significantly improved average radiologist sensitivity without negatively affecting specificity.
"The results of this study suggest also that incorporating AI-based machines into the process of evaluation of mammograms can improve the performance of radiologists," the authors wrote. "An improved diagnostic performance of radiologists in the mammographic detection of breast cancer is achievable without having an impact on their overall reading time."
The researchers sought to evaluate the benefits of version 1 of Therapixel's MammoScreen AI software in enhancing breast cancer detection. The software analyzes the four views of a 2D mammogram and then outputs a set of image positions with a related suspicion score for malignancy. MammoScreen was cleared by the U.S. Food and Drug Administration earlier this year.
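As a rough illustration of the kind of output described above (image positions paired with suspicion scores across the four views), here is a minimal Python sketch. The `Finding` class, the view labels, the pixel coordinates, and the 0-to-1 score scale are all hypothetical; the article does not describe the software's actual data format.

```python
from dataclasses import dataclass

# Standard four views of a 2D screening mammogram (left/right, CC/MLO).
VIEWS = ("L-CC", "L-MLO", "R-CC", "R-MLO")

@dataclass(frozen=True)
class Finding:
    """Hypothetical record: one marked position with a suspicion score."""
    view: str         # one of VIEWS
    x: int            # pixel column of the marked position (assumed)
    y: int            # pixel row of the marked position (assumed)
    suspicion: float  # assumed 0-1 scale; the real scale is not stated

def most_suspicious(findings):
    """Return the finding with the highest score, or None if there are none."""
    return max(findings, key=lambda f: f.suspicion, default=None)

# Illustrative values only, not data from the study.
example = [Finding("L-CC", 812, 1430, 0.12), Finding("R-MLO", 655, 980, 0.71)]
top = most_suspicious(example)
```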
In their study, 14 radiologists assessed 240 2D digital mammography exams that had been acquired between 2013 and 2016 and included different types of abnormalities. These screening exams included cases that were classified as true positive, false negative, true negative, or false positive based on the radiologist's initial interpretation and subsequent histopathologic evaluation or follow-up. The radiologists were informed that the dataset was enriched with cancer cases.
Half of the exams were read in a first interpretation session without the AI software and the other half with AI; the assignments were then reversed in a second reading session four weeks later. The 14 radiologists in the study had a range of experience from 0 to 25 years, with a median of 8.5 years. In addition, 93% devoted at least half of their practice to breast imaging and read more than 3,000 mammograms per year, according to the researchers.
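The two-session crossover design described above can be sketched in a few lines of Python. This is only an illustration: the function name, the random split, and the seed are assumptions, since the article does not say how exams were assigned to each half.

```python
import random

def crossover_schedule(exam_ids, seed=0):
    """Split exams into two halves and swap the AI condition between sessions.

    The random shuffle is an assumption; the article only states that half
    of the exams were read with AI and half without, then vice versa.
    """
    ids = list(exam_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    group_a, group_b = ids[:half], ids[half:]
    return {
        "session_1": {"without_ai": group_a, "with_ai": group_b},
        # Second session, four weeks later: conditions are reversed.
        "session_2": {"without_ai": group_b, "with_ai": group_a},
    }

schedule = crossover_schedule(range(240))
```

With this layout, every exam is read once with AI and once without by each reader, so each reader serves as his or her own control.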
Impact of AI on radiologists reading screening mammograms
Measure | Without AI | With concurrent use of AI |
Sensitivity | 65.8% | 69.1% |
Specificity | 72.5% | 73.5% |
Area under the curve (AUC) | 0.769 | 0.797 |
The improvements in sensitivity and AUC were statistically significant (p = 0.021 and p = 0.035, respectively). The higher specificity with AI, however, did not reach statistical significance (p = 0.634).
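For readers who want the arithmetic behind these figures, the short Python sketch below restates the standard definitions of sensitivity and specificity and computes the absolute changes between the two reading conditions from the reported averages. The function and variable names are illustrative, and no per-reader counts from the study are assumed.

```python
def sensitivity(tp, fn):
    """Fraction of cancer cases correctly flagged: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of non-cancer cases correctly cleared: TN / (TN + FP)."""
    return tn / (tn + fp)

# Reader-averaged values reported in the article: (without AI, with AI).
reported = {
    "sensitivity_pct": (65.8, 69.1),
    "specificity_pct": (72.5, 73.5),
    "auc": (0.769, 0.797),
}

# Absolute change with AI, rounded to the precision of the reported values.
delta = {
    name: round(with_ai - without_ai, 3)
    for name, (without_ai, with_ai) in reported.items()
}
```

Here `delta` works out to a gain of 3.3 percentage points in sensitivity, 1.0 percentage point in specificity, and 0.028 in AUC.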
In other findings, the researchers reported that reading time varied with the AI software's likelihood-of-malignancy score. For cases with less than a 2.5% chance of malignancy, reading time with AI was approximately the same as without AI in the first reading session and slightly decreased in the second reading session. That reduction in reading time could increase radiologists' overall efficiency, allowing them to focus their attention on the more suspicious examinations, according to the researchers.
They also noted, however, that cases with AI scores reflecting a higher likelihood of malignancy took longer to read on average when AI was used. In future work, the researchers plan to assess the software's ability to detect breast cancer earlier in a large screening-based population.