CHICAGO -- A deep-learning model trained and calibrated on international mammography screening data has strong predictive performance, according to research presented November 27 at the RSNA 2023 annual meeting.
In her talk, Christiane Kuhl, MD, PhD, of RWTH Aachen University in Germany, presented research indicating that her team’s deep-learning model showed high performance in breast cancer risk assessment on a dataset of nearly 130,000 women.
“[The model] can be calibrated to provide five-year risk estimates to support more personalized screening and risk reduction interventions as recommended by clinical practice guidelines,” Kuhl said.
Women deemed to be at increased risk of breast cancer undergo more aggressive screening. Kuhl said that the EU and U.S. focus on three primary approaches in determining a woman’s risk: genetic predisposition, lifetime risk scores dependent on family history, and traditional clinical risk models that rely on patient-reported data and breast density. She also pointed out that these approaches are fundamentally flawed, with traditional risk models only achieving “modest” area under the curve (AUC) values.
Kuhl said that AI can improve breast cancer risk assessment, noting that previous studies of AI’s utility in this area have demonstrated the technology’s promise.
Kuhl and colleagues sought to train, validate, test, and calibrate their own five-year breast cancer prediction model based on screening mammography images. Their study included imaging data from seven centers around the world as part of the Clairity Research Consortium: four U.S. institutions, two European centers, and one South American center.
They included data from 318,101 consecutive bilateral 2D full-field digital screening mammograms obtained from 129,498 cancer-free women between 2007 and 2016. Test datasets held out of model development included 46,104 exams from Europe and the U.S. and 5,888 exams from South America.
The team used a deep convolutional neural network with federated learning to create the deep-learning model. It trained the model to predict the development of breast cancer within five years of the mammogram, based only on the four standard mammography views. It also used a calibration algorithm to create percent probabilities of future cancer from the deep-learning score.
Finally, the researchers combined and segmented held-out test sets into five equally sized bins based on increasing predicted risk and generated observed-to-expected ratios. These bins ranged from “very low risk” to “very high risk.”
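The binning step can be illustrated with a short Python sketch. It is not the consortium's code: the function name `observed_to_expected_by_bin` and its inputs are hypothetical, and it assumes each exam has a calibrated five-year risk (a probability between 0 and 1) and a 0/1 flag for whether cancer was later observed. A ratio near 1 means the number of cancers the model expected in a bin matches the number that occurred.

```python
from typing import List


def observed_to_expected_by_bin(
    predicted: List[float], outcomes: List[int], n_bins: int = 5
) -> List[float]:
    """Return observed-to-expected ratios per risk bin, lowest risk first.

    predicted: calibrated risk per exam (assumed to be a probability).
    outcomes:  1 if cancer was observed within the follow-up window, else 0.
    """
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must have the same length")
    if len(predicted) < n_bins:
        raise ValueError("need at least one exam per bin")

    # Order exams by increasing predicted risk.
    pairs = sorted(zip(predicted, outcomes), key=lambda pair: pair[0])
    n = len(pairs)

    ratios = []
    for b in range(n_bins):
        # Integer arithmetic keeps the bins as close to equal size as possible.
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        expected = sum(risk for risk, _ in chunk)   # cancers the model predicts
        observed = sum(event for _, event in chunk)  # cancers that occurred
        ratios.append(observed / expected if expected > 0 else float("nan"))
    return ratios
```

With `n_bins=5`, the five returned values correspond to the “very low risk” through “very high risk” bins described above.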
They found that AUC values were high for all centers: 0.75 for the U.S. centers, 0.8 for the European centers, and 0.8 for the South American center.
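For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen woman who developed cancer received a higher model score than a randomly chosen woman who did not. The Python sketch below computes it by direct pairwise comparison. It is an illustration only, with a hypothetical function name, and is not the method the researchers reported using.

```python
from typing import List


def pairwise_auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    scores: model risk scores, higher meaning higher predicted risk.
    labels: 1 for exams followed by cancer, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # cancer case ranked above non-cancer case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of exams, so a rank-based implementation would be the practical choice at the scale of this study.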
The researchers also found that the observed-to-expected ratios overall and for the spectrum of predicted low- to high-risk bins demonstrated strong calibration of the model with future cancer occurrence.
Observed-to-expected ratios of held-out test datasets

Risk bin | Observed-to-expected ratio |
---|---|
Overall | 1 |
Very low risk | 1.01 |
Low risk | 0.8 |
Moderate | 1.04 |
High risk | 0.99 |
Very high risk | 1.03 |
Kuhl highlighted that the risk model “provides strong predictive accuracy” and is well-calibrated for use in clinical workflows that rely on five-year breast cancer risk predictions.
She added that this makes way for more personalized care and risk reduction strategies for women undergoing breast cancer screening.
However, the team did not compare the model’s performance with that of other risk models. In addition, the mammography data came from a single vendor (Hologic).
Still, Kuhl highlighted that the study includes large datasets from a multinational cohort, strengthening the research.