Can customized reader pairing boost mammo double reading performance?

Oct 25, 2023

Customized pairing strategies do not significantly improve mammography double reading performance, according to Dutch findings published October 24 in Radiology.

A team led by Jessie Gommers from Radboud University Medical Center in the Netherlands found that pairing readers according to specific strategies produced no significant difference in screening performance compared with random pairing.

“The specific pairing strategies included some higher-performing pairs as well as lower-performing pairs that together balanced the overall screening performance of the pairing strategies to abnormal interpretation rates and cancer detection rates that were very similar,” the Gommers team wrote.

In Europe, breast cancer screening involves double reading, where two interpreting radiologists review mammography images. Previous research suggests that double reading increases the cancer detection rate when compared with single reading.

However, radiologist pairing in this setting is done randomly, and screening performance varies among radiologists. Random pairing is done out of convenience or to balance workloads.

Gommers and colleagues sought to discover whether radiologist performance characteristics can be used to determine the best pairs of radiologists to double read screening mammograms. They used retrospective data with reading outcomes from breast cancer screening programs in Sweden (2008 to 2015), England (2012 to 2014), and Norway (2004 to 2018).

In total, the team included data from 3,592,414 mammography exams. This included 965,263 exams from Sweden, 837,048 exams from England, and 1,790,103 exams from Norway.

The researchers found that the overall abnormal interpretation rates and cancer detection rates for all specific pairing strategies were not significantly different from random pairing.

Comparison of specific and random radiologist pairing
Measure (country)	Random pairing	Specific pairing
Abnormal interpretation rate (Sweden)	45.9-56.9 per 1,000 exams	54.1 per 1,000 exams
Abnormal interpretation rate (England)	68.2-70.5 per 1,000 exams	69.3 per 1,000 exams
Abnormal interpretation rate (Norway)	81.6-88.1 per 1,000 exams	84.1 per 1,000 exams
Cancer detection rate (Sweden)	3.1-3.6 per 1,000 exams	3.3 per 1,000 exams
Cancer detection rate (England)	8.9-9.4 per 1,000 exams	9.1 per 1,000 exams
Cancer detection rate (Norway)	6.1-6.8 per 1,000 exams	6.3 per 1,000 exams

*Not all data achieved statistical significance.
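
As a rough illustration of how the table can be read, the short Python sketch below checks whether each specific-pairing rate falls inside the range reported for random pairing. The values are transcribed from the table; the names `TABLE`, `within_random_range`, and `summarize` are hypothetical and not taken from the study. The article does not state how the random-pairing ranges were derived, so this containment check is only a reading aid, not the statistical test the authors used.

```python
# Rates per 1,000 exams, transcribed from the table above.
# Each entry: (random-pairing low, random-pairing high, specific-pairing value).
# Assumption: the random-pairing column is treated as a plain numeric range.
TABLE = {
    ("Abnormal interpretation rate", "Sweden"): (45.9, 56.9, 54.1),
    ("Abnormal interpretation rate", "England"): (68.2, 70.5, 69.3),
    ("Abnormal interpretation rate", "Norway"): (81.6, 88.1, 84.1),
    ("Cancer detection rate", "Sweden"): (3.1, 3.6, 3.3),
    ("Cancer detection rate", "England"): (8.9, 9.4, 9.1),
    ("Cancer detection rate", "Norway"): (6.1, 6.8, 6.3),
}


def within_random_range(low, high, specific):
    """Return True if the specific-pairing rate lies inside the random-pairing range."""
    return low <= specific <= high


def summarize(table):
    """Map each (measure, country) pair to whether its specific rate is in range."""
    return {key: within_random_range(*values) for key, values in table.items()}


if __name__ == "__main__":
    for (measure, country), inside in summarize(TABLE).items():
        status = "inside" if inside else "outside"
        print(f"{measure} ({country}): specific pairing is {status} the random range")
```

For every row in the table, the specific-pairing value sits inside the corresponding random-pairing range, which matches the article's description of the two approaches as very similar.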

The investigators suggested that their results could be due to several factors, including the underlying criterion not being the individual performance of the readers; inconsistencies across screening programs introducing differences in predicted outcomes; the classification of readers; or the data sets having too few reads per examination.

The authors called for future studies to include data sets with screening exams read by more than two readers and to test pairing strategies with a variable number of reader types. They wrote that pairing strategies tested in this way could improve overall screening performance.
