RSNA studies delve deeper into DMIST results

Dec 13, 2006

Several studies presented at the 2006 RSNA meeting in Chicago took a closer look at the Digital Mammographic Imaging Screening Trial (DMIST) results, including the accuracy of full-field digital mammography (FFDM) in dense breasts, the accuracy of soft-copy reading based on FFDM, soft-copy versus hard-copy reading, and the use of the "probably benign" BI-RADS category.

Research on the effectiveness of FFDM in dense breast tissue was presented by Alicia Toledano of Toronto-based firm Biostatistics Consulting, which conducted a retrospective study of how breast density affects the diagnostic accuracy of FFDM and film-screen mammography (FSM).

"We've known for many years that the accuracy of FSM decreases with increasing breast density. We also know that the risk of cancer is higher for women with dense breasts than it is for women with fatty breasts," Toledano said in her RSNA talk.

From the larger DMIST pool, readers interpreted FFDM and FSM studies from 320 asymptomatic women. There were 16 readers in all, with half analyzing FSM results and the other half assessing FFDM results. The readers had an average of 15.7 years of experience, with 12 of the 16 having previous experience with digital mammography. Breast density was distributed equally across the four pertinent BI-RADS categories.

Of the 320 women, 109 had breast cancer. Of the latter, 12 had extremely dense breasts, 19 had almost entirely fatty breasts, and 39 each had scattered fibroglandular densities and heterogeneously dense breasts. Breast density was classified on FSM, Toledano added.

According to the results, the area under the receiver operating characteristic (ROC) curve (AUC) decreased as breast density increased. In women with almost entirely fatty breasts, the average AUC was 0.89 for FFDM versus 0.90 for FSM. In women with scattered fibroglandular densities, the AUC was 0.79 for FFDM and 0.78 for FSM. For those with heterogeneously dense breasts, the AUC was 0.84 for both FFDM and FSM. Finally, for women with extremely dense breasts, the AUC was 0.58 for FFDM and 0.63 for FSM.

Overall, for FFDM, the area under the ROC curve fell by 0.34 between women with almost entirely fatty breasts and those with extremely dense breasts. For FSM, the decrease was 0.27. As breast density increased, sensitivity decreased, while specificity remained consistent.

An RSNA attendee noted that the results of this study seem to contradict the overall DMIST data, which found FFDM was more accurate than FSM for detecting cancers in radiographically dense breasts. Toledano said that DMIST investigators hope that a study currently under way by lead DMIST investigator Dr. Etta Pisano will address this issue.

"There are small numbers of women with cancer in these extreme categories," Toledano said. "We are trying to find out if it has to do with late case selection or if it has to do with not having prior films. We're trying to look at everything that we can with this retrospective reader study and figure out what's going on."

Dr. Daniel Kopans from Massachusetts General Hospital in Boston emphasized that observer variability could be a big influence. "Reader variability in breast imaging is huge, and we really need to sit down with a digital mammogram and a film-screen mammogram side by side," he said. "My suspicion is that we are going to see other cancers in both but were missed by the reader on one or the other."

Soft-copy reading

DMIST involved nearly 50,000 asymptomatic women who presented for screening mammography at 33 sites in the U.S. and Canada. Five different FFDM systems were used:

SenoScan (Fischer Imaging, Denver)
FCRm (Fujifilm Medical Systems USA, Stamford, CT)
Senographe 2000D (GE Healthcare, Chalfont St. Giles, U.K.)
Selenia Full Field Digital Mammography System (Hologic, Bedford, MA)
Lorad/Trex Digital Mammography System (Hologic), subsequently replaced by Lorad/Hologic Selenia Full Field Digital Mammography System (Hologic)

In a second RSNA talk, R. Edward Hendrick, Ph.D., from Northwestern University in Chicago, described a multireader study comparing the accuracy of soft-copy digital interpretation to that of FSM by digital manufacturer.

"This is a different reader study than (Pisano) was talking about earlier that is ongoing," Hendrick said. "This reader study was designed as part of the original DMIST proposal and was carried out shortly after the DMIST accrual."

This study was also retrospective and used a smaller, cancer-enriched set of cases. Also, there were more readers (six to 12 versus 10 for the ongoing study), and the same cases were interpreted with both FFDM and FSM. All FFDM images were soft-copy, and the readers did not have access to prior films or patient history.

The readers attended two randomly ordered reading sessions (one for FFDM, one for FSM) held six weeks apart. Each manufacturer's recommended soft-copy display system was used. The readers identified suspicious findings and rated the degree of suspicion of breast cancer in those lesions on a seven-point scale.

According to the results, there was no statistically significant difference in the ROC curve area, sensitivity, or specificity between soft-copy FFDM reads and FSM reads. Overall, no FFDM system was superior to film. While the results of the larger study found that FFDM was better in younger women and women with dense breasts, no single system stood out in those two subgroups, Hendrick said, although he added that possible reasons could be too few cases and too few cancers in these women.

On a similar topic, Eloida Cole and colleagues compared soft-copy and hard-copy reading of digital mammograms to compare performance in detecting breast cancer. Cole is from the Biomedical Research Imaging Center at the University of North Carolina in Chapel Hill.

"Is there an advantage to viewing images on a monitor versus printing digital to film?" Cole asked.

For this study, 30 radiologists read FFDM exams that were either displayed on high-resolution monitors or printed on film. From the DMIST pool, 333 cases were selected, of which 117 were from women who were diagnosed with breast cancer within 15 months of screening. Again, case sets were read six weeks apart, and radiologists used a seven-point scale (1 meant definitely not malignant; 7 meant definitely malignant).

According to these results, there were no significant differences in the areas under the ROC curve (AUC) for primary comparison. The AUC was 0.75 for soft-copy reading and 0.76 for hard-copy reading. There also were no differences in AUC based on type of machine, lesion type, or breast density.
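For readers unfamiliar with how an AUC can be derived from ordinal reader scores such as the seven-point scale described above, the following is a minimal sketch using the empirical (Mann-Whitney) estimate. This is an illustrative technique only; the presentation did not state which ROC method the investigators used, and the function name and the ratings below are hypothetical, not DMIST data.

```python
from itertools import product


def auc_from_ratings(cancer_ratings, normal_ratings):
    """Empirical AUC: the fraction of (cancer, non-cancer) case pairs in which
    the cancer case received the higher suspicion rating; ties count as half."""
    pairs = list(product(cancer_ratings, normal_ratings))
    score = sum(1.0 if c > n else 0.5 if c == n else 0.0 for c, n in pairs)
    return score / len(pairs)


# Hypothetical ratings on a 1-7 scale (1 = definitely not malignant,
# 7 = definitely malignant). These values are invented for illustration.
cancer = [7, 6, 4, 3, 5]
normal = [1, 2, 3, 4, 2, 1]

auc = auc_from_ratings(cancer, normal)
print(f"Empirical AUC: {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cancer and non-cancer cases, which is why values of 0.75 and 0.76 are read as essentially equivalent performance.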

"Soft-copy reading does not provide an advantage in the interpretation of digital mammograms," the group concluded.

Anecdotally, Cole said that it took the participants longer to read soft-copy images than hard-copy films, which may be partially attributed to the manipulation and adjustment of soft-copy images. Hendrick referred the audience to an article in the American Journal of Roentgenology that compared FFDM image acquisition and interpretation times to FSM (AJR, July 2006, Vol. 187:1, pp. 38-41).

Pisano also pointed out that the specificity in this study was about 83%, versus 92% in the larger study. She added that this shift could very well be caused by participants behaving differently in this reader study than they did in the general clinical trial.

Probably benign

In the final presentation, Dr. Janet Baum, from Beth Israel Deaconess Medical Center in Boston, discussed how frequently the BI-RADS 3 category was used by DMIST readers. Her group also looked at the rate of short interval follow-up (SIFU), the types of lesions referred for SIFU, and the incidence of malignancy on FFDM and FSM.

The complete data from the DMIST trial were analyzed, with 178 radiologists interpreting FFDM exams using the BI-RADS lexicon. Of the 1,138 subjects referred for SIFU, 104 were sent directly from the original screening exam. No cancers were found in this subset of patients. The remaining 1,034 were referred after additional imaging. Overall, 70% of the 1,138 referred subjects (802) returned for SIFU.

Of the 1,138 SIFU subjects, 903 had a single lesion, 347 had calcified clusters, 358 had masses, and 197 were classified as having all other lesions. In this latter group, a dozen cancers were found (six ductal carcinoma in situ and six T1 invasive tumors).

Overall, 2.3% of the DMIST participants were given a BI-RADS 3 rating, and SIFU demonstrated a malignancy rate of 1.05% with cancers found at lower stages.
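The percentages reported for the BI-RADS 3 analysis can be cross-checked against the counts given in the talk. The short sketch below does that arithmetic. It assumes that the 802 returning subjects are counted against all 1,138 subjects referred for SIFU, and that the dozen cancers are the numerator of the reported malignancy rate; neither assumption was stated explicitly in the presentation, and the variable names are purely illustrative.

```python
# Counts as reported in the presentation.
sifu_subjects = 1138      # subjects referred for short interval follow-up
returned_for_sifu = 802   # subjects who came back for SIFU
cancers_found = 12        # six DCIS plus six T1 invasive tumors

# Assumption: both rates use all 1,138 SIFU referrals as the denominator.
compliance_pct = 100 * returned_for_sifu / sifu_subjects
malignancy_pct = 100 * cancers_found / sifu_subjects

# If 2.3% of participants received a BI-RADS 3 rating, the implied trial size is:
implied_participants = sifu_subjects / 0.023

print(f"SIFU compliance: {compliance_pct:.1f}%")
print(f"Malignancy rate: {malignancy_pct:.2f}%")
print(f"Implied DMIST enrollment: about {implied_participants:,.0f}")
```

Under these assumptions the figures line up with the reported 70% compliance, the 1.05% malignancy rate, and an enrollment of nearly 50,000 women.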

The authors drew two major conclusions. First, the DMIST readers did use BI-RADS 3 correctly. Second, at 70%, patient compliance for SIFU was low. For those who were slated for follow-up at one year, the compliance rate was 89.9%, Baum said.

"One thing to remember about that follow-up rate is that these are patients who were interested enough to be in a clinical trial … and they still didn't come back," she said. "One of the implications is how to increase patient compliance, particularly for screening, and what role do radiologists play in increasing compliance in the future?"

Future DMIST studies will focus on the part that observer variability plays in FFDM's success. To that end, a study is under way that specifically tracks reader habits and skills in FFDM versus FSM, according to Pisano.

By Shalmali Pal
AuntMinnie.com staff writer
December 14, 2006
