ACRIN virtual colonoscopy training tied to performance

Oct 26, 2008

An important difference between the National CT Colonography Trial and earlier multicenter virtual colonoscopy (VC) studies was its intensive focus on reader training. Another distinguishing factor was performance: ACRIN (American College of Radiology Imaging Network) 6664 readers achieved a mean per-patient sensitivity of 90% for detecting colorectal adenomas 1 cm or larger, an improvement over most earlier efforts.

But was it training and testing that made ACRIN readers better? Or did the improvements result from methodological and technical improvements over earlier studies, such as thin-section CT image acquisition?

In the first of several upcoming studies that will analyze the recently published ACRIN 6664 trial data to answer new questions, researchers from the Mayo Clinic, along with ACRIN investigators, examined the role of training and testing in performance. They found that training improved overall reader sensitivity in predictable, quantifiable ways -- and that novice readers could be trained to perform as well as their more experienced colleagues.

The study was led by Dr. Joel Fletcher and colleagues from the Mayo Clinic in Rochester, MN. At the ACRIN fall meeting earlier this month, biostatistician Benjamin Herman from Brown University in Providence, RI, presented the preliminary study results on behalf of ACRIN, which coordinated the statistical analysis.

"Several studies have shown that as readers gain experience, their sensitivity improves; in an ACRIN retrospective [CT colonography (CTC)] study, a similar trend was observed," Herman said. However, inter-reader variability was significant, he said, demonstrating that experience does not guarantee performance. Inter-reader variability was also significant in VC trials by Rockey et al (Lancet, January 15, 2005, Vol. 365:9455, pp. 305-331) and Cotton et al (JAMA, April 14, 2004, Vol. 291:14, pp. 1713-1719).

"In the National CT Colonography Trial [NEJM, September 18, 2008, Vol. 359:12, pp. 1208-1217], we attempted to limit reader variability through training and testing to ensure that all readers met a minimum standard," Herman said.


CTC basks in fame: On September 18, 2008, an electronic billboard in Times Square, New York City, heralds the results of the National CT Colonography Trial (ACRIN 6664). Image courtesy of ACRIN.

For the prospective National CT Colonography Trial, readers were divided into two categories for training and testing: experienced readers (n = 4) who had evaluated more than 500 cases with endoscopic correlation and inexperienced readers (n = 11) who had read fewer than 500 cases. "Although both groups were invited to attend the training session, only the inexperienced readers were required to attend," he said.

Both reader groups were required to pass a qualifying test to participate in the trial. Initial training was aimed at demonstrating the spectrum and appearance of colorectal neoplasia, familiarizing radiologists with the workstation, teaching interactive problem-solving skills such as the correlation of prone and supine images, and teaching the use of different reconstruction techniques.

After training, each reader had to pass a qualifying test consisting of 20 VC cases -- 17 positive and three negative -- as evaluated by two experienced gastrointestinal radiologists working in consensus and with endoscopic correlation of the results, Herman said. The 17 positive cases had a total of 25 polyps (5-9 mm, n = 10; > 10 mm, n = 15), divided into easy (n = 13), moderate (n = 7), and difficult-to-detect (n = 5) lesions.

"In order to participate in the study, each reader was required to miss no more than two of 20 easy-to-moderate lesions, a sensitivity of 90%," he said. "Difficult-to-detect polyps were not used for qualification purposes, but were used for educational purposes and included in the computation of sensitivity for comparison with the prospective study," he explained.
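The pass criterion quoted above is simple arithmetic, and a short sketch may make it concrete. The Python below is illustrative only and is not the trial's scoring software: the function names and the example reader are hypothetical, while the lesion counts (13 easy, 7 moderate, 5 difficult) and the two-miss limit come from the article.

```python
# Illustrative sketch of the qualifying-test arithmetic described in the article.
# Function names and the example reader are hypothetical.

EASY, MODERATE, DIFFICULT = 13, 7, 5      # lesion counts reported in the article
QUALIFYING_LESIONS = EASY + MODERATE      # 20 lesions count toward qualification
MAX_MISSES = 2                            # "no more than two of 20"


def sensitivity(detected: int, total: int) -> float:
    """Fraction of true lesions that the reader detected."""
    if total <= 0 or not 0 <= detected <= total:
        raise ValueError("need 0 <= detected <= total and total > 0")
    return detected / total


def passes_qualifying_test(missed_easy_moderate: int) -> bool:
    """True if the reader missed no more than MAX_MISSES qualifying lesions."""
    if not 0 <= missed_easy_moderate <= QUALIFYING_LESIONS:
        raise ValueError("misses must be between 0 and 20")
    return missed_easy_moderate <= MAX_MISSES


# Hypothetical reader: misses 2 easy-to-moderate and 3 difficult lesions.
missed_qualifying, missed_difficult = 2, 3
qualifying_sens = sensitivity(QUALIFYING_LESIONS - missed_qualifying,
                              QUALIFYING_LESIONS)
all_lesions = QUALIFYING_LESIONS + DIFFICULT
overall_sens = sensitivity(all_lesions - missed_qualifying - missed_difficult,
                           all_lesions)
print(passes_qualifying_test(missed_qualifying), qualifying_sens, overall_sens)
```

In this made-up case the reader qualifies at the 90% threshold (18 of 20), yet the sensitivity computed over all 25 lesions is lower (20 of 25), which is why the two figures are reported separately.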

Readers took the qualifying tests at their home institutions, working at their normal pace and using their normal workstations. The results, which were scored centrally, showed that eight of 15 readers (one experienced, seven inexperienced) needed additional training.

"All lesions missed by readers on the first qualifying test were reviewed to determine what characteristics contributed to the lack of detection," Herman said. Based on these results, the group prepared a new set of 30 cases featuring lesions with the frequently missed characteristics. The readers were required to examine the entire dataset to find these lesions. After this review, readers were unblinded to the true lesion location. Each trainee was then retested on eight cases, including all the cases he or she had missed plus additional cases, and sensitivity was recomputed. There would have been more cases but for a lack of time and funding, Herman said.

"All [n = 15] readers passed this qualifying exam and took part in the prospective study," he said. "Of the four experienced readers, only one failed to meet the minimum requirement initially and needed to be retrained and retested, while seven of 11 inexperienced readers needed retraining and retesting." Of the four inexperienced readers who passed the first exam, two opted to attend the second reader training although they were not required to do so.

On the first qualifying test, reader sensitivity fell off rapidly between the easy and the hard-to-detect lesions. The average reader sensitivity was 70% overall, including 92% (± 9%) for easy-to-detect polyps, 76% (± 12.3%) for moderate polyps, and 46% (± 26%) for difficult-to-detect lesions, with fairly constant results across readers, Herman said. The results improved significantly after training. For all lesions including easy, moderate, and difficult, sensitivity rose from 70% initially to 88% after training, he said.

"On the final qualifying test, there was no statistically significant difference between the sensitivity of the readers who passed on the first attempt [85%] and the retrainees" (88%, p = 0.004), Herman said. "More importantly, in the prospective study, no significant difference was found between the sensitivity of passers [88%] and retrainees" (92%, p = 0.612).

Finally, in an effort to determine how sensitivity changed as readers read more cases, the researchers modeled the probability of identifying disease as a function of total VC reading experience, and as a function of training plus study experience. These results showed that for every 50 cases in additional reading experience, the odds of correctly identifying disease cases increased by a factor of 1.5 (p = 0.025). For every 50 cases read in addition to training, the odds of correctly identifying a positive case also increased by a factor of 1.5.
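A factor of 1.5 applies to the odds, not to the probability itself, so the gain in sensitivity depends on the starting point. The sketch below shows the conversion. It assumes the reported factor behaves like an odds ratio from a logistic-type model, which the article does not state, and the 70% baseline is a hypothetical starting value chosen for illustration. The function name is also hypothetical.

```python
# Illustrative sketch: converting an odds multiplier into a change in probability.
# Assumption: the reported factor of 1.5 per 50 cases acts as an odds ratio.

def apply_odds_ratio(probability: float, odds_ratio: float, steps: int = 1) -> float:
    """Return the probability after multiplying the odds by odds_ratio `steps` times."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    odds = probability / (1.0 - probability)      # probability -> odds
    new_odds = odds * odds_ratio ** steps         # scale the odds
    return new_odds / (1.0 + new_odds)            # odds -> probability


baseline = 0.70        # hypothetical starting sensitivity, for illustration only
ODDS_RATIO = 1.5       # per 50 additional cases, as reported in the article

after_50_cases = apply_odds_ratio(baseline, ODDS_RATIO, steps=1)
after_100_cases = apply_odds_ratio(baseline, ODDS_RATIO, steps=2)
print(round(after_50_cases, 3), round(after_100_cases, 3))
```

Under these assumptions, a reader starting at 70% would move to roughly 78% after 50 more cases and 84% after 100. The same odds multiplier yields smaller absolute gains as the probability approaches 100%.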

"We found there were no shortcuts to CTC training," Herman said. "Training programs should employ 2D or 3D primary reading of at least 45 cases with endoscopic correlation -- by doing so, you reduce the inter-reader variability. We found that qualification testing can be used to predict performance in CTC, and that training and/or experience increases the probability of finding patients with polyps. In summary, the preliminary results demonstrate that training and testing were effective tools that helped ensure high sensitivity in the National CT Colonography study."

By Eric Barnes
AuntMinnie.com staff writer
October 27, 2008
