Of the three guidelines available for assistance in managing thyroid nodules, the Thyroid Imaging Reporting and Data System (TI-RADS) guidelines may offer the best diagnostic yield for thyroid cancer, according to research from the University of California, Los Angeles (UCLA).
In a retrospective study of nearly 900 patients with more than 1,000 thyroid nodules, the UCLA Medical Center research team found that TI-RADS guidelines based on sonographic suspicion of cancer risk outperformed similar guidelines from the American Thyroid Association (ATA) and the Society of Radiologists in Ultrasound (SRU).
A strategy that would have biopsied nodules categorized by sonographic suspicion in TI-RADS categories 3 (indeterminate) to 5 (almost certainly cancer) offered the highest specificity, positive predictive value, negative predictive value, accuracy, and area under the receiver operating characteristic (ROC) curve. The ATA guidelines did yield slightly higher sensitivity, however.
The study "highlights the need for creation of consensus guidelines that can be agreed upon between endocrinologists, endocrine surgeons, and radiologists in order to inform clinical management," said Dr. Hannah Chung from Ronald Reagan UCLA Medical Center.
She shared the findings during a presentation at the recent American Roentgen Ray Society (ARRS) annual meeting in Los Angeles.
An infrequent cancer
While thyroid nodules are common, they aren't often malignant; studies in the literature report that the overall incidence of cancer in patients selected for fine-needle aspiration (FNA) biopsy ranges from 9.2% to 13%, Chung said.
Three guidelines have been proposed for managing thyroid nodules. The SRU's 2005 guidelines recommend thresholds for FNA biopsy. In 2006, the ATA released its own guidelines, which were updated in 2015.
The TI-RADS model was initially proposed in 2009 by Dr. Eleonora Horvath of the Clínica Alemana de Santiago in Santiago, Chile, and colleagues. In 2011, a team led by Dr. Jin Young Kwak from Yonsei University College of Medicine in Seoul, South Korea, proposed a modified TI-RADS based on sonographic suspicion of nodule cancer risk.
Most published research reflects the experience of institutions outside of the U.S., and there are no studies that have compared all three guidelines, Chung said. As a result, the researchers used FNA biopsy results of suspicious thyroid nodules to retrospectively compare the diagnostic performance of the SRU guidelines, the 2015 ATA guidelines, and the modified TI-RADS guidelines for detecting thyroid cancer.
The team retrospectively studied 888 consecutive patients with 1,059 thyroid nodules who had received ultrasound-guided FNA biopsy between July 2014 and August 2015. The patients had an average age of 57 (range: 47-66 years), and 699 (78.7%) were female. The median nodule size was 20 mm (interquartile range [IQR]: 13-30 mm). Twelve (1.1%) of the biopsies were considered to be inadequate or nondiagnostic.
Of the 1,059 nodules, 933 were benign. The remaining 126 (11.9%) included 67 (6.3%) papillary carcinomas, 34 (3.2%) follicular carcinomas, and 25 (2.4%) medullary carcinomas.
The researchers first assessed how the SRU and ATA guidelines would have performed in correlation with FNA biopsy results. The ATA guidelines classify nodules into one of five sonographic patterns -- high suspicion, intermediate suspicion, low suspicion, very low suspicion, and benign -- based on the nodule's features on ultrasound. Nodule size was also used as an adjunctive criterion for recommending FNA biopsy.
The SRU guidelines also recommend FNA biopsy based on a combination of sonographic features, changes in appearance over successive exams, and size.
"It should be noted that the size thresholds in the SRU guidelines were higher than those for the ATA guidelines for similar sonographic features," Chung said. "The authors acknowledge that there may be some missed malignancies, but that these may be of lower clinical relevance due to their size and surveillance was an option in the interest of cost-effectiveness."
Modified TI-RADS score
Next, four UCLA radiologists experienced in interpreting thyroid ultrasound used modified TI-RADS characteristics proposed in 2011 by Kwak et al in Radiology to provide a Likert scale score of 1 to 5 based on sonographic suspicion: 1 (definitely benign), 2 (likely benign), 3 (indeterminate), 4 (likely cancer), and 5 (almost certainly cancer).
These modified TI-RADS guidelines stratify nodule cancer risk based on the number of ultrasound features that had been shown to have a statistically significant association with malignancy: solid component, hypoechogenicity or marked hypoechogenicity, microlobulated or irregular margins, microcalcifications, and a taller-than-wide shape.
The number of these suspicious features would be reflected in the TI-RADS category assignment:
- Category 3: No suspicious ultrasound features
- Category 4a: One suspicious feature
- Category 4b: Two suspicious ultrasound features
- Category 4c: Three or four suspicious ultrasound features
- Category 5: All five suspicious ultrasound features
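The mapping from feature count to category listed above can be sketched in a few lines of Python. This is only an illustration of the list, not code from the study; the function name `tirads_category` and the feature labels are hypothetical.

```python
# Suspicious ultrasound features named in the article (labels are illustrative).
SUSPICIOUS_FEATURES = frozenset({
    "solid_component",
    "hypoechogenicity",      # includes marked hypoechogenicity
    "irregular_margins",     # microlobulated or irregular margins
    "microcalcifications",
    "taller_than_wide",
})


def tirads_category(features):
    """Return the modified TI-RADS category for a collection of observed features."""
    observed = set(features)
    unknown = observed - SUSPICIOUS_FEATURES
    if unknown:
        raise ValueError(f"unrecognized feature(s): {sorted(unknown)}")
    count = len(observed)
    if count == 0:
        return "3"
    if count == 1:
        return "4a"
    if count == 2:
        return "4b"
    if count <= 4:
        return "4c"
    return "5"


# A nodule with two suspicious features falls in category 4b.
print(tirads_category({"solid_component", "microcalcifications"}))
```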
Based on the FNA biopsy results, the team calculated the sensitivity, specificity, positive predictive value, negative predictive value, and ROC curve for each of the three guidelines.
Of the biopsied nodules, 61% would have met the SRU criteria, 58% would have met the ATA criteria, and 39% would have been assigned a TI-RADS-based Likert score of 3 to 5 based on sonographic suspicion. The modified TI-RADS guidelines would have produced the highest diagnostic yield (i.e., the percentage of malignancies among the total number of biopsies):
- Diagnostic yield for biopsied nodules assigned a Likert score of 3 to 5: 21.8%
- Diagnostic yield for biopsied nodules meeting any of the ATA criteria: 15.6%
- Diagnostic yield for biopsied nodules meeting any of the SRU criteria: 10.7%
Highest accuracy
The ROC analysis showed that TI-RADS-based categories of 3 to 5 performed the best, followed by the ATA guidelines and the SRU guidelines. The area under the curve was 0.74 for TI-RADS, followed by 0.66 for the ATA guidelines and 0.41 for the SRU guidelines.
The differences were statistically significant (p < 0.001). ROC analysis evaluates the performance of binary classifiers; an area under the curve of 1.0 represents an optimal classifier, while an area under the curve of 0.5 indicates a random classifier, Chung said.
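To make the 1.0 and 0.5 reference points concrete, the area under the ROC curve can be read as the probability that a randomly chosen malignant nodule receives a higher suspicion score than a randomly chosen benign one, with ties counted as half. The sketch below computes it that way. The scores are hypothetical and for illustration only, and the study's own method for computing its reported areas is not described in the article.

```python
def auc(malignant_scores, benign_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random malignant case outscores a random
    benign case, counting ties as half a win.
    """
    if not malignant_scores or not benign_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical suspicion scores on a 1-5 scale.
print(auc([5, 4, 5], [1, 2, 3]))  # perfect separation: 1.0
print(auc([3, 3, 3], [3, 3, 3]))  # scores carry no information: 0.5
```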
TI-RADS-based categories of 3 to 5 produced the highest specificity, positive predictive value, negative predictive value, and accuracy, she said.
Diagnostic performance by guideline type
| Measure | SRU guidelines | ATA guidelines | TI-RADS score 3-5 |
| --- | --- | --- | --- |
| Sensitivity | 69/126 (54.8%) | 96/126 (76.2%) | 90/126 (71.4%) |
| Specificity | 356/933 (38.1%) | 413/933 (44.3%) | 610/933 (65.4%) |
| Positive predictive value | 69/646 (10.7%) | 96/616 (15.6%) | 90/413 (21.8%) |
| Negative predictive value | 356/413 (86.2%) | 413/443 (93.2%) | 610/646 (94.4%) |
| Accuracy | 425/1,059 (40.1%) | 509/1,059 (48.1%) | 700/1,059 (66.1%) |
Chung noted, though, that the ATA guidelines produced the highest sensitivity and also had a high negative predictive value.
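The measures in the table all follow from four counts per guideline: true positives, false positives, true negatives, and false negatives. As a rough sketch, the Python below rebuilds those counts from the fractions reported in the table (126 malignant and 933 benign nodules) and recomputes each measure. The function and variable names are illustrative, not taken from the study. Note that the positive predictive value is the same quantity as the diagnostic yield quoted earlier.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2 diagnostic measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Totals reported in the article.
MALIGNANT, BENIGN = 126, 933

# (true positives, true negatives) read from the table's numerators.
reported = {
    "SRU": (69, 356),
    "ATA": (96, 413),
    "TI-RADS 3-5": (90, 610),
}

for name, (tp, tn) in reported.items():
    metrics = diagnostic_metrics(tp, BENIGN - tn, tn, MALIGNANT - tp)
    print(name, {key: f"{value:.1%}" for key, value in metrics.items()})
```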
Size may not matter
In other notable results, the researchers found an inverse relationship between nodule size and malignancy. Both the SRU and the ATA guidelines include nodule size among their criteria for FNA biopsy, Chung said.
"We actually found that there was a higher yield of malignancy in nodules that were smaller," she said.
Diagnostic yield by nodule size
| Result | < 10 mm (96 nodules) | 10-14 mm (213 nodules) | ≥ 15 mm (729 nodules) |
| --- | --- | --- | --- |
| Benign | 74 (77.1%) | 183 (85.9%) | 660 (90.5%) |
| Malignant | 22 (22.9%) | 30 (14.1%) | 69 (9.5%) |
These differences were statistically significant (p < 0.001).
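The article does not say which statistical test produced this p-value. As one plausible check, the sketch below runs a Pearson chi-square test of independence on the size table, assuming that is a reasonable stand-in for the study's analysis. Benign counts are taken as each group's nodule total minus its malignant count, and the function name is hypothetical.

```python
import math

# (malignant nodules, all nodules) per size group, from the size table.
groups = {"<10 mm": (22, 96), "10-14 mm": (30, 213), ">=15 mm": (69, 729)}


def chi_square_2xk(groups):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    total_n = sum(n for _, n in groups.values())
    total_m = sum(m for m, _ in groups.values())
    chi2 = 0.0
    for malignant, n in groups.values():
        benign = n - malignant
        expected_m = n * total_m / total_n   # expected malignant under independence
        expected_b = n - expected_m          # expected benign under independence
        chi2 += (malignant - expected_m) ** 2 / expected_m
        chi2 += (benign - expected_b) ** 2 / expected_b
    return chi2, len(groups) - 1


chi2, df = chi_square_2xk(groups)
assert df == 2  # the closed form below holds only for 2 degrees of freedom
p_value = math.exp(-chi2 / 2)  # chi-square upper-tail probability when df == 2

for label, (malignant, n) in groups.items():
    print(f"{label}: {malignant / n:.1%} malignant")
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.5f}")
```

Under these assumptions the result falls below the 0.001 threshold, in line with the reported significance.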
"This just goes to suggest that size may not be as important of a discriminator as previously thought in predicting histological malignancy," she said.
The researchers acknowledged limitations of their study, including its retrospective nature and referral bias; the study population included all patients referred for FNA biopsy. In addition, the study relied upon diagnosis based on FNA biopsy rather than core needle biopsy or surgical pathology.
Also, TI-RADS nodule scores were based on sonographic suspicion and radiologist experience. There was a lack of standardization across readers, but this reflects actual clinical practice, Chung said.
Nonetheless, the study results support the use of the TI-RADS sonographic features proposed by Kwak et al in 2011, and also highlight the need for concordance between the various societal guidelines for managing thyroid nodules, she said.