Ultrasound-based deep learning models aid thyroid nodule prediction

A deep learning-based framework incorporating ultrasound images and clinical data can help in thyroid nodule prediction, according to research published November 25 in the Journal of Radiation Research and Applied Sciences.

Investigators led by Jing Li, MD, from Zhongshan Hospital in Shanghai, China, reported that their deep-learning models achieved high marks in predicting thyroid nodule malignancy. They further highlighted that the model could suggest appropriate next steps for care, such as whether to proceed with a biopsy or continue monitoring nodules.

“By reducing false positives and enhancing specificity, our model provides a scalable and reliable decision support tool for personalized thyroid nodule management,” the Li team wrote.

Radiology researchers continue to explore how machine-learning techniques can address current limitations in risk stratification and disease prediction. The Thyroid Imaging Reporting and Data System (TI-RADS), which offers structured criteria for assessing thyroid nodules, is susceptible to interobserver variability, the researchers noted, leading to inconsistent clinical decision-making.

Recent studies suggest that machine learning models trained on both radiological and clinical data may be more accurate than standardized criteria such as TI-RADS.

With this in mind, Li and colleagues developed their deep learning models, which combined clinical and radiological features with deep imaging features extracted from ultrasound scans using EfficientNet-B0, a convolutional neural network (CNN) pretrained on more than one million images from the ImageNet database.
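
As a rough, hypothetical illustration of this kind of feature extraction (not the authors' code), the sketch below uses torchvision's ImageNet-pretrained EfficientNet-B0 with its classification head removed to turn a single ultrasound frame into a fixed-length feature vector; the file name and preprocessing steps are assumptions.

```python
# Sketch: extracting deep features from an ultrasound image with a
# pretrained EfficientNet-B0 (torchvision). The file path and
# preprocessing are illustrative assumptions, not the study's pipeline.
import torch
from torchvision import models, transforms
from PIL import Image

# Load EfficientNet-B0 pretrained on ImageNet and drop its classifier,
# so the forward pass returns the 1280-dimensional pooled features.
weights = models.EfficientNet_B0_Weights.IMAGENET1K_V1
model = models.efficientnet_b0(weights=weights)
model.classifier = torch.nn.Identity()
model.eval()

# Standard ImageNet preprocessing; a grayscale ultrasound frame is
# replicated to three channels to match the network's expected input.
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("nodule_ultrasound.png")   # hypothetical file name
batch = preprocess(image).unsqueeze(0)        # shape: (1, 3, 224, 224)

with torch.no_grad():
    deep_features = model(batch)              # shape: (1, 1280)
print(deep_features.shape)
```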

The team developed predictive models using XGBoost, random forest, and support vector machine classifiers, resulting in clinical-only, imaging-only, and hybrid models. Additionally, the team employed a stacking-based meta-model that integrates the predictions of all three classifiers to boost performance.
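
A minimal sketch of how such a stacking ensemble could be assembled with scikit-learn and XGBoost appears below; the synthetic data, hyperparameters, and logistic-regression meta-learner are illustrative assumptions rather than the study's actual configuration.

```python
# Sketch: stacking XGBoost, random forest, and SVM base classifiers.
# The synthetic feature matrix stands in for combined clinical and
# deep imaging features; all settings here are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Synthetic stand-in: 580 "patients", 40 features, binary malignancy label.
X, y = make_classification(n_samples=580, n_features=40, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

base_models = [
    ("xgb", XGBClassifier(n_estimators=300, eval_metric="logloss")),
    ("rf", RandomForestClassifier(n_estimators=300, random_state=42)),
    ("svm", SVC(kernel="rbf", probability=True)),
]

# A logistic-regression meta-model (an assumed choice) combines the base
# models' out-of-fold probability estimates into a final prediction.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)
malignancy_prob = stack.predict_proba(X_test)[:, 1]
print(malignancy_prob[:5])
```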

The study included 580 patients with thyroid nodules categorized from TI-RADS 2 to 5. The researchers reported that the clinical models achieved strong predictive ability.

Performance of deep-learning models on thyroid nodule prediction
Measure     Support vector machine     Random forest     XGBoost
Accuracy    75%                        78%               81%
AUC-ROC*    0.81                       0.83              0.85
F1 score    0.78                       0.80              0.82
*AUC-ROC = area under the receiver operating characteristic curve

These results suggest that clinical features like TI-RADS category, nodule size, echogenicity, and margins are effective in predicting malignancy, the researchers highlighted.
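
For readers who want to reproduce these measures on their own data, the short sketch below computes accuracy, AUC-ROC, and F1 score with scikit-learn; the example labels, probabilities, and 0.5 threshold are illustrative assumptions, not the study's data.

```python
# Sketch: computing the metrics reported above with scikit-learn.
# y_true, y_prob, and the 0.5 threshold are illustrative assumptions.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score, f1_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                   # 1 = malignant, 0 = benign
y_prob = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])   # model probabilities
y_pred = (y_prob >= 0.5).astype(int)                           # threshold at 0.5

print("Accuracy:", accuracy_score(y_true, y_pred))   # fraction of correct calls
print("AUC-ROC:", roc_auc_score(y_true, y_prob))     # ranking quality of probabilities
print("F1 score:", f1_score(y_true, y_pred))         # balance of precision and recall
```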

Additionally, imaging models showed the value of deep feature extraction, reaching up to 79% accuracy and 0.83 AUC-ROC. The hybrid model, meanwhile, achieved 85% accuracy and 0.87 AUC-ROC with XGBoost.

Finally, the stacking ensemble model achieved 87% accuracy, an F1 score of 0.87, and an AUC-ROC of 0.90. The team wrote that this shows the benefits of multimodal feature integration.

Future work will focus on external validation, optimizing the stacking ensemble framework, and exploring additional feature selection techniques, the study authors wrote.

“The model's clinical utility can also be improved by developing a user-friendly interface to support real-time decision-making by clinicians,” they added. “The stacking-based framework offers a scalable, reproducible tool to support more accurate clinical decision-making, reduce unnecessary biopsies, and improve diagnostic precision.”

The full study can be found here.
