An artificial intelligence (AI) model based on routine data from F-18 FDG-PET/CT scans can improve the accuracy of lymph node staging in patients with lung cancer, according to a group in Germany.
A team of researchers led by Dr. Julian Rogasch of the Charité-Universitätsmedizin Berlin developed a machine learning tool to help assess mediastinal lymph node metastases in patients with non-small cell lung cancer (NSCLC). They found the model was more accurate than visual PET scores and suggest it could help doctors plan treatment more accurately. Their results were published February 23 in the European Journal of Nuclear Medicine and Molecular Imaging, and the model is available as a web application to further facilitate validation, they wrote.
"The model developed here is intended for use in routine clinical care to estimate the probability of N2/3 disease based on F-18 FDG-PET/CT findings," Rogasch and colleagues wrote. "It is based on routinely available variables, the majority of which are already part of F-18 FDG-PET/CT reporting in routine clinical care."
In patients with NSCLC, accurate imaging of thoracic lymph node metastases is essential for treatment planning. Patients with disease classified as N0/1 are usually referred to surgery while those with more widespread N2/3 disease require multimodal treatment, the authors explained. While F-18 FDG-PET/CT is the most effective approach, it can be limited by false-positive findings, and invasive biopsies are often required to confirm suspicious cases, they noted.
To evaluate an AI model designed to assist in these cases, the researchers collected data on 491 NSCLC patients who underwent pretherapy imaging using either an analog PET/CT scanner or a digital PET/CT scanner. Patients were split into cohorts based on the scanner used, with 385 in the analog training and test cohort and 106 in the digital validation cohort.
Forty clinical variables, tumor characteristics, and image variables, such as lymph node F-18 FDG radiotracer uptake (SUVmax), were collected, and different combinations of the variables were compared across several machine learning methods. The researchers selected a gradient-boosting classifier with 10 features as the final model.
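To illustrate how a gradient-boosting classifier turns a handful of features into a predicted probability, the following is a minimal sketch in plain Python. It is not the authors' model or code: the two features (lymph node SUVmax and short-axis diameter), the toy patient values, and the function names `fit_stump`, `fit_gradient_boosting`, and `predict_proba` are all hypothetical, and the boosting scheme is a simplified variant that fits one-split "stumps" to the residuals of the logistic loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimises squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds lie midway between neighbouring distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.3):
    """Simplified gradient boosting for binary labels using the logistic loss."""
    pos_rate = sum(y) / len(y)
    base = math.log(pos_rate / (1.0 - pos_rate))  # initial log-odds
    scores = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss: label minus current probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, lval, rval = fit_stump(X, residuals)
        stumps.append((j, thr, lval, rval))
        scores = [s + learning_rate * (lval if row[j] <= thr else rval)
                  for row, s in zip(X, scores)]
    return base, stumps, learning_rate


def predict_proba(model, row):
    """Predicted probability of the positive class (here: hypothetical N2/3 disease)."""
    base, stumps, lr = model
    score = base + sum(lr * (lval if row[j] <= thr else rval)
                       for j, thr, lval, rval in stumps)
    return sigmoid(score)


# Hypothetical toy data: [lymph node SUVmax, short-axis diameter in mm]; 1 = N2/3.
X = [[1.5, 6], [2.0, 7], [2.4, 8], [3.0, 9],
     [4.5, 11], [5.5, 12], [6.0, 14], [8.0, 15]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_gradient_boosting(X, y)
print(round(predict_proba(model, [7.0, 13]), 2))  # high-uptake node
print(round(predict_proba(model, [1.8, 6]), 2))   # low-uptake node
```

In practice a library implementation with deeper trees, regularization, and cross-validated feature selection would be used; the sketch only shows the mechanism of adding small corrective models round by round.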
The gradient-boosting classifier showed a significantly higher area under the receiver operating characteristic curve (AUC) than the visual PET score in the training and test cohort (0.91 vs. 0.87, p = 0.003). In the validation cohort, its AUC was slightly higher, but the difference was not statistically significant (0.94 vs. 0.91, p = 0.23).
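For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way in plain Python. The labels, model probabilities, and visual scores are invented toy values, not data from the study, and the function name `roc_auc` is an assumption made for this example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = N2/3 disease, 0 = N0/1 disease.
labels = [0, 0, 0, 1, 1, 1]
model_prob = [0.10, 0.30, 0.55, 0.40, 0.80, 0.90]  # continuous model output
visual_score = [1, 2, 3, 2, 3, 4]                  # ordinal visual score

print("model AUC: ", roc_auc(labels, model_prob))
print("visual AUC:", roc_auc(labels, visual_score))
```

With these toy values, the model ranks 8 of the 9 positive/negative pairs correctly, while the coarser ordinal score is penalized by ties. Whether a difference between two AUCs is statistically significant requires a separate test, which this sketch does not cover.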
Notably, the gradient-boosting classifier was superior to the visual criterion of lymph node radiotracer uptake higher than that of the mediastinum, which is the most common and validated diagnostic threshold in visual F-18 FDG-PET/CT assessment, the authors wrote. They also reported that harmonizing PET images between the two scanners affected SUVmax and visual assessment of the lymph nodes but did not diminish the AUC of the model.
To facilitate further validation, the group has published all variables and the machine-learning code as open data, with a web application available for use on GitHub. The web app takes user input of 10 features and calculates the predicted probability of N2/3 disease.
The study findings are promising and warrant further research, according to Rogasch and colleagues.
"External validation and proof of its validity in an interventional trial are pending," they concluded.