Monday, November 27 | 10:10 a.m. - 10:20 a.m. | M3-SSIN02-5 | S404
Using natural language processing (NLP) to glean real-world evidence from radiology reports could ease the burden on those required to report on the postmarket safety of cancer drugs, according to a Monday morning presentation.
A team led by Richard Kinh Gian Do, MD, PhD, of Memorial Sloan Kettering Cancer Center retrospectively utilized NLP to extract cancer progression and response events from 3,503 consecutive radiology reports over a four-month period. Reports were retrieved if they had been prospectively labeled by radiologists with a departmental oncologic response standardized lexicon.
The standardized lexicon comprised multiple categories, including progression of disease (PD) and partial response (PR); it did not include a complete response category. The researchers used their NLP model to predict the PD and PR report response categories. The NLP model’s performance was measured by accuracy, precision, recall, and F-1 score, with the reports split into training, validation, and test sets of 80%, 10%, and 10%, respectively. Of the 3,503 reports, the NLP model labeled 772 (22%) as PD and 385 (11%) as PR.
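As a rough illustration of what an 80%/10%/10% split of 3,503 reports looks like, the sketch below shuffles report indices and slices them into three sets. The function name `split_indices`, the fixed seed, and the rounding rule (round down, with the remainder going to the test set) are assumptions made for this example; the presentation summary does not say how the team partitioned its data.

```python
import random


def split_indices(n_reports, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle report indices and slice them into train/validation/test sets.

    Hypothetical helper: the rounding rule and seed are assumptions,
    not details taken from the study.
    """
    indices = list(range(n_reports))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    n_train = int(n_reports * train_frac)  # fractions are rounded down
    n_val = int(n_reports * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train, val, test = split_indices(3503)
print(len(train), len(val), len(test))
```

Under these assumptions the three sets hold 2,802, 350, and 351 reports, so each test-set metric rests on a few hundred reports.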
According to Do, accuracies for the first NLP model created to predict PR were > 99%, 97%, and 95% for training, validation, and test sets, respectively. Precision, recall, and F-1 for the PR test set were 98.3%, 90.8%, and 94.4%. For the PD category, accuracies for the second NLP model were > 99%, 96%, and 98% for training, validation, and test sets. Precision, recall, and F-1 for the PD test set were 97.7%, 97.7%, and 97.7%, respectively.
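For readers less familiar with the metric, the F-1 score is the harmonic mean of precision and recall. The short sketch below applies that standard formula to the reported test-set precision and recall as a consistency check; the helper name `f1_score` is made up for this example, and the code is not the team's own evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F-1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Test-set precision and recall as reported in the presentation summary.
pr_f1 = f1_score(0.983, 0.908)  # partial response (PR)
pd_f1 = f1_score(0.977, 0.977)  # progression of disease (PD)

print(f"PR F-1: {pr_f1:.1%}")
print(f"PD F-1: {pd_f1:.1%}")
```

Both results agree with the reported F-1 scores of 94.4% for PR and 97.7% for PD.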
Do concluded that NLP can extract progression and response events from radiology reports with free-text impressions, with F-1 scores above 90%. Do plans further testing with multicenter reports next. Learn more in this Monday session.