A large language model (LLM) outperforms the radiology lexicon RadLex's own term expansion for covering terminology in clinical radiology reports, according to a study published January 14 in the American Journal of Roentgenology.
The research results suggest that the LLM "can aid real-world natural language processing applications," wrote a team led by Taehee Lee, MD, of Seoul National University Hospital in South Korea.
Medical ontologies -- that is, structured frameworks that organize clinical concepts and their relationships -- are key to standardizing terminology to support clinical and research applications, including clinical decision-support systems, natural language processing tools, and other AI tools, the investigators explained. RadLex was developed by the RSNA in 2005. It offers radiologists a common language for communicating diagnostic results and includes 75,000 terms that can be used for reporting, decision support, data mining, data registries, education, and research. But its coverage of clinical radiology reports "remains limited due to radiologists' linguistic variation," the group wrote, noting that, for example, a pulmonary nodule may be reported as a lung nodule, a nodular opacity, or a solitary pulmonary lesion.
To address this problem, the researchers used an LLM (Gemini 2.0 Flash Thinking) to generate an expanded set of terms and synonyms for RadLex and to evaluate the effect of this expansion on "lexical coverage rate and semantic term recognition" using clinical radiology reports. The "lexical alternatives" included morphologic and orthographic variants, acronyms and abbreviations, and synonyms for 40,000 commonly used RadLex terms. ("Lexical coverage rate" measures the extent to which report terms matched a given expression list, while semantic term recognition, or "semantic coverage," relates to readers' ability to recall "general knowledge, facts, concepts, and word meanings not tied to personal experiences," the authors explained.)
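To make the lexical coverage idea concrete, below is a minimal Python sketch (not the study's code): the lexical_coverage function, the toy expansion list, and the sample report phrases are all hypothetical, echoing the pulmonary-nodule example above.

```python
# Minimal sketch (hypothetical, not the study's implementation): a lexical
# coverage rate is the fraction of report phrases that match an expansion's
# expression list via simple string matching.

def lexical_coverage(report_phrases, expression_list):
    """Return the fraction of report phrases found in the expression list."""
    expressions = {e.lower() for e in expression_list}
    matched = [p for p in report_phrases if p.lower() in expressions]
    return len(matched) / len(report_phrases) if report_phrases else 0.0

# Hypothetical expansion: one RadLex concept plus LLM-style lexical alternatives.
expansion = ["pulmonary nodule", "lung nodule", "nodular opacity",
             "solitary pulmonary lesion"]

# Hypothetical phrases extracted from a chest CT report.
report_phrases = ["lung nodule", "nodular opacity", "ground-glass opacity"]

print(f"Lexical coverage rate: {lexical_coverage(report_phrases, expansion):.0%}")
# -> 67%: "ground-glass opacity" has no match in this toy expansion.
```

A richer expression list raises this rate, which is the effect the study quantified on clinical report data.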
The group used data from five sets of chest CT reports: two from South Korea and one each from Spain, Turkey, and the U.S. The researchers calculated the lexical coverage rate for each expansion and randomly selected 100 reports from each dataset for manual review, against which precision, recall, and F1 score were measured.
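For the semantic evaluation, the manual review serves as the reference against which precision, recall, and F1 score are computed. The sketch below shows the standard formulas; the function name and the counts are hypothetical and are not taken from the study.

```python
# Minimal sketch (hypothetical counts, not the study's data): precision,
# recall, and F1 for term recognition scored against manual-review labels.

def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Standard definitions: precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: manual review identifies 100 reference terms; an
# expansion recognizes 80 of them (TP), misses 20 (FN), and also flags
# 5 expressions the reviewers did not accept (FP).
p, r, f1 = precision_recall_f1(true_positives=80, false_positives=5,
                               false_negatives=20)
print(f"precision={p:.1%}  recall={r:.1%}  F1={f1:.2f}")
# precision=94.1%  recall=80.0%  F1=0.86
```

Because F1 is the harmonic mean of precision and recall, a large gain in recall can outweigh a small drop in precision, which is the pattern seen in the results below.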
An existing RadLex-provided expansion, created without the LLM, added 17,515 terms; the LLM-generated expansion added 208,465 additional lexical variants and 69,918 synonyms.
Lee and colleagues reported the following:
Comparison of RadLex term expansion to LLM term expansion across 5 datasets

| Measure | RadLex-provided expansion | LLM-generated expansion |
|---|---|---|
| Lexical coverage rate | 67.5% | 81.9% |
| Semantic recall (coverage) | 64% | 81.6% |
| Semantic precision | 100% | 94.8% |
| F1 score | 0.86 | 0.91 |
"Across multinational datasets of clinical chest CT reports, the LLM-generated term expansion yielded improved lexical coverage and semantic recall, with only small loss of semantic precision, compared with the RadLex-provided expansion," they noted, concluding that "this LLM-based expansion strategy can complement manual ontology refinement and support scalable radiology report standardization."