LLM predicts CPT codes from interventional radiology notes

Will Morton, Associate Editor, AuntMinnie.com

An open-source large language model (LLM) can be used to assign Current Procedural Terminology (CPT) codes from interventional radiology postprocedure reports, according to researchers.

In a proof-of-concept study, a group led by Hossam Zaki, a medical student at Brown University in Providence, RI, fine-tuned a model called XLNet to assign CPT codes to reports containing the terms “embolization” and “catheter,” with highly accurate results. 

“[Interventional radiology] can benefit from AI-driven billing models, which may reduce administrative burdens, improve accuracy, increase revenue, and lower litigation risk,” the group wrote. The study was published January 24 in the Journal of Vascular and Interventional Radiology.

Translating medical procedures into standardized CPT codes is a critical component of healthcare billing, yet the coding industry remains highly inefficient, the researchers explained. Coding requires an average of 12 minutes per procedure, and studies indicate that up to 80% of medical bills contain errors. 

Interventional radiology procedures pose a particular challenge, as notes are highly detailed, unstructured, variable in style, and often involve multiple codes for a single intervention, the researchers wrote. Thus, in this study, the group hypothesized that AI could offer a solution by improving contextualization of procedural text. 

The researchers first trained XLNet on two sets of reports culled from the Medical Information Mart for Intensive Care (MIMIC-IV) dataset, a large database of information relating to patients admitted to critical care units, including observations and notes charted by care providers. One dataset comprised 1,590 reports with the term “embolization” and 17 associated CPT code labels, and the second included 5,590 reports with the terms “embolization” and “catheter” and 42 associated CPT codes. 
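Because a single interventional radiology report can carry several CPT codes at once, tasks like this are typically framed as multi-label classification, with each report's codes encoded as a multi-hot vector. A minimal sketch of that encoding step, using a handful of illustrative codes rather than the study's actual label sets:

```python
# Sketch: encoding a report's CPT codes as a multi-hot label vector,
# the form a multi-label classifier such as XLNet would train against.
# The code vocabulary below is illustrative, not the study's 17- or
# 42-label set from MIMIC-IV.

CPT_CODES = ["36010", "36246", "36597", "37243"]  # label vocabulary
CODE_INDEX = {code: i for i, code in enumerate(CPT_CODES)}

def multi_hot(codes):
    """Encode a list of CPT codes as a 0/1 vector over the vocabulary."""
    vec = [0] * len(CPT_CODES)
    for code in codes:
        vec[CODE_INDEX[code]] = 1
    return vec

# One embolization report may map to multiple codes simultaneously.
print(multi_hot(["37243", "36246"]))  # -> [0, 1, 0, 1]
```

The model then predicts an independent probability for each label, rather than choosing a single class.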

Next, using a portion of the datasets held out for testing, they evaluated the model’s performance at predicting the correct CPT codes from the reports. 

On the embolization dataset, XLNet achieved an area under the receiver operating characteristic (AUROC) score of 0.93, an area under the precision-recall curve (AUPRC) score of 0.86, and an F1 score of 0.84, indicating strong performance. The best-performing code was 37243 (vascular embolization and occlusion procedures), while one of the more challenging codes was 36246 (selective catheter placement in a second-order artery branch), the researchers reported. 

On the combined embolization and catheter dataset, XLNet achieved an AUROC of 0.99, an AUPRC of 0.86, and an F1 score of 0.85. The best-performing codes included 36597 (other central venous access procedures), while the least accurate included 36010 (intravenous vascular introduction and injection procedures). 
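The reported F1 scores combine precision and recall across all code labels. For multi-label tasks this is often micro-averaged, pooling true and false positives over every report-label pair; a small pure-Python sketch with illustrative vectors (not the study's predictions):

```python
# Sketch: micro-averaged F1 for multi-label predictions.
# y_true / y_pred are multi-hot vectors, one row per report;
# the example values are hypothetical.

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN across all labels and reports."""
    pairs = [(t, p) for row_t, row_p in zip(y_true, y_pred)
             for t, p in zip(row_t, row_p)]
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

y_true = [[1, 0, 1], [0, 1, 0]]  # true code assignments
y_pred = [[1, 0, 0], [0, 1, 0]]  # model predictions (one code missed)
print(round(micro_f1(y_true, y_pred), 2))  # -> 0.8
```

AUROC and AUPRC are computed per label from the model's predicted probabilities and then averaged, which is why the combined dataset can show a near-perfect AUROC while the F1, which depends on a fixed decision threshold, sits lower.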

“This proof-of-concept study demonstrates that fine-tuned open-source LLMs can accurately predict CPT codes from post-procedure [interventional radiology] reports,” the group wrote. 

While the LLM’s performance was commendable, before moving toward full autonomy, such models should undergo serial testing and staged deployment, the authors noted. Specifically, they should first be used as a confirmatory tool for human-generated codes, then as a primary code generator requiring human confirmation, and only thereafter considered for fully autonomous use, they wrote. 

“This study lays the groundwork for developing additional models related to CPT automation, within [interventional radiology] and beyond,” the group concluded. 

