ChatGPT can help to improve radiology reporting and potentially enhance patient engagement, but human oversight is needed to reduce the risk of patient harm, according to an article published April 5 in the American Journal of Roentgenology.
Radiologists at the University of Alabama at Birmingham wrote that large language models (LLMs) like ChatGPT can help automate radiology reports and answer patients' questions about findings in those reports. But these algorithms can be error-prone, partly because their command of the nuances of language lets them generate fluent, human-like responses that can sound convincing even when they are wrong.
"LLMs also have weaknesses and challenges, such as biases, data privacy concerns, lack of interpretability, susceptibility to adversarial attacks, and ethical considerations," wrote Drs. Asser Elkassem and Andrew Smith, PhD, of the school's Marnix E. Heersink Institute for Biomedical Innovation.
The authors cited the example of an LLM developed by Meta called Galactica. The model was trained on 48 million scientific papers and designed to reason about scientific knowledge, but it was taken down last year after only two days of public availability because it frequently generated racist comments.
The risk of such errors may be heightened because LLMs are not yet fully optimized for medical information, according to the authors. Further training models like ChatGPT on large volumes of radiology reports and other medical text could enhance their proficiency in generating accurate responses, they suggested. This could also reduce the chance of ChatGPT generating responses that incorporate medical misinformation picked up from internet sources, they wrote.
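Purely as an illustration of what such domain adaptation could look like in practice, and not a method the authors describe, the sketch below uses OpenAI's fine-tuning API to adapt a general-purpose model to radiology text. The file name, system prompt, and model choice are assumptions made for the example, and any training data would need to be de-identified.

```python
# Hypothetical sketch: adapting a general-purpose LLM to radiology text
# via OpenAI's fine-tuning API. File name, prompt, and model choice are
# illustrative assumptions, not the authors' workflow.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Training data: one JSON object per line, pairing report findings with a
# radiologist-approved impression (all text de-identified), e.g.:
# {"messages": [
#   {"role": "system", "content": "You summarize radiology findings."},
#   {"role": "user", "content": "<findings text>"},
#   {"role": "assistant", "content": "<approved impression>"}]}
training_file = client.files.create(
    file=open("deidentified_reports.jsonl", "rb"),  # assumed file name
    purpose="fine-tune",
)

# Launch the fine-tuning job against a fine-tunable base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```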
In the meantime, the authors encouraged oversight of patients' use of ChatGPT, with clinicians validating the summaries the chatbot generates, even though this may diminish any potential gains in efficiency.
"We expect that future models will have improved performance, specifically for healthcare," they wrote.
Nonetheless, the authors gave ChatGPT a green light as an aid for radiologists in generating reports, which are often summarized in the form of a conclusion or impression. ChatGPT could be used to expedite and improve this process, they wrote.
In their academic radiology practice, for instance, they use a similar program called Rad AI Omni (Rad AI) to generate report impressions from the text of the current report. The approach appears to improve efficiency, especially when the image interpretation workflow is interrupted, and radiologists can choose whether to use the program based on personal preference, according to the authors.
By comparison, ChatGPT could summarize the current report while also drawing on prior reports or the electronic medical record to provide longitudinal context. This could further enhance reporting accuracy and efficiency while reducing omissions and cognitive load, Elkassem and Smith suggested.
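Again as a rough illustration only, and not a workflow the authors describe, the sketch below shows how a general-purpose LLM API might be prompted to draft an impression from the current findings plus a prior report. The prompt wording, placeholder variables, and model name are assumptions, and any draft would still require radiologist validation before signing.

```python
# Hypothetical sketch: drafting a report impression from current findings
# plus a prior impression for longitudinal context. Prompt text and model
# are illustrative assumptions; output is a draft for radiologist review.
from openai import OpenAI

client = OpenAI()

current_findings = "..."   # findings section of the current report (placeholder)
prior_impression = "..."   # impression from the most recent prior study (placeholder)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You draft radiology report impressions. Note interval "
                    "changes relative to the prior study and flag uncertainty."},
        {"role": "user",
         "content": f"Prior impression:\n{prior_impression}\n\n"
                    f"Current findings:\n{current_findings}\n\n"
                    "Draft a concise impression."},
    ],
)

print(response.choices[0].message.content)  # draft only; must be validated
```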
"Of course, the use of LLMs like ChatGPT for complicated tasks could introduce errors and bias, and the radiologist would need to validate the report summaries," they concluded.