ChatGPT makes echo reports readable for patients

Jul 31, 2024

ChatGPT may be on its way toward explaining echocardiography results to patients, according to research published July 31 in JACC: Cardiovascular Imaging.

A team led by Lior Jankelson, MD, PhD, from New York University found that echocardiographers deemed about three-quarters of the AI-generated explanations in the study suitable to send to patients.

“These results suggest that generative AI can produce accurate and comprehensive explanations of echocardiogram reports, necessitating clinician editing in a minority of cases,” Jankelson and colleagues wrote.

Imaging reports such as those from echocardiography exams can be technical and difficult for patients to understand. The researchers suggested that AI tools could help clinicians explain imaging results to patients immediately, which could reduce patient anxiety and physician workload.

Jankelson and colleagues evaluated the performance of ChatGPT-4 for creating patient-facing echocardiogram reports. They used the chatbot to explain reports from 100 patients. Five echocardiographers then rated the generated explanations on five-point Likert scales for acceptance, accuracy, relevance, understandability, and representation of quantitative information. Additional questions assessed the importance of inaccurate or missing information.

The echocardiographers either agreed or strongly agreed that 73% of the GPT-generated explanations of echocardiograms were suitable to send to patients without modifications. Additionally, the experts rated 84% of the explanations as “all true” and 16% as “mostly correct.”

The echocardiographers also judged that 76% of the AI-generated explanations contained all the important information and that another 15% contained most of it.

The experts reported that none of the generated explanations that had incorrect or missing information were potentially dangerous and noted median Likert scores of either four or five for accuracy, relevance, and understandability.

Jankelson’s team used Flesch-Kincaid grading to determine the readability of the generated explanations. For the original echocardiogram reports, the median grade level was 10.8 and the median Flesch reading ease score was 42.2 (out of 100), a score that corresponds to a college reading level. The GPT-generated explanations showed a median grade level of 9.4 and a median reading ease score of 62.7, a score that corresponds to between an 8th- and 9th-grade reading level.

Going a step further, the team excluded technical language repeated directly from the echocardiogram reports and scored only the explanatory text. This lowered the median grade level to 7.3 and raised the median reading ease score to 75.9, a 7th-grade reading level.
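
For readers unfamiliar with these metrics, the sketch below shows how the two scores are conventionally computed and how reading ease scores map to reading levels. It uses the standard published Flesch formulas and the commonly cited score bands; the function names are illustrative, the word, sentence, and syllable counts are passed in as inputs, and the article does not describe the tooling the study authors used.

```python
# Illustrative sketch of the standard Flesch readability formulas.
# Function names are hypothetical; this is not the study authors' code.

def flesch_reading_ease(words, sentences, syllables):
    """Flesch reading ease: higher scores mean easier text (roughly 0 to 100)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def reading_ease_band(score):
    """Map a reading ease score to the commonly cited reading-level band."""
    bands = [
        (90, "5th grade"),
        (80, "6th grade"),
        (70, "7th grade"),
        (60, "8th-9th grade"),
        (50, "10th-12th grade"),
        (30, "college"),
        (10, "college graduate"),
    ]
    for floor, label in bands:
        if score >= floor:
            return label
    return "professional"


# Median reading ease scores reported in the article.
for name, score in [
    ("original reports", 42.2),
    ("GPT explanations", 62.7),
    ("explanatory text only", 75.9),
]:
    print(f"{name}: {score} -> {reading_ease_band(score)}")
```

Run against the medians reported above, the banding reproduces the reading levels quoted in the article: college for the original reports, 8th-9th grade for the full explanations, and 7th grade for the explanatory text alone.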

The study authors stressed that, despite the promising results, clinician oversight remains essential for AI-generated reporting. They added that translating echocardiogram reports into non-technical language “likely carries a lower risk of patient harm in comparison to other clinical tasks that involve critical thinking, medical advice, or diagnosis.”

“These findings support future prospective studies of AI-generated echocardiogram explanations in clinical practice,” the authors wrote.
