BOSTON -- A new method that uses probability data extracted from radiology reports for specific conditions could assist in differential diagnosis, according to a Monday presentation at the Conference on Machine Intelligence in Medical Imaging (CMIMI).
With this probability data, on-demand diagnostic models can be created to solve clinical problems by applying the frequencies of diseases and imaging findings in a specific patient population, according to Charles Kahn Jr., MD, of the University of Pennsylvania in Philadelphia. He presented the research October 21 at the meeting, which is hosted by the Society for Imaging Informatics in Medicine (SIIM).
“We can take probability data mined from radiology reports to inform Bayesian networks for radiological diagnosis,” Kahn noted.
Bayesian network models, which apply probability theory to perform diagnostic reasoning, can overcome some of the challenges associated with modern neural networks, according to Kahn.
“They can explain their reasoning and account for missing or counterfactual information,” he said. “You can do a lot of interesting things with them.”
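As a rough illustration of how such a model handles missing information, the Python sketch below computes the probability of a disease from whichever findings happen to be observed. The two findings and every probability in it are hypothetical values chosen for illustration, not figures from the study, and the structure (one disease node with independent findings) is far simpler than the networks Kahn described.

```python
# Hypothetical numbers for illustration only; none come from the study.
P_DISEASE = 0.1  # prior probability of the disease

# P(finding present | disease present?) for two hypothetical findings.
P_FINDING = {
    "finding_a": {True: 0.8, False: 0.1},
    "finding_b": {True: 0.6, False: 0.2},
}

def posterior(observed):
    """Return P(disease | observed findings).

    `observed` maps a finding name to True (present) or False (absent).
    Findings left out are unobserved; in this simple structure they sum
    out of the joint probability, so they can simply be skipped.
    """
    joint = {}
    for disease in (True, False):
        p = P_DISEASE if disease else 1 - P_DISEASE
        for name, present in observed.items():
            p_present = P_FINDING[name][disease]
            p *= p_present if present else 1 - p_present
        joint[disease] = p
    return joint[True] / (joint[True] + joint[False])

print(posterior({}))                                      # prior only
print(posterior({"finding_a": True}))                     # one finding known
print(posterior({"finding_a": True, "finding_b": True}))  # both known
```

The same function answers the question whether zero, one, or both findings are known, which is the property Kahn contrasted with neural networks.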
However, construction of these models is often limited by a lack of real-world data, Kahn noted.
To develop their Bayesian network model, the researchers first utilized an application programming interface (API) to access the Radiology Gamuts Ontology (RGO), a reference source of more than 2,000 radiology differential-diagnosis listings. Next, they applied software to detect RGO entities in two years of radiology reports, comprising approximately 1.8 million reports on 1.3 million patients at a large U.S. academic health system.
The software used an approach called named entity recognition, along with negation detection.
“In other words, if something said ‘no evidence of pneumothorax’, it wasn’t counted as an occurrence of pneumothorax,” he said.
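The idea can be sketched in a few lines of Python. This is a minimal, rule-based illustration and not the software used in the study: the entity list, the negation cues, and the rule that a cue negates anything later in the same sentence are all assumptions made for the example.

```python
import re

# Hypothetical entity list; the study matched RGO entities, not this toy set.
ENTITIES = ["pneumothorax", "ascites", "pleural effusion"]
# A few illustrative negation cues (an assumption, not the study's rules).
NEGATION_CUES = ["no evidence of", "no", "without", "negative for"]

def detect_entities(report):
    """Return the set of entities mentioned in a report and not negated."""
    found = set()
    # Split into sentences so a cue only affects its own sentence.
    for sentence in re.split(r"[.;\n]+", report.lower()):
        for entity in ENTITIES:
            pattern = r"\b" + re.escape(entity) + r"\b"
            for match in re.finditer(pattern, sentence):
                # Text before the mention, searched for any negation cue.
                prefix = sentence[:match.start()]
                negated = any(
                    re.search(r"\b" + re.escape(cue) + r"\b", prefix)
                    for cue in NEGATION_CUES
                )
                if not negated:
                    found.add(entity)
    return found

print(detect_entities("No evidence of pneumothorax. Moderate ascites is present."))
```

A rule this blunt will over-negate some sentences, since any earlier cue suppresses every later mention; production systems use more careful scoping.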
A matrix of 17,000 entities was produced across all of the patients. Occurrences were tabulated for each entity, and any pair of entities can also be analyzed for co-occurrence.
Of the 17,000 entities in the ontology, about 2,700 entities actually occurred within the dataset of radiology reports used in the study, Kahn said.
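The tabulation step might look something like the following Python sketch. The patient identifiers and entity sets are hypothetical stand-ins for the output of the detection step, and a sparse counter is used here in place of a full 17,000-entity matrix.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-patient entity sets, standing in for detection output.
patients = {
    "p1": {"ascites", "cirrhosis"},
    "p2": {"ascites", "cirrhosis", "pleural effusion"},
    "p3": {"pleural effusion"},
    "p4": {"ascites"},
}

occurrences = Counter()     # entity -> number of patients with it
co_occurrences = Counter()  # (entity_a, entity_b) -> patients with both

for entities in patients.values():
    occurrences.update(entities)
    # Sorting first means each unordered pair is stored under a single key.
    co_occurrences.update(combinations(sorted(entities), 2))

print(occurrences["ascites"])                    # patients with ascites
print(co_occurrences[("ascites", "cirrhosis")])  # patients with both
```

Storing only the pairs that actually occur keeps the table small when, as in the study, most entities in the ontology never appear in the reports.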
The software was able to aggregate probability data around the specified RGO entity and entities that could cause or be caused by it, according to Kahn.
The generated Bayesian network model was encoded in the Structural Modeling, Inference, and Learning Engine (SMILE) format, and the GeNIe platform (BayesFusion) performed diagnostic inference.
For a specific RGO entity – for example, ascites – the age and sex distribution for the condition is displayed, along with probabilities computed from the entities' co-occurrence.
“What that allows you to do is build out a model specific for the condition at hand,” Kahn said.
The data can be used to examine all possible causes of the condition and derive their probabilities based on clinical practice, according to Kahn.
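One simple way to turn such counts into a ranked differential is sketched below in Python. The counts are invented for illustration, and the calculation assumes the candidate causes are mutually exclusive and exhaustive, a simplification that a full Bayesian network would not need.

```python
# Hypothetical counts: patients in whom each candidate cause
# co-occurred with the finding of interest (here, ascites).
co_occurrence_with_finding = {
    "cirrhosis": 120,
    "heart failure": 45,
    "ovarian carcinoma": 15,
}

def rank_differential(co_counts):
    """Rank causes by their share of co-occurrences with the finding.

    Under the single-cause assumption, P(cause | finding) is proportional
    to P(finding | cause) * P(cause), which reduces to the co-occurrence
    count divided by the total across all candidate causes.
    """
    total = sum(co_counts.values())
    if total == 0:
        raise ValueError("no co-occurrences to rank")
    shares = {cause: n / total for cause, n in co_counts.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)

for cause, probability in rank_differential(co_occurrence_with_finding):
    print(f"{cause}: {probability:.2f}")
```

Because the counts come from local reports, rerunning the same calculation at another institution would yield a differential weighted to that institution's patients.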
“One of the strengths of the approach like this is it really lets you tune your differential diagnosis to your specific patient population,” he said.
In future work, “We are looking at this as a way to integrate textbook data that you have with these gamuts-type listings with local data that informs it and look at ways to use this to help evaluate diagnostic effectiveness,” Kahn said.
In addition, the researchers are using this technique in a parallel research project on the diagnosis of rare diseases.
“These can be very complex and because of their infrequency, they can be very challenging to formulate a differential diagnosis,” he said.
Check out AuntMinnie.com's coverage of CMIMI 2024 here.