Determining which patients need contrast for musculoskeletal MRI scans can soak up a lot of time and resources. What if you could use artificial intelligence instead? Researchers found that IBM Watson could help optimize the process, according to a September 18 paper in the Journal of Digital Imaging.
Researchers at the University of California, San Francisco (UCSF) evaluated the accuracy of three approaches for determining which patients should get contrast for musculoskeletal MRI scans based on clinical indications: IBM Watson, traditional machine-learning algorithms, and a radiologist. They found that Watson achieved an overall accuracy of 83.2% when compared with the original radiologist (JDI, September 18, 2017).
"If applied in a real clinical setting, Watson can preliminarily assign for a musculoskeletal MRI contrast protocol and have the radiologist then make the final [designation]," senior author Dr. Jae Ho Sohn told AuntMinnie.com. "This would reduce time and improve accuracy."
Forward thinking
Sohn personally encountered IBM Watson for the first time at Stanford's Health Hackathon (health++) in 2016, and he was initially drawn to the technology due to its massive display of computing horsepower. Sohn was particularly captivated by Watson's knack for natural language processing, in which computers are used to analyze and derive meaning from large blocks of unstructured text.
He believed there could be an application for this skill in the field of radiology. And as a lifelong student of math and a resident physician, Sohn aspired to combine those two areas.
"Whenever there is a cross-pollination of ideas from another field with medicine, it is a point of creativity and innovation," he said.
In radiology, Sohn and colleagues recognized the high cost of MRI and the extensive effort that precise interpretation involves. The modality requires numerous protocol decisions, including whether to assign contrast, that are subject to variation, if not error. What's more, previous studies have shown that MRI contrast agents can have harmful consequences when inappropriately assigned, such as allergic reactions or the development of nephrogenic systemic fibrosis. Concerns over gadolinium deposition are another reason to reduce unnecessary MRI contrast use.
Noting such issues, the researchers set out to use the language processing capacity of Watson to create an algorithm for automatically assigning intravenous contrast for musculoskeletal MRI protocols. They anticipated that the algorithm would help improve productivity and decrease the likelihood of a suboptimal study.
Smart enough?
Sohn and colleagues started with 1,544 musculoskeletal MRI exams, with protocols and clinical indications in free text. Radiology residents and fellows had previously determined whether or not contrast should be used with the scans.
These contrast assignments were then compared with determinations made in three different ways:
- By a radiologist
- Using eight traditional machine-learning models run on personal laptops
- From IBM's Watson Bluemix Natural Language Classifier
A second reader also made contrast assignments based on the free text so that interreader agreement could be determined.
Data preprocessing of the free text was also required before the researchers could enter it into the traditional machine-learning algorithms. They did this by manually stripping punctuation marks, white space, and filler words that did not add to the clinical meaning, such as "and," "with," "please," etc. This rigorous, time-consuming procedure is performed automatically by the IBM Watson Natural Language Classifier, according to the researchers.
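The manual cleanup described above can be approximated in a few lines of Python. This is only a minimal sketch, not the researchers' code: the function name `preprocess` and the sample indication are hypothetical, and the filler-word list contains just the three examples quoted in the article because the full list is not given.

```python
import string

# Filler words named in the article; the researchers' complete list is not
# reported, so this set is an assumption limited to the quoted examples.
FILLER_WORDS = {"and", "with", "please"}

# Map every punctuation character to a space so adjacent words stay separated.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(indication: str) -> str:
    """Lowercase the text, drop punctuation and filler words, collapse whitespace."""
    text = indication.lower().translate(PUNCT_TO_SPACE)
    tokens = [token for token in text.split() if token not in FILLER_WORDS]
    return " ".join(tokens)


# Hypothetical free-text clinical indication.
print(preprocess("Please eval for  osteomyelitis, and abscess."))
# prints: eval for osteomyelitis abscess
```

Splitting on whitespace and rejoining with single spaces handles the extra white space without a separate step.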
Out of the 1,520 MRI cases examined (24 from the initial total were excluded due to unresolvable ambiguity), 1,240 were used to "train" the Watson algorithm and 280 to test it.
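The case counts above can be checked with a short sketch. The article does not say how cases were allocated to the training and test sets, so the seeded random shuffle here is an assumption, and `split_cases` is a hypothetical helper.

```python
import random

TOTAL_EXAMS = 1544   # exams the researchers started with
EXCLUDED = 24        # excluded due to unresolvable ambiguity
TRAIN_SIZE = 1240    # cases used to train Watson


def split_cases(case_ids, train_size, seed=0):
    """Shuffle case IDs reproducibly, then cut them into train and test lists."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # assumed: random allocation
    return shuffled[:train_size], shuffled[train_size:]


usable_cases = range(TOTAL_EXAMS - EXCLUDED)
train_cases, test_cases = split_cases(usable_cases, TRAIN_SIZE)
print(len(usable_cases), len(train_cases), len(test_cases))
# prints: 1520 1240 280
```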
Watson was relatively accurate in its assignment of intravenous contrast for MRI exams using only the provided free-text clinical indication.
Accuracy of algorithms for assigning MSK MRI contrast
| | Traditional machine-learning algorithms (vs. original reader) | Watson (vs. original reader) | Watson (vs. second reader) |
|---|---|---|---|
| Accuracy | 80% | 83.2% | 88.6% |
Watson's Natural Language Classifier turned out to be an improvement over the eight traditional machine-learning algorithms. Individually, the traditional algorithms had overall accuracies ranging from 70% to 75%, and that figure rose to 80% only after they were combined in an ensemble. Even then, the ensemble of eight did not match IBM Watson's level of accuracy, Sohn noted.
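The article does not describe how the eight models were combined. One common way to build such an ensemble is a majority vote, sketched below under that assumption. The function name, the toy predictions, and the tie-breaking rule (ties go to "without" contrast) are all illustrative choices, not details from the study.

```python
def majority_vote(model_predictions, positive="with", negative="without"):
    """Combine per-model label lists into one label per case.

    A case is labeled `positive` only when strictly more than half of the
    models vote for it; ties fall back to `negative` (an assumed rule).
    """
    ensemble = []
    for votes in zip(*model_predictions):  # one tuple of votes per case
        positive_votes = sum(1 for vote in votes if vote == positive)
        ensemble.append(positive if positive_votes * 2 > len(votes) else negative)
    return ensemble


# Toy predictions from four hypothetical models on four cases.
toy_predictions = [
    ["with", "without", "with", "without"],
    ["with", "without", "without", "without"],
    ["without", "without", "with", "with"],
    ["with", "with", "without", "with"],
]
print(majority_vote(toy_predictions))
# prints: ['with', 'without', 'without', 'without']
```

With an even number of voters, as with eight models, ties are possible, which is why the sketch needs an explicit tie-breaking rule.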
When compared with the original protocol, Watson achieved a sensitivity of 74.3%, specificity of 92.1%, positive predictive value of 90.4%, and negative predictive value of 78.2%. Furthermore, looking exclusively at cases in which the original radiologist and second radiologist agreed, Watson's accuracy climbed even further, reaching 90%.
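For readers who want to see how these figures relate, the sketch below computes them from a 2x2 confusion matrix. The article does not report the underlying case counts. The counts used here are a reconstruction that reproduces the reported percentages on a 280-case test set, and they should be treated as an assumption rather than as data from the paper.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return standard binary-classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # contrast cases correctly assigned contrast
        "specificity": tn / (tn + fp),   # non-contrast cases correctly left without
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Assumed counts (not from the paper), chosen to match the reported figures.
reconstructed_counts = {"tp": 104, "fn": 36, "fp": 11, "tn": 129}
metrics = diagnostic_metrics(**reconstructed_counts)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# prints:
# accuracy: 83.2%
# sensitivity: 74.3%
# specificity: 92.1%
# ppv: 90.4%
# npv: 78.2%
```

Under these assumed counts, the two error cells (36 false negatives and 11 false positives) add up to 47 misclassified cases.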
Striving for perfection
On top of high accuracy in assigning MRI contrast, IBM Watson also has the advantage of functioning without preprocessing, and those who operate it need only a minimal understanding of machine-learning fundamentals, according to the researchers.
Those benefits aside, Watson still committed 47 errors. Among them, 22 were disagreements with the first radiologist but not the second, primarily due to a lack of additional clinical data that was only available to the original physician. The remaining 25 errors could have been due to spelling and grammar mistakes or any number of reasons that are still unclear.
One of the errors was critical: Watson assigned gadolinium-based contrast to a patient with end-stage renal disease, which could be seriously harmful, if not fatal.
"One of the downsides of this algorithm is that sometimes it does commit the mistake of assigning contrast when it shouldn't have," Sohn said. "We cannot accept even one error in this aspect. These small types of errors are critical."
Overall, however, Watson did present a favorable error profile because it erred on the side of not assigning contrast, he offered. The false-negative rate (not assigning contrast when it should have) was higher than the false-positive rate (assigning contrast when it shouldn't have), and failing to assign contrast is the safer of the two errors.
A suitable helper
A major strength of using Watson to assist with MRI protocols is that it would not alter normal workflow all that much, Sohn said.
What would Watson look like in clinical practice? The clinician would put in the order for the MRI with free-text clinical indications as usual, and Watson would preliminarily assign it a "with" or "without" contrast protocol. At this point, the radiologist would review the recommendation and give it a thumbs up or thumbs down, he explained.
In this scenario, Watson would be less a competitor of radiologists than a support tool -- or better yet, a helping hand.
"So radiologists have the final say, while IBM Watson reduces the amount of work and increases efficiency," he said. "If anything, we should be excited about the opportunities it can open up for us."
The UCSF research team has been walking down that very path: Sohn is now overseeing a number of studies using deep learning to assist with even more complex MRI protocols than the binary "yes" or "no" for contrast, such as what sequences to use and how patients should be positioned.
Early data from these studies have been promising, but they are not ready for full-scale clinical integration just yet.
"There are some ongoing studies in which I've been applying IBM Watson," Sohn said. "They've reached 93% accuracy, but it's still not enough. Even if it achieves 99%, the 1%, if it can harm a patient, means it's not enough. It's going to require a little more time and a lot more resources, but we're really happy with the information we've been gaining."