AI algorithm could reduce wrong-side medical mistakes

Aug 4, 2019


Wrong-side medical procedures are among the most devastating -- and preventable -- errors in healthcare. But what if you could stop wrong-side mistakes simply by analyzing radiographs with a machine-learning algorithm? A research group gave the idea a try and described their findings in the August edition of the Journal of Digital Imaging.

Medical errors involving procedures on the incorrect body part or the wrong side of the patient persist despite checklists and other best practices designed to prevent them. One contributing factor to wrong-side procedures is absent or incorrect laterality information, which indicates the proper orientation of the patient's x-ray, according to Drs. Ross Filice and Shelby Frantz of Georgetown University (J Digit Imaging, August 2019, Vol. 32:4, pp. 656-664).

Indeed, up to half of the 500,000 radiographs performed each year at Georgetown University lack properly encoded DICOM metadata indicating laterality, with a small fraction indicating incorrect laterality. Incorrect laterality information in radiographs increases the likelihood of clinical errors downstream as x-ray images are used for treatment planning and decision-making.

At the same time, machine-learning and deep-learning technologies have been emerging as tools that are potentially useful for a variety of clinical applications. While most of the high-profile work has focused on image interpretation, Filice and Frantz speculated that a deep-learning algorithm could be developed to classify radiographs by laterality.

The research group first analyzed a set of radiographs that included laterality-specific information: 15,405 unique images from 4,619 exams of nine different body parts. The images were reviewed to ensure that they were categorized correctly, and any errors were fixed. Lead markers within the images were considered ground truth; images on which laterality could not be determined were placed into an "unknown" category.

Next, the group developed machine-learning models for classifying the images using pretrained GoogLeNet or AlexNet networks. The algorithms were refined to allow simultaneous detection of two different objects: the "L" and "R" lead markers that denote the left and right side of the patient.

The model was designed to return a prediction for each class -- namely "L," "R," or "U" -- with a confidence score indicating the algorithm's confidence in its own prediction. In studies with more than one image, laterality was determined by majority rule or confidence scores.
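The exam-level step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `exam_laterality`, the input format, and the choice to use summed confidence only as a tie-breaker for the majority vote are assumptions made for illustration, since the article says only that majority rule or confidence scores were used.

```python
from collections import Counter, defaultdict


def exam_laterality(predictions):
    """Combine per-image predictions into one exam-level label.

    `predictions` is a list of (label, confidence) pairs, one per image,
    where label is "L", "R", or "U". This input format is hypothetical.
    """
    if not predictions:
        return "U"  # assumption: an exam with no images is treated as unknown

    votes = Counter(label for label, _ in predictions)
    summed_confidence = defaultdict(float)
    for label, confidence in predictions:
        summed_confidence[label] += confidence

    # Majority rule first; summed confidence breaks ties (an assumption).
    return max(votes, key=lambda label: (votes[label], summed_confidence[label]))


if __name__ == "__main__":
    # Two images vote "L", one votes "R": the majority wins.
    print(exam_laterality([("L", 0.90), ("L", 0.80), ("R", 0.99)]))
    # One vote each: the higher-confidence label wins the tie.
    print(exam_laterality([("L", 0.60), ("R", 0.95)]))
```

Using confidence only to break ties keeps a single overconfident image from overriding several images that agree; a deployment might reasonably weigh the two signals differently.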

How well did the algorithm perform? It produced an area under the curve of 0.999 for correctly classifying R and L laterality, which was comparable to the performance of a radiologic technologist. In one test on a batch of 292 images, the algorithm performed as follows, with results broken down by marker type at both the image and exam level:

Accuracy of deep-learning model for classifying radiograph laterality
	Sensitivity	Specificity	Accuracy
Images with L markers	100%	99.3%	99.7%
Images with R markers	99.3%	100%	99.7%
Exams with L markers	100%	100%	100%
Exams with R markers	98%	100%	99%
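For readers unfamiliar with the three measures in the table, the following Python sketch shows how sensitivity, specificity, and accuracy are conventionally computed for one marker class treated as "positive" against all others. The helper name `binary_metrics` and the five-item toy labels are invented for illustration; they are not the study's data.

```python
def binary_metrics(truth, predicted, positive):
    """Return (sensitivity, specificity, accuracy) for one class vs. the rest."""
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t == positive and p == positive:
            tp += 1  # marker present and detected
        elif t == positive:
            fn += 1  # marker present but missed
        elif p == positive:
            fp += 1  # marker absent but reported
        else:
            tn += 1  # marker absent and not reported

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(truth) if truth else float("nan")
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Toy labels only, not results from the paper.
    truth = ["L", "L", "R", "R", "U"]
    predicted = ["L", "R", "R", "R", "U"]
    print(binary_metrics(truth, predicted, positive="L"))
```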

So how would the model be used in clinical practice?

It could be employed either to automatically classify a historical dataset of radiographs that don't have correct DICOM encoding of laterality or to perform quality assurance at the time that exams are acquired. In addition to preventing medical errors, the algorithm could also improve the automated selection of hanging protocols, in which prior images are retrieved based on the current images being interpreted.

"Since we can process individual images in a few seconds, these models could be deployed to process images on or even prior to archive ingestion and before interpretation by the radiologist with notifications to the technologist in cases of possible labeling error or unlabeled data," the authors wrote.
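A quality assurance check of the kind the authors describe might look something like the Python sketch below. It is only an illustration under stated assumptions: the metadata is represented as a plain dictionary keyed by the standard DICOM attribute names Image Laterality (0020,0062) and Laterality (0020,0060), while the function name `check_laterality`, the status strings, and the 0.9 confidence threshold are hypothetical and do not come from the paper.

```python
def check_laterality(metadata, predicted, confidence, threshold=0.9):
    """Compare a model prediction with the laterality encoded in DICOM metadata.

    `metadata` is a plain dict standing in for a parsed DICOM header.
    Returns one of "ok", "mismatch", "unlabeled", or "review"
    (status names are hypothetical).
    """
    encoded = metadata.get("ImageLaterality") or metadata.get("Laterality") or ""
    encoded = encoded.strip().upper()

    # Assumption: low-confidence or "unknown" predictions go to a human.
    if predicted == "U" or confidence < threshold:
        return "review"
    if not encoded:
        return "unlabeled"  # header lacks laterality; the model could fill it in
    if encoded != predicted:
        return "mismatch"  # possible labeling error; notify the technologist
    return "ok"


if __name__ == "__main__":
    print(check_laterality({"ImageLaterality": "L"}, "L", 0.99))
    print(check_laterality({"Laterality": "R"}, "L", 0.97))
    print(check_laterality({}, "R", 0.95))
```

Separating "unlabeled" from "mismatch" reflects the two use cases in the article: backfilling historical images that lack laterality encoding, and flagging possible errors at acquisition time.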

But the model's greatest impact could be in error prevention, they concluded.

"It has been well demonstrated in similar cases, such as flagging report errors or providing feedback on exam duration, that analogous continuous quality assurance feedback results in consistent error correction and lower baseline error rates," they wrote. "Immediate and consistent feedback applications hold the potential to improve awareness and baseline functioning and are essential in a field where error must be minimized."