AI can distinguish between types of spondylitis on MRI

Sep 9, 2018

An artificial intelligence (AI) algorithm can differentiate between tuberculous (TB) spondylitis and pyogenic spondylitis on MRI at a level comparable to that of experienced musculoskeletal radiologists, according to research published online September 3 in Scientific Reports.

Researchers led by Dr. Kiwook Kim and Dr. Sungwon Kim of Yonsei University College of Medicine in Seoul, South Korea, trained a deep convolutional neural network (DCNN) to provide a quantitative value for all lesions detected on a spine MRI. This "deep-learning score" was then used to differentiate between TB and pyogenic spondylitis. In testing, their algorithm yielded slightly higher diagnostic performance than pooled classifications by three musculoskeletal radiologists, though the difference was not statistically significant.

Diagnoses of infectious spondylitis have increased recently due to factors such as a growing number of older adults and immunocompromised patients, the universal use of invasive spinal procedures or surgeries, and more accurate diagnosis thanks to advances in imaging methods, according to the researchers.

It's important to differentiate between TB and pyogenic spondylitis -- the most common causes of infectious spondylitis -- because rapid anti-TB treatment can prevent future disability. However, "distinguishing between the two using MR images alone remains a challenging task, even for skilled radiologists," the authors wrote.

Postulating that a DCNN algorithm could perform well for this task, the researchers trained a DCNN-based object detection and classification model, using data augmentation techniques to increase the size of the training dataset. Training and validation were performed on a per-lesion basis using the object-detection aspect of the DCNN.

"However, the final decision for each patient was made using [a mean deep-learning] score, which is a comprehensive quantitative value of all lesions detected within all slices of a MR imaging study," they wrote.

The mean deep-learning score for TB spondylitis patients was 0.570, compared with 0.266 for pyogenic spondylitis. The difference was statistically significant (p < 0.001). A score of 0.313 was identified as the optimal cutoff threshold for differentiating between the two conditions.
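The per-patient decision rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: it assumes the patient-level score is a simple arithmetic mean of the per-lesion scores and that a score above the cutoff indicates TB (consistent with the higher mean reported for TB patients). The function names and the sample lesion scores are hypothetical.

```python
from statistics import fmean

CUTOFF = 0.313  # optimal cutoff threshold reported in the study


def patient_score(lesion_scores):
    """Average the per-lesion scores from all slices of one MRI study.

    Assumption: a plain arithmetic mean; the article does not spell out
    the exact aggregation formula.
    """
    if not lesion_scores:
        raise ValueError("no lesions detected for this patient")
    return fmean(lesion_scores)


def classify(lesion_scores, cutoff=CUTOFF):
    """Label a patient as TB or pyogenic spondylitis from lesion scores.

    Assumption: scores above the cutoff indicate TB, since TB patients
    had the higher mean score (0.570 vs. 0.266).
    """
    return "TB" if patient_score(lesion_scores) > cutoff else "pyogenic"


# Hypothetical lesion scores for two patients (not data from the study).
print(classify([0.62, 0.48, 0.55]))  # mean is well above 0.313
print(classify([0.21, 0.30]))        # mean is below 0.313
```

Averaging over every detected lesion means a single atypical slice has less influence on the final call than it would in a per-lesion decision.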

Three board-certified musculoskeletal radiologists with 10, nine, and seven years of experience in musculoskeletal imaging interpretation also independently evaluated the 161 cases used in the study. The DCNN classifier correctly classified 123 patients, while the three radiologists correctly classified 113, 112, and 114 patients, respectively. The researchers calculated the area under the receiver operating characteristic curve (AUC) to compare the performances of the DCNN classifier and the radiologists.
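For context, the counts of correctly classified patients can be converted into simple accuracy figures. The short Python sketch below does only that arithmetic on the numbers reported above; the reader labels are illustrative, and accuracy is a different measure from the AUC the researchers used for their comparison.

```python
TOTAL_CASES = 161  # cases evaluated by both the DCNN and the radiologists

# Correctly classified patients, as reported in the article.
# The "Radiologist 1..3" labels are illustrative names only.
correct = {
    "DCNN classifier": 123,
    "Radiologist 1": 113,
    "Radiologist 2": 112,
    "Radiologist 3": 114,
}

# Accuracy = correctly classified patients / total patients.
accuracy = {reader: n / TOTAL_CASES for reader, n in correct.items()}

for reader, acc in accuracy.items():
    print(f"{reader}: {acc:.1%}")
# DCNN classifier: 76.4%
# Radiologist 1: 70.2%
# Radiologist 2: 69.6%
# Radiologist 3: 70.8%
```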

Radiologists vs. DCNN algorithm for differentiating TB spondylitis and pyogenic spondylitis
	Pooled performance of 3 radiologists	DCNN classifier
Area under the curve	0.729	0.802

While the algorithm yielded a higher AUC in testing, the difference in performance was not statistically significant (p = 0.079).

"It was encouraging that the DCNN classifier had a similar AUC to that of the radiologists, considering the similarity between the MR findings of the two diseases," the authors wrote.

They noted that a larger-scale study with additional collection of multiplane MR images is needed to further validate their deep-learning scoring method.