An artificial intelligence (AI) algorithm can detect and classify hip fractures on radiographs with a high level of accuracy, even outperforming expert clinicians, according to research published online February 8 in Scientific Reports.
A team of researchers from the U.K. led by E.A. Murphy of the University of Bath developed deep-learning algorithms to locate the hip joints on radiographs and then, if there's a fracture, classify the fracture type. In retrospective testing, their method was nearly 20% more accurate in relative terms than the original clinical interpretations (92% vs. 77.5% accuracy).
"We envisage that this approach could be used clinically and aid in the diagnosis and in the treatment of patients who sustain hip fractures," the authors wrote.
Although classification of hip fracture strongly determines surgical treatment and patient outcomes, there is currently no standardized process in the U.K. as to who -- an orthopedic surgeon or a radiologist specializing in musculoskeletal disorders -- determines this classification, according to the researchers.
Speed of diagnosis is also important.
"For hip fracture management, the ability to accurately and reliably classify the fracture swiftly is paramount as surgery should occur within 48 [hours] of admission, because delays in surgery increase the risk of adverse patient outcomes such as mortality," the authors wrote.
As a result, the researchers sought to apply AI techniques for detecting and classifying hip fractures on plain radiographs acquired during routine clinical care.
To establish ground truth for the study, 3,659 hip radiographs were retrospectively and independently classified by at least two musculoskeletal experts -- a consultant orthopedic surgeon and/or a consultant musculoskeletal radiologist -- to achieve a consensus interpretation. If the two experts did not reach consensus, a third expert was asked to interpret the study. If this third expert did not agree with either of the first two, two other experts read the study to determine the consensus interpretation.
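The escalation rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `consensus_label` and the label strings are hypothetical, and because the article does not say how the final two reads are combined, the sketch assumes a strict plurality vote across all five reads.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def consensus_label(
    first: str,
    second: str,
    third: Optional[str] = None,
    extra: Sequence[str] = (),
) -> Tuple[Optional[str], int]:
    """Return (consensus label or None if unresolved, number of reads used)."""
    # Step 1: two independent reads that agree settle the case.
    if first == second:
        return first, 2
    # Step 2: a third expert settles it by siding with either reader.
    if third is None:
        return None, 2  # unresolved, a third read is still needed
    if third in (first, second):
        return third, 3
    # Step 3: two further experts read the study.
    # ASSUMPTION: the article does not describe how these reads are combined;
    # this sketch takes the most common label across all five reads and
    # requires it to beat every other label outright.
    if len(extra) < 2:
        return None, 3  # unresolved, two more reads are still needed
    votes = Counter([first, second, third, *extra[:2]])
    (top, top_count), *rest = votes.most_common()
    if rest and rest[0][1] == top_count:
        return None, 5  # still tied after five reads
    return top, 5
```

For example, `consensus_label("none", "intracapsular", "intracapsular")` resolves at the third read, while three mutually disagreeing reads escalate to the final two experts.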
Next, they trained two convolutional neural networks -- one to identify and extract the hip on the radiographs and the other to classify each hip as not fractured or, if fractured, as a trochanteric or an intracapsular fracture.
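The two-stage design can be pictured as a small pipeline in which a locator proposes hip regions and a classifier labels each cropped region. The Python sketch below shows only that control flow. Everything in it is an assumption made for illustration: the names `classify_hips`, `stub_locate`, and `stub_score`, the box format, and the stub functions that stand in for the trained networks, whose architecture and interfaces the article does not describe.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]        # grayscale pixels, row-major
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive

# The three classes named in the article.
LABELS = ("no fracture", "trochanteric", "intracapsular")


def crop(image: Image, box: Box) -> Image:
    """Cut the boxed region out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def classify_hips(
    image: Image,
    locate: Callable[[Image], Sequence[Box]],
    score: Callable[[Image], Sequence[float]],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds hip regions; stage 2 labels each cropped region."""
    results = []
    for box in locate(image):
        scores = score(crop(image, box))  # one score per entry in LABELS
        best = max(range(len(LABELS)), key=lambda i: scores[i])
        results.append((box, LABELS[best]))
    return results


# Hypothetical stand-ins for the two trained networks.
def stub_locate(image: Image) -> Sequence[Box]:
    width = len(image[0])
    height = len(image)
    return [(0, 0, height, width // 2), (0, width // 2, height, width)]


def stub_score(patch: Image) -> Sequence[float]:
    return [0.1, 0.7, 0.2]


image = [[0.0] * 8 for _ in range(4)]
print(classify_hips(image, stub_locate, stub_score))
```

Keeping the two stages behind plain callables means either one could be swapped out or evaluated separately, which mirrors the article's description of two independently trained networks.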
The researchers then applied the algorithms to a test set of cases and compared the results with the original clinical interpretations. Overall, the algorithm produced an area under the curve (AUC) of 0.98 for radiographs without a hip fracture, 0.99 for trochanteric hip fractures, and 0.97 for intracapsular fractures.
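Reporting one AUC per class suggests a one-vs-rest evaluation, in which each class in turn is treated as positive and the other two as negative. The article does not state how the AUCs were computed, so the Python sketch below is only one common way to do it: it uses the pairwise-ranking definition of AUC, and the function name `auc_one_vs_rest` and the six scores are made up for illustration.

```python
from typing import Sequence


def auc_one_vs_rest(
    scores: Sequence[float], labels: Sequence[str], positive: str
) -> float:
    """AUC as the probability that a random positive case outscores a random negative one."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for the "trochanteric" class on six made-up cases.
labels = ["trochanteric", "none", "intracapsular", "trochanteric", "none", "trochanteric"]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.3]
print(auc_one_vs_rest(scores, labels, "trochanteric"))
```

An AUC of 1.0 means every positive case is ranked above every negative one, and 0.5 is what random scoring would give, which puts the reported values of 0.97 to 0.99 close to perfect ranking.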
Impact of AI algorithm on identification and classification of hip fractures
| | Original clinical interpretation | AI algorithm |
|---|---|---|
| Accuracy | 77.5% | 92% |
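The gap between the two accuracy figures can be stated either in percentage points or relative to the clinical baseline, and the two give different numbers. The short Python calculation below works out both from the table values. The variable names are illustrative, and reading the "nearly 20%" figure as a relative gain is an interpretation consistent with the arithmetic rather than something the article states outright.

```python
clinical_accuracy = 77.5  # percent, original clinical interpretation (from the table)
ai_accuracy = 92.0        # percent, AI algorithm (from the table)

# Absolute gain: the simple difference, in percentage points.
absolute_gain = ai_accuracy - clinical_accuracy

# Relative gain: the difference as a share of the clinical baseline.
relative_gain = absolute_gain / clinical_accuracy * 100

print(f"absolute: {absolute_gain:.1f} percentage points")  # absolute: 14.5 percentage points
print(f"relative: {relative_gain:.1f}%")                   # relative: 18.7%
```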
In other results, the researchers found no significant correlation between the number of experts needed to agree on the image classification and whether the AI algorithm classified the radiograph correctly.
"This indicates that human observers and the machine learning algorithm did not find the same fractures challenging to classify," the authors wrote.
The authors noted that "... this analysis is a prototype only and a more extensive study is needed before this approach can be fully transformed to a clinical application."