AI algorithm speeds up assessment of bone age


Estimating a patient's bone age on radiographs can be burdensome for radiologists and is subject to high interreader variability. Artificial intelligence (AI) can make this task easier, however, by analyzing data from skeletal radiographs, according to research published online September 12 in the American Journal of Roentgenology.

A research team from Korea trained a deep-learning algorithm that yielded estimates of bone age that correlated significantly with a reference bone age. When used as a second reader in testing that mimicked daily clinical practice, the software also helped two radiologists read the cases an average of 29% faster.

"[Our] automatic software system showed reliably accurate bone age estimations and appeared to enhance efficiency by reducing reading times without compromising the diagnostic accuracy," wrote the group led by Dr. Jeong Rye Kim of Asan Medical Center in Seoul.

A burdensome read

Bone age estimation is a critical task for determining developmental status and predicting ultimate height in pediatric patients, particularly those with growth disorders and endocrine abnormalities. The most popular method for estimating bone age is the Greulich-Pyle method, which involves comparing the patient's left-hand wrist radiograph with standard radiographs in the Greulich-Pyle atlas.

"However, the process of bone age estimation, which comprises a simple comparison of multiple images, can be repetitive and time-consuming and is thus sometimes burdensome to radiologists," the authors wrote. "Moreover, the accuracy depends on the radiologist's experience and tends to be subjective."

Concerns over interreader variability have led to the development of a variety of automatic computerized bone age assessment methods over the past 25 years, including computer-assisted skeletal age scores, computer-aided skeletal maturation assessment systems, and the BoneXpert CAD software (Visiana), according to the researchers. Developed via traditional machine-learning techniques, BoneXpert has been shown in a number of studies to provide good performance in a variety of clinical settings and in patients with different ethnicities.

The power of deep learning

Seeking to leverage the power of deep learning to achieve even better performance, the Korean researchers trained a deep-learning algorithm to estimate bone age using 18,940 left-hand wrist radiographs from Asan Medical Center. All radiographs were labeled with a bone age estimated using the Greulich-Pyle method.

The resulting automatic software, which is activated when a radiologist begins to read a left-hand radiograph for bone age estimation, displays the three most likely estimated bone ages in order of probability percentage. The software was developed by Vuno, a Seoul-based AI software firm, and Vuno Senior Researcher Sangki Kim was senior author on the paper.

Bone age assessment provided by deep learning-based software on the wrist radiograph of a boy with a chronological age of 5 years and 1 month. The software's first-rank estimate of bone age was 5 years with a probability of 71%. The second- and third-rank bone ages were 6 years (23% probability) and 4 years and 6 months (6% probability). Image courtesy of AJR.
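To make the ranked display concrete, here is a minimal Python sketch of how three candidate bone ages could be ordered by probability. It is an illustration only, not Vuno's implementation: the function names and the dictionary layout are assumptions, and the only real numbers are the three probabilities quoted in the figure caption.

```python
def top_k_estimates(probabilities, k=3):
    """Return the k most probable (bone_age_months, probability) pairs, highest first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def format_age(months):
    """Render an age given in months as years and months."""
    years, remainder = divmod(months, 12)
    return f"{years} y {remainder} mo" if remainder else f"{years} y"


# Assumed data layout: bone age in months -> probability.
# The three values are the ones quoted in the figure caption.
caption_example = {60: 0.71, 72: 0.23, 54: 0.06}

for rank, (months, probability) in enumerate(top_k_estimates(caption_example), start=1):
    print(f"rank {rank}: {format_age(months)} ({probability:.0%})")
```

With the caption's values, the loop lists 5 years first, then 6 years, then 4 years and 6 months, matching the order described for the software's display.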

The researchers performed a retrospective study to evaluate the software program's accuracy and efficiency, as well as its feasibility for use in clinical practice. Radiographs were collected from 200 pediatric patients who received left-hand wrist radiography at their institution's children's hospital between July and November 2016 (AJR, September 12, 2017).

The patients had a mean age of 9.8 years (range, 3-17 years); patients younger than 2 were excluded from the study, as the Greulich-Pyle method is not suitable for use in that age group, according to the researchers. For the purposes of the study, the reference bone age was determined independently by two experienced pediatric radiologists; in cases of disagreement, the radiographs were evaluated until a consensus was reached. A third experienced radiologist resolved any remaining disagreements.

All 200 radiographs were read over different sessions by two radiologists: a clinical pediatric fellow and a second-year radiology resident. The clinical pediatric fellow had experience with more than 500 cases of Greulich-Pyle atlas-based bone age estimation, while the radiology resident had no clinical experience in estimating bone age. The resident did, however, complete a one-day training course directed by an experienced pediatric radiologist prior to the study.

In the first session of 100 radiographs, the reviewers independently estimated the bone age manually using the paper form of the Greulich-Pyle atlas. A week later, the radiologists again read the same radiographs -- in a different order -- with help from the software. They compared the software's three highest-probability choices with the atlas to arrive at a decision. The remaining 100 radiographs were read in a similar fashion in two later sessions.

Significant correlation

The software's highest-probability bone age estimate (i.e., the first-rank choice) had a concordance rate of 69.5% and a significant correlation (r = 0.992, p < 0.001) with the reference bone age. The second-rank bone age estimate had a 17% concordance rate with the reference bone age, while the third-rank bone age had a 7% concordance rate.

"Therefore, the results of 93% of cases listed in terms of the first-, second-, and third-rank bone ages matched the reference bone age," the group wrote.
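The top-three figure is a running total of the per-rank concordance rates. The short Python sketch below shows that arithmetic using the rates reported in this article; the function name is an assumption made for illustration. As reported here, the three rates add up to 93.5%, slightly above the 93% the authors quote, a gap that presumably reflects rounding in the published figures.

```python
def cumulative_concordance(rates_by_rank):
    """Running total of concordance rates (in percent) from rank 1 downward."""
    running_total = 0.0
    cumulative = {}
    for rank in sorted(rates_by_rank):
        running_total += rates_by_rank[rank]
        cumulative[rank] = running_total
    return cumulative


# Per-rank concordance with the reference bone age, as reported in the article.
reported_rates = {1: 69.5, 2: 17.0, 3: 7.0}

for rank, total in cumulative_concordance(reported_rates).items():
    print(f"reference matched within the top {rank}: {total:.1f}%")
```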

The researchers also noted that the software's first-rank choice had a root mean square error (RMSE) of 0.60 years compared with the reference bone age. In previous validation studies in the literature, the BoneXpert software had RMSE values of 0.61 to 0.80 years.
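For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between the estimated and reference bone ages, so larger misses are penalized more heavily than small ones. The Python sketch below shows the calculation. The four sample values are invented purely for illustration and are not data from the study.

```python
import math


def rmse(estimates, references):
    """Root mean square error between paired estimates and reference values."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up bone ages in years, NOT study data.
software_first_rank = [5.0, 9.5, 12.0, 14.0]
reference_bone_age = [5.0, 10.0, 13.0, 14.0]

print(f"RMSE: {rmse(software_first_rank, reference_bone_age):.2f} years")
```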

"Therefore, we believe that this new automatic software shows relatively accurate results, with a strong possibility that the most appropriate bone age exists among three most likely bone ages (i.e., first, second-, and third-rank bone ages)," the researchers wrote. "Because our study is preliminary and uses a deep learning-based bone age estimation program that is still under development, the accuracy including the concordance rate is expected to increase through further developments of the deep-learning technique and the addition of more data input."

Better efficiency

While the use of the software increased the concordance rate with the reference bone age for both the clinical pediatric fellow (63% without the software to 72.5% with the software) and the radiology resident (49.5% to 57.5%), these gains did not reach statistical significance. The software did yield significant improvements in efficiency for both readers, however.

Effect of bone age software on reading times

| Reader | With Greulich-Pyle paper atlas only | With assistance from bone-age assessment software | Percentage of time savings |
| --- | --- | --- | --- |
| Clinical pediatric fellow | 188 minutes, 22 seconds | 154 minutes, 31 seconds | 18% |
| Second-year radiology resident | 180 minutes, 55 seconds | 108 minutes, 33 seconds | 40% |

"When the automatic software system was implemented as the second opinion in daily clinical practice, the radiologists' reading times were reduced by 29%, on average, without compromising the accuracy of bone age estimation," the authors wrote.
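The savings percentages can be checked directly from the reported reading times. The Python sketch below does so with hypothetical helper names. How the authors arrived at 29% is not stated here, so the sketch computes both a simple mean of the two readers' percentages and a pooled figure based on total reading time. Both come out at roughly 29%.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes-and-seconds reading time to seconds."""
    return minutes * 60 + seconds


def percent_saved(before, after):
    """Percentage reduction in reading time going from before to after."""
    return 100 * (before - after) / before


# Reading times for all 200 radiographs, as reported in the article.
fellow_atlas, fellow_software = to_seconds(188, 22), to_seconds(154, 31)
resident_atlas, resident_software = to_seconds(180, 55), to_seconds(108, 33)

fellow_saving = percent_saved(fellow_atlas, fellow_software)
resident_saving = percent_saved(resident_atlas, resident_software)

# Two plausible ways to average (an assumption; the article does not say which was used).
mean_saving = (fellow_saving + resident_saving) / 2
pooled_saving = percent_saved(fellow_atlas + resident_atlas,
                              fellow_software + resident_software)

print(f"fellow: {fellow_saving:.0f}%, resident: {resident_saving:.0f}%")
print(f"mean of readers: {mean_saving:.0f}%, pooled: {pooled_saving:.0f}%")
```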

The researchers acknowledged a number of limitations to their work, including its reliance on a small sample size at a single clinical center with patients of a single ethnicity. In addition, the software showed a predicted bone age that was significantly different from the reference bone age at certain ages, notably between 12 and 15 years old.

"More data regarding these age groups should be included in the additional learning dataset to increase the accuracy of this software program," they wrote.

Further study should also include assessment of efficiency differences between a senior radiologist and a less-experienced radiologist, according to the group.
