Deep learning is on par with radiologists for ultrasound-based detection and grading of hepatic steatosis, according to Canadian research published October 3 in Radiology.
A team led by Pedro Vianna from Centre de Recherche du Centre Hospitalier de l’Université de Montréal found that deep learning applied to B-mode ultrasound could classify dichotomized steatosis grades at a rate comparable to that of human readers.
“The performance of our model suggests that deep learning may be used for opportunistic screening of steatosis with use of B-mode ultrasound across scanners from different manufacturers or even for epidemiologic studies at a populational level if deployed on large regional imaging repositories,” Vianna and co-authors wrote.
Hepatic steatosis refers to the presence of vacuoles of fat within liver cells, a histopathologic feature. As a general rule, the higher the steatosis grade, the worse the outcome.
The researchers suggested that deep-learning approaches could help overcome ultrasound’s limitations in steatosis imaging. They also noted a lack of data comparing deep learning’s performance with that of human readers on the same test dataset.
Vianna and colleagues sought to investigate the classification agreement and diagnostic performance of radiologists and a deep-learning model applied to B-mode ultrasound images for grading liver steatosis in nonalcoholic fatty liver disease, using biopsy as the reference standard. They chose the VGG16 deep-learning architecture for its “moderate” depth and the availability of weights pretrained on the ImageNet dataset, and used fivefold cross-validation for training.
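For readers curious what such a pipeline looks like in practice, below is a minimal sketch of a VGG16 backbone with ImageNet-pretrained weights, fine-tuned to dichotomize steatosis grades and evaluated with fivefold cross-validation. The image size, classification head, and training hyperparameters are assumptions for illustration, not the authors' exact configuration.

```python
# Sketch only: VGG16 backbone with ImageNet weights, binary steatosis head,
# five-fold cross-validation. Hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16


def build_model(input_shape=(224, 224, 3)):
    """VGG16 backbone (frozen here) with a small binary classification head."""
    backbone = VGG16(weights="imagenet", include_top=False, input_shape=input_shape)
    backbone.trainable = False  # fine-tune only the head in this sketch
    model = models.Sequential([
        backbone,
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),  # e.g., S0 vs. S1 or higher
    ])
    model.compile(optimizer=optimizers.Adam(1e-4),
                  loss="binary_crossentropy",
                  metrics=["AUC"])
    return model


def cross_validate(images, labels, n_splits=5):
    """Five-fold stratified cross-validation over dichotomized labels."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    fold_aucs = []
    for train_idx, val_idx in skf.split(images, labels):
        model = build_model()
        model.fit(images[train_idx], labels[train_idx],
                  validation_data=(images[val_idx], labels[val_idx]),
                  epochs=10, batch_size=16, verbose=0)
        _, auc = model.evaluate(images[val_idx], labels[val_idx], verbose=0)
        fold_aucs.append(auc)
    return float(np.mean(fold_aucs))
```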
The study included 199 patients (101 men, 98 women) with an average age of 53 years. On a test set of 52 patients, the deep-learning model had a higher area under the curve (AUC) than the radiologists for distinguishing steatosis grade S0 from grade S1 or higher, and comparable performance for higher steatosis grades.
Comparison of radiologists and deep learning with liver ultrasound (AUC)

| Steatosis grades | Radiologists | Deep learning |
| --- | --- | --- |
| S0 vs. S1 or higher | 0.49-0.84 | 0.85 |
| S0 or S1 vs. S2 or S3 | 0.57-0.76 | 0.73 |
| S2 or lower vs. S3 | 0.52-0.81 | 0.67 |
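For context on how the dichotomized AUCs in the table above are typically computed, the sketch below binarizes biopsy grades (0-3) at each cutoff and scores them against a model's predicted probability of the higher class. The grade and probability arrays are illustrative placeholders, not study data, and using one probability vector for all three cutoffs is a simplification.

```python
# Illustrative only: AUCs for the three dichotomizations reported in the study.
import numpy as np
from sklearn.metrics import roc_auc_score

biopsy_grades = np.array([0, 1, 2, 3, 0, 2, 1, 3])              # reference standard (placeholder)
predicted_prob = np.array([0.1, 0.7, 0.8, 0.9, 0.2, 0.6, 0.4, 0.95])  # model output (placeholder)

cutoffs = {
    "S0 vs. S1 or higher": 1,    # positive class: grade >= 1
    "S0 or S1 vs. S2 or S3": 2,  # positive class: grade >= 2
    "S2 or lower vs. S3": 3,     # positive class: grade == 3
}

for name, cutoff in cutoffs.items():
    labels = (biopsy_grades >= cutoff).astype(int)
    print(f"{name}: AUC = {roc_auc_score(labels, predicted_prob):.2f}")
```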
The researchers also reported the following interreader agreement values among the radiologists: 0.34 (S0 vs. S1 or higher), 0.30 (S0 or S1 vs. S2 or S3), and 0.37 (S2 or lower vs. S3).
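The article does not name the agreement statistic; agreement on categorical grades is commonly summarized with a kappa coefficient. A minimal sketch, assuming pairwise Cohen's kappa between two readers' dichotomized calls (the reader labels below are invented for illustration):

```python
# Illustrative only: interreader agreement as pairwise Cohen's kappa.
from sklearn.metrics import cohen_kappa_score

reader_a = [1, 0, 1, 1, 0, 1, 0, 0]  # e.g., 1 = S1 or higher, 0 = S0
reader_b = [1, 0, 0, 1, 0, 1, 1, 0]

print(f"Cohen's kappa: {cohen_kappa_score(reader_a, reader_b):.2f}")
```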
For S0 versus S1 or higher, the deep-learning model achieved a significantly higher AUC than 11 of 12 radiologist readings (p < 0.001). For S0 or S1 versus S2 or S3, there was no statistically significant difference, while for S2 or lower versus S3, the model's AUC was significantly higher than that of one reading (p = 0.002).
“For human visual classification, the sensitivity was the highest when distinguishing patients with steatosis from those without steatosis,” the researchers wrote. “This could be helpful in the clinical setting to screen patients with any level of fat.”
The study authors wrote that their results support the need for future multicenter studies to validate deep-learning models for liver steatosis with B-mode ultrasound imaging.
In an accompanying editorial, Theresa Tuthill, PhD, of the ultrasound research group at GE HealthCare, wrote that these results “provide a foundation” for using AI and machine-learning algorithms on B-mode liver ultrasound scans to screen for liver steatosis. However, she added that, along with regulatory approval, the technology will need to address factors tied to race, socioeconomic status, and concomitant diseases to increase its robustness and generalizability.
“Thus, additional studies mimicking real-world use are needed to fully implement AI-integrated approaches to detect early-stage liver steatosis at ultrasound,” she wrote.