A hybrid computer-aided detection (CAD) algorithm for lung nodules showed high sensitivity with very few false positives, according to new research in Medical Physics. The new technique promises to make CAD faster and more reliable for use in the burgeoning field of lung cancer screening with CT.
The technique is unique in how it integrates several existing algorithms for lung nodule detection to capture the wide variety of shapes, intensities, and locations in which nodules occur, according to the authors. In a study of 631 clinically significant lung nodules from a validated database, the CAD scheme showed sensitivity of about 85% with about three false-positive detections per scan.
"We developed a new method that has high sensitivity and a very low false-positive rate," co-author Binsheng Zhao, DSc, from Columbia University Medical Center, told AuntMinnie.com.
Big workload
Owing to the high death rate for lung cancer and the emergence of CT lung cancer screening, there is a growing need for screening programs that can manage large numbers of screen-detected lung nodules, appropriately triaging patients to either follow-up or intervention.
This clinical need dovetails with rapid automated image analysis, which researchers have pursued vigorously in recent years.
"Computer-aided detection of lung nodules on computed tomography scan images is one of the most active research areas in recent decades," wrote Zhao, along with Lin Lu, PhD; Yongqiang Tan, PhD; and Dr. Lawrence Schwartz (Medical Physics, September 2015, Vol. 42:9, pp. 5042-5054).
But the wide diversity of lung nodules, encompassing different shapes, patterns, and sizes, makes it difficult for current CAD schemes to detect and manage them competently. Also, despite significant progress in developing algorithms, there is still no widely accepted methodology or algorithm for CAD of lung nodules, particularly when dealing with large datasets, the authors wrote.
Their study therefore aimed to solve the problem of nodule diversity by using a hybrid method to detect and characterize the widest possible variety of nodules.
Different schemes for better detection
Researchers have developed many CAD schemes for use in thin-section CT scans based on different algorithms and discrimination techniques. These can be divided into three categories: intensity-based schemes, shape-based schemes, and machine-learning schemes.
Intensity-based schemes
These schemes rely on differences in CT values between nodules and the surrounding lung parenchyma. Previous research has included the development of a density maximum algorithm to detect nodules with high-density structures.
Applying dot-enhancement filters to the original CT data can boost nodule detection while suppressing anatomic structures that could confound detection. In addition, intensity thresholding combined with morphological processing can be used to generate candidate nodules for detection, according to Zhao and colleagues.
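To make the candidate-generation idea concrete, here is a minimal sketch of intensity thresholding followed by connected-component grouping, which stands in for the morphological processing step. It is not the authors' implementation: the function name, the toy 2D "slice," and the -400 HU threshold are all assumptions chosen for illustration.

```python
from collections import deque

def candidate_regions(image, threshold):
    """Group above-threshold pixels into 4-connected regions."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

# Toy slice in Hounsfield units (made-up values): air-like lung at -800
# with two small denser blobs.
toy_slice = [
    [-800, -800, -800, -800, -800],
    [-800,  -50,  -40, -800, -800],
    [-800,  -60, -800, -800,   20],
    [-800, -800, -800, -800,   30],
]
regions = candidate_regions(toy_slice, threshold=-400)  # assumed threshold
```

Each returned region is a list of pixel coordinates that a later stage could measure and classify. A real scheme would work on 3D volumes and would also have to separate nodules from vessels and the chest wall.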
Shape-based schemes
These schemes use statistical models to characterize lung nodules and search for matching object shapes in the image space, the authors explained. One study fit ellipsoid shapes to nodules using Gaussian filters.
Another group used mass-spring models, which represent the gray value range and the shape, for nodule matching. Still another method identified initial candidate nodules based on surface overlap. Convergence index filters have also been used to match round lesions, the authors wrote.
Machine-learning schemes
These methods represent each candidate nodule as a set of extracted features and use pretrained classifiers to discriminate true nodules from false positives. One study used a multithreshold method to classify nodules of different intensity levels. Another used shape and curvedness as nodule features, reducing false-positive detections with the aid of k-nearest-neighbor classifiers.
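As a rough illustration of how a k-nearest-neighbor classifier can weed out false positives, the sketch below votes among the closest labeled training candidates. The two features and every numeric value are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training samples closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (sphericity, curvedness) feature pairs; values are made up.
training = [
    ((0.90, 0.80), "nodule"),
    ((0.85, 0.75), "nodule"),
    ((0.80, 0.90), "nodule"),
    ((0.20, 0.30), "false positive"),
    ((0.30, 0.20), "false positive"),
    ((0.25, 0.35), "false positive"),
]
label = knn_predict(training, (0.82, 0.78))
```

Candidates labeled as false positives would be discarded before the results reach a reader.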
"Previous studies that [generated] candidate nodules did not differentiate between the nodules," Lu said.
Lung nodules vary widely in morphology, size, and shape, he said.
"The advantage of our algorithm is that we precategorize the nodules into different groups," added Zhao. "We group the nodules based on having similar appearances -- that's the trick of this hybrid."
An advantage of this approach is that the categorizations are "soft" rather than "hard" -- a feature designed to capture as many nodules as possible in an environment where categories are often blurred, the authors wrote.
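The article does not say how the soft categories are implemented. One plausible reading, sketched below purely as an assumption, is that a candidate near a category boundary is placed in both neighboring groups rather than forced into one. The function name, the 10-mm boundary, and the 2-mm margin are hypothetical.

```python
def soft_size_groups(diameter_mm, boundary_mm=10.0, margin_mm=2.0):
    """Return every size group a candidate may belong to.

    A hard split would use `boundary_mm` alone. Here, candidates within
    `margin_mm` of the boundary are placed in both groups.
    """
    groups = []
    if diameter_mm <= boundary_mm + margin_mm:
        groups.append("small")
    if diameter_mm >= boundary_mm - margin_mm:
        groups.append("large")
    return groups

clear_small = soft_size_groups(5.0)
borderline = soft_size_groups(9.0)
clear_large = soft_size_groups(15.0)
```

With overlapping groups, a borderline candidate gets a chance to be detected by more than one group-specific detector, which is one way to avoid missing nodules that do not fit a category cleanly.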
High sensitivity, few false positives
The CAD scheme in the current study consists of three modules. The first generates candidate nodules, and the second classifies them by location as peripheral, chest wall, or mediastinal. The third module, responsible for detection, consists of detection nodes (each with a maximum of 10 features) organized in a tree structure. The candidates are thus separated by location, size, and shape, the authors wrote.
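The routing described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the class names, the 10-mm size boundary, the feature names, and the averaging rule inside each node are all hypothetical, and the shape level of the tree is left out for brevity. Only the three location labels and the 10-feature cap come from the article.

```python
from dataclasses import dataclass

MAX_FEATURES_PER_NODE = 10  # limit stated in the article

@dataclass
class Candidate:
    location: str        # "peripheral", "chest wall", or "mediastinal"
    diameter_mm: float
    features: dict       # feature name -> value

@dataclass
class DetectionNode:
    feature_names: list  # features this node looks at
    threshold: float     # hypothetical decision threshold

    def __post_init__(self):
        if len(self.feature_names) > MAX_FEATURES_PER_NODE:
            raise ValueError("a detection node uses at most 10 features")

    def accepts(self, candidate):
        # Placeholder rule: mean of the selected features vs. a threshold.
        values = [candidate.features[name] for name in self.feature_names]
        return sum(values) / len(values) >= self.threshold

def size_group(diameter_mm, boundary_mm=10.0):
    return "small" if diameter_mm < boundary_mm else "large"

def detect(candidate, tree):
    """Walk location -> size group -> detection node."""
    node = tree[candidate.location][size_group(candidate.diameter_mm)]
    return node.accepts(candidate)

# One detection node per (location, size group) leaf.
tree = {
    location: {
        "small": DetectionNode(["sphericity", "contrast"], 0.6),
        "large": DetectionNode(["sphericity"], 0.5),
    }
    for location in ("peripheral", "chest wall", "mediastinal")
}

small_round = Candidate("peripheral", 6.0, {"sphericity": 0.8, "contrast": 0.7})
large_flat = Candidate("mediastinal", 14.0, {"sphericity": 0.3, "contrast": 0.9})
```

Calling `detect(small_round, tree)` sends the candidate to the small peripheral node, while `detect(large_flat, tree)` uses the large mediastinal node. The point of the structure is that each leaf can be tuned to one kind of nodule instead of one classifier having to handle all of them.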
The investigators tested the method on 294 CT scans obtained from the public Lung Image Database Consortium (LIDC); these were divided into a training set (196 scans) and a testing set (98 scans). Each nodule was evaluated by expert readers as a requirement for inclusion in the database.
For the training set, the sensitivity was 87%, with 2.6 false positives per scan. For the testing set, the sensitivity was 85.2%, with 3.1 false positives per scan.
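CAD performance is commonly summarized by two numbers: sensitivity (the fraction of true nodules found) and the average number of false positives per scan. The short sketch below shows the arithmetic; the counts passed in are invented for illustration and are not the study's raw data.

```python
def detection_metrics(true_positives, total_nodules, false_positives, n_scans):
    """Return (sensitivity, false positives per scan)."""
    sensitivity = true_positives / total_nodules
    fps_per_scan = false_positives / n_scans
    return sensitivity, fps_per_scan

# Hypothetical counts: 170 of 200 nodules found, 300 false marks on 100 scans.
sensitivity, fps_per_scan = detection_metrics(170, 200, 300, 100)
```

Reporting both numbers together matters because sensitivity can always be raised by accepting more false marks, and vice versa.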
As for limitations, the CAD scheme needs improvement with regard to small nodules, which are easily confused with pulmonary scars, as well as irregularly shaped nodules, the authors noted.
The hybrid method yielded high performance on the evaluation data and showed advantages over existing CAD schemes, they concluded. The method would be useful for a wide variety of CT imaging protocols, both in routine diagnosis and lung cancer screening studies.
"There is no data to compare our results with other methods, but the good thing is that we used the publicly available LIDC dataset -- a National Cancer Institute-sponsored project," Zhao said. "This dataset actually includes different levels of quality."
So the field is open for more research, because other groups can compare these results with their own methods using the same data. The paper cited prior studies from several groups, and "we outperformed the other algorithms," she said.