This study aimed to comprehensively evaluate and compare six advanced deep learning architectures for automatic segmentation of pancreatic tumors and adjacent anatomical structures in contrast-enhanced CT images. The multi-center study included 1968 CT scans acquired at 6 medical institutions, covering a wide range of imaging hardware and acquisition protocols. Six anatomical structures (pancreatic tumor, parenchyma, duct, common bile duct, venous system, and arterial system) were manually segmented by experienced radiologists. Pre-processing comprised intensity normalization, spatial resampling, and data augmentation. The evaluated architectures were U-Net, nnU-Net, Attention-based Residual U-Net, DeepLabV3+, SegNet, and Swin-Unet. Each network was trained with a combined Dice and cross-entropy loss, using the Adam optimizer with a cosine learning rate scheduler. Performance was assessed by internal validation and on an entirely independent external cohort (n = 536) using the Dice coefficient, Jaccard index, precision, recall, and 95th percentile Hausdorff distance (HD95). Models were compared using Wilcoxon signed-rank tests with Bonferroni correction for multiple comparisons. Swin-Unet outperformed all other models on both volumetric and boundary metrics, with Dice scores ranging from 0.94 to 0.96 and HD95 as low as 2.9 mm. nnU-Net ranked second and generalized well, although its accuracy was slightly lower for small or tortuous structures. Transformer-based and attention-enhanced models significantly outperformed classical CNNs, especially in segmenting ducts and vascular anatomy. External validation confirmed the robust generalization of Swin-Unet, with minimal performance degradation across institutions.
Statistical testing showed that Swin-Unet was significantly superior (p < 0.001) for all structures considered. Among the six contemporary deep learning architectures compared in this large multi-center CT study with external validation, Swin-Unet achieved the highest segmentation accuracy and generalization, particularly for vascular structures and pancreatic parenchyma, while performance on thin, low-contrast ductal structures remains an area for further improvement.
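For readers unfamiliar with the overlap metrics named above, the following is a minimal sketch of how the Dice coefficient, Jaccard index, precision, and recall can be computed for a single binary mask. It is not the study's evaluation code: the function name `overlap_metrics`, the use of NumPy, and the toy masks are assumptions made for illustration, and HD95 is omitted because it requires surface-distance computation.

```python
import numpy as np


def overlap_metrics(pred, truth):
    """Return Dice, Jaccard, precision and recall for two binary masks.

    Hypothetical helper for illustration; masks may be 2D or 3D arrays
    of the same shape, with nonzero values marking the structure.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("pred and truth must have the same shape")

    tp = int(np.logical_and(pred, truth).sum())   # true positives
    fp = int(np.logical_and(pred, ~truth).sum())  # false positives
    fn = int(np.logical_and(~pred, truth).sum())  # false negatives

    def ratio(num, den):
        # Undefined when the denominator is zero (e.g. both masks empty).
        return num / den if den > 0 else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
metrics = overlap_metrics(pred, truth)
print(metrics)
```

In a multi-structure setting such as the one described, a function like this would typically be applied once per structure label and per scan, and the per-scan values then passed to the paired statistical test.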
An et al. (Tue,) studied this question.