Abdominal organ segmentation is a rapidly advancing area of medical imaging with numerous applications in clinical and research settings. Despite these advances, most existing segmentation models are developed on single-source data, and this homogeneity raises concerns about their applicability to more diverse and complex clinical scenarios. This study aimed to develop a generalizable model for the semantic segmentation of abdominal organs using three widely recognized public datasets: BTCV, AMOS, and TotalSegmentator. Extensive cleaning and preprocessing were undertaken to address the challenges posed by data heterogeneity. Merging the three sources yielded a diverse dataset of 680 CT scans covering varied imaging conditions and anatomical presentations. The comparative analysis used two architectures: nnUNet, representing Convolutional Neural Networks, and Swin-UNETR, representing Vision Transformers. nnUNet outperformed Swin-UNETR in all experiments and showed greater robustness and adaptability under diverse conditions and on unseen cases. However, further research could help achieve more balanced performance across patient groups. With an average Dice Similarity Coefficient of 92.3%, the developed nnUNet model is a highly effective and competitive approach to abdominal organ segmentation.
Llopis et al. (Mon,) studied this question.
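For readers unfamiliar with the reported metric, the Dice Similarity Coefficient for one organ label is 2|P ∩ T| / (|P| + |T|), where P and T are the predicted and reference voxel sets for that label. The following is a minimal sketch of that computation on flattened label arrays. The function names, the toy data, and the choice to skip labels absent from both masks are illustrative assumptions; the abstract does not describe the study's exact evaluation protocol (for example, how scores are averaged across cases or organs).

```python
from typing import Dict, Iterable, Sequence


def dice_per_label(
    pred: Sequence[int], truth: Sequence[int], labels: Iterable[int]
) -> Dict[int, float]:
    """Return the Dice score for each label over flattened voxel labels.

    Dice = 2 * |P ∩ T| / (|P| + |T|) for the voxel sets P (prediction)
    and T (reference) carrying a given label.
    """
    if len(pred) != len(truth):
        raise ValueError("prediction and reference must have the same length")

    scores: Dict[int, float] = {}
    for label in labels:
        n_pred = sum(1 for v in pred if v == label)
        n_truth = sum(1 for v in truth if v == label)
        n_both = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
        if n_pred + n_truth == 0:
            # Assumption: a label absent from both masks is skipped
            # rather than scored; other conventions exist.
            continue
        scores[label] = 2.0 * n_both / (n_pred + n_truth)
    return scores


def mean_dice(scores: Dict[int, float]) -> float:
    """Unweighted mean of the per-label Dice scores."""
    if not scores:
        raise ValueError("no labels were scored")
    return sum(scores.values()) / len(scores)


# Toy example: 8 voxels, background = 0, two hypothetical organ labels 1 and 2.
pred = [0, 1, 1, 2, 2, 2, 0, 0]
truth = [0, 1, 1, 1, 2, 2, 0, 2]
scores = dice_per_label(pred, truth, labels=[1, 2])
average = mean_dice(scores)
```

In this toy case label 1 scores 0.8 and label 2 scores about 0.667, so the unweighted mean is about 0.733. A figure such as 92.3% would correspond to this kind of average taken over organs and test cases, under whatever averaging convention the study adopted.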