Motivation: Foundation models have significant potential in organ segmentation. However, they may face challenges regarding effectiveness and fairness. Goal(s): This study aims to evaluate the effectiveness and fairness of foundation models in multi-organ segmentation. Approach: Four foundation models (SAT, SAM, MedSAM, and TotalSegmentator) were evaluated. Bounding-box prompts were used with SAM and MedSAM. The Dice coefficient (DICE) was used to evaluate segmentation effectiveness. For the fairness evaluation, DICE scores were grouped by gender, age, and BMI, and ANOVA was used to test for significant differences between groups. Results: SAM and MedSAM were more effective than the other models. However, their fairness needs to be improved. Impact: This study systematically evaluates how the segmentation effectiveness of foundation models varies across organs and examines their fairness. It identifies shortcomings of current foundation models and can guide their future improvement.
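The evaluation described in the Approach can be illustrated with a minimal sketch. It is not the authors' code: the function names `dice_score` and `one_way_anova_f`, the toy masks, and the subgroup scores are all hypothetical, and only the F statistic of a one-way ANOVA is computed here (a p-value would additionally require the F distribution, for example via a statistics library).

```python
import numpy as np


def dice_score(pred, truth):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Both masks empty: counted as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over per-case scores, one list per subgroup."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_scores = np.concatenate(groups)
    grand_mean = all_scores.mean()
    k, n = len(groups), all_scores.size
    # Variation of subgroup means around the grand mean.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own subgroup mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical toy masks for a single organ in a single case.
pred_mask = np.array([[1, 1, 0], [0, 1, 0]])
true_mask = np.array([[1, 0, 0], [0, 1, 1]])
example_dice = dice_score(pred_mask, true_mask)

# Hypothetical per-case DICE scores for two subgroups (e.g. two age bands).
scores_group_a = [0.90, 0.92, 0.88]
scores_group_b = [0.80, 0.82, 0.78]
f_stat = one_way_anova_f([scores_group_a, scores_group_b])
```

A large F statistic indicates that mean DICE differs between subgroups by more than the within-group spread would explain, which is the kind of disparity the fairness analysis is meant to detect.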
Li et al. (Tue,) studied this question.