What type of study is this?

This is a Cohort Study (also classified as: Quantitative Study).

September 17, 2025

Evaluation of effectiveness and fairness of foundation models in multi-organ segmentation

Key Points

SAM and MedSAM are markedly more effective than the other foundation models evaluated, yet their fairness requires enhancement.
DICE metric was employed to evaluate segmentation effectiveness with notable outcomes for SAM and MedSAM.
ANOVA was utilized to assess fairness, revealing discrepancies across groups defined by gender, age, and BMI.
Findings highlight the need for improved fairness in foundation models, guiding future advancements in multi-organ segmentation.
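
As a minimal sketch of the DICE metric named in the key points, the snippet below computes the overlap score for two binary masks flattened to lists of 0/1 values. The function name `dice_score`, the toy masks, and the convention that two empty masks score 1.0 are assumptions made for illustration; they are not taken from the study.

```python
def dice_score(pred, truth):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Foreground voxel counts of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (hypothetical, not study data): 3 foreground voxels each, 2 shared.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 1]
print(round(dice_score(pred, truth), 4))
```

In a multi-organ setting, a score like this would typically be computed once per organ label and per subject, which yields the per-case values that a fairness analysis can then group.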

Abstract

Motivation: Foundation models have significant potential in organ segmentation. However, they face challenges regarding effectiveness and fairness. Goal(s): This study aims to evaluate the effectiveness and fairness of foundation models in multi-organ segmentation. Approach: Four foundation models (SAT, SAM, MedSAM, and TotalSegmentator) were included in this study. Bounding box prompts were used with SAM and MedSAM. DICE was used to evaluate segmentation effectiveness. For fairness evaluation, DICE scores were grouped by gender, age, and BMI, and ANOVA was used to test for significant differences between groups. Results: SAM and MedSAM were more effective than the other models. However, their fairness needs to be improved. Impact: This study systematically evaluates how the segmentation effectiveness of foundation models varies across organs, along with their fairness issues. It identifies shortcomings of current foundation models and can help guide their future improvement.
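
The fairness check described in the approach (grouping DICE scores by a demographic attribute and comparing the groups with ANOVA) can be sketched as follows. This is an illustrative one-way ANOVA F statistic in plain Python, not the authors' pipeline: the helper `one_way_anova_f`, the group labels, and the scores are all hypothetical, and the p-value lookup against the F distribution is left out.

```python
from collections import defaultdict


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of non-empty score lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups and n > k")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical (attribute value, DICE score) pairs, invented for illustration.
records = [("F", 0.90), ("F", 0.92), ("F", 0.94),
           ("M", 0.80), ("M", 0.82), ("M", 0.84)]

# Group the scores by attribute value, as the approach describes.
by_group = defaultdict(list)
for label, score in records:
    by_group[label].append(score)

f_stat, df_b, df_w = one_way_anova_f(list(by_group.values()))
print(f_stat, df_b, df_w)
```

A large F relative to the critical value for the given degrees of freedom suggests that mean DICE differs between groups, which is the kind of discrepancy the study reports as a fairness concern. The same routine would be repeated for age and BMI bins.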
