May 15, 2026 | Open Access

Medical Image Segmentation Methods: A Decision-Guided Survey Covering 2D/3D CNNs, Transformers, VLMs, SAM-Based Models and Diffusion Approaches

Abstract

Recent advances in medical image segmentation have introduced a wide spectrum of deep learning paradigms, including 2D/3D convolutional neural networks (CNNs), transformer-based architectures, vision-language models (VLMs), prompt-driven foundation models such as the Segment Anything Model (SAM), and diffusion-based approaches. Although these methods have demonstrated remarkable performance across MRI, CT, PET, ultrasound, and endoscopic imaging, the rapid proliferation of architectures has created methodological uncertainty about which model to select under varying clinical and data constraints. Existing surveys focus primarily on architectural categorization and provide limited guidance for decision-oriented model selection. This study presents a comprehensive, decision-guided survey that systematically analyzes segmentation paradigms across imaging modalities, task types, dataset characteristics, and evaluation protocols. Beyond taxonomy, we propose a practical model selection framework that links clinical scenarios, such as small lesion detection, multi-organ 3D segmentation, limited-data regimes, and domain shift, to appropriate segmentation strategies. Furthermore, robustness, generalization, annotation variability, and benchmarking reproducibility are critically examined. By integrating architectural taxonomy, cross-modal comparative analysis, and a structured decision framework, this work provides a clinically oriented roadmap for selecting segmentation methods and highlights future research directions toward reliable and reproducible medical AI systems.
