The potential of Segment Anything Model 2 (SAM2) for 3D medical image segmentation via video-stream processing is currently constrained by its reliance on manual prompts. Existing research employs auxiliary models (e.g., YOLO) as prompt generators, but these approaches face two fundamental limitations: the feature-extraction capacity of the external model becomes a bottleneck, and there is no mechanism to prevent erroneous prompts from propagating. Furthermore, current methods often struggle with interference from non-salient regions in complex 3D tumor datasets. This study aims to develop an automated, reliable prompt generation and sequence processing framework specifically for 3D medical imaging. We propose AutoPrompt-SAM3D, which features an Automatic Prompt Generator that hierarchically integrates SAM2’s tri-layer features and a supervised confidence-frame filter for reliable prompt selection. Additionally, we implement a full-sequence processing framework that progressively localizes salient regions across consecutive slices. Comprehensive experiments on four public abdominal tumor datasets show that AutoPrompt-SAM3D consistently matches or outperforms state-of-the-art prompt-based methods in 3D medical segmentation. Through hierarchical feature integration and error filtering, AutoPrompt-SAM3D eliminates the dependency on manual prompts in SAM2-based 3D segmentation. By enhancing both the reliability and efficiency of tumor localization, the framework provides a practical tool for large-scale medical image analysis and supports more consistent clinical decision-making.
Cheng et al. studied this question.