What question did this study set out to answer?

The study aimed to improve the efficiency and accuracy of medical image segmentation models, which are currently limited by high computational resource demands.

May 30, 2026

Vision mamba augmented segment anything model for medical image segmentation

Key Points

Developed VM-MedSAM using Mamba architecture
Optimized image encoder based on RVM+ and froze prompt encoder
Validated on a medical image dataset with 12 abdominal organs
VM-MedSAM slightly improved abdominal organ segmentation accuracy compared to MedSAM, with significant improvements in lung cancer and brain tumor segmentation
Achieved 65.11% reduction in parameters and 3.82 times faster training speed
Decreased model size by 85.41%
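
The percentages above can be restated as ratios, which some readers find easier to compare. The following is a minimal sketch of that arithmetic only; the function and constant names are hypothetical, and it assumes "3.82 times faster" means training throughput is 3.82 times that of MedSAM (the source does not state the baseline convention). It uses only the figures reported in this summary and does not reproduce any part of the model.

```python
# Illustrative arithmetic on the efficiency figures reported in the summary.
# All names here are hypothetical; only the three numbers come from the source.

PARAM_REDUCTION_PCT = 65.11   # reported reduction in parameter count
SIZE_REDUCTION_PCT = 85.41    # reported reduction in model size
TRAIN_SPEEDUP = 3.82          # reported speedup, assumed to be a throughput ratio


def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the baseline that remains after a percentage reduction."""
    return 1.0 - reduction_pct / 100.0


def shrink_factor(reduction_pct: float) -> float:
    """How many times smaller the result is than the baseline."""
    return 1.0 / remaining_fraction(reduction_pct)


def time_fraction(speedup: float) -> float:
    """Fraction of baseline training time needed at the given speedup."""
    return 1.0 / speedup


if __name__ == "__main__":
    print(f"Parameters remaining: {remaining_fraction(PARAM_REDUCTION_PCT):.2%} "
          f"(about {shrink_factor(PARAM_REDUCTION_PCT):.2f}x fewer)")
    print(f"Model size remaining: {remaining_fraction(SIZE_REDUCTION_PCT):.2%} "
          f"(about {shrink_factor(SIZE_REDUCTION_PCT):.2f}x smaller)")
    print(f"Training time needed: {time_fraction(TRAIN_SPEEDUP):.2%} of baseline")
```

Under these assumptions, the model keeps roughly a third of MedSAM's parameters and about a seventh of its size, and trains in roughly a quarter of the time.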

Abstract

BACKGROUND: Medical image segmentation is a crucial task for accurate diagnosis and treatment, aiding in the identification of organs and lesions. While SAM has excelled in natural image segmentation, its direct application to medical images is limited due to significant feature differences. Existing models like MedSAM, despite making progress, face challenges with high computational resource consumption and insufficient accuracy in handling detailed features.

PURPOSE: To address the limitations of high computational cost and insufficient segmentation accuracy in existing medical image segmentation models, this study proposes a novel model, VM-MedSAM, designed to be more efficient and precise.

METHODS: Inspired by the Mamba architecture, we developed VM-MedSAM. The model incorporates a vision backbone network based on RVM+, freezes the prompt encoder, and optimizes the image encoder from MedSAM. This structural adjustment significantly reduces the number of parameters and improves training efficiency. The proposed model was validated on a medical image dataset covering 12 different abdominal organs.

RESULTS: Experimental results demonstrate that VM-MedSAM achieves a slight improvement in abdominal organ segmentation accuracy compared to MedSAM, with significant improvements in lung cancer and brain tumor segmentation. Furthermore, VM-MedSAM reduced the number of parameters by 65.11%, increased training speed by 3.82 times, and decreased model size by 85.41%.

CONCLUSIONS: The VM-MedSAM model effectively addresses the challenges of high computational cost and limited accuracy in existing medical image segmentation approaches. Its improved performance and efficiency make it a promising solution for medical image segmentation.
