Deep vision foundation models such as DINOv3 offer strong visual representation capacity, but deploying them directly for medical image segmentation remains difficult because annotated clinical data are scarce and full fine-tuning is computationally expensive. This study proposes an adaptation framework, StrDiSeg, that inserts lightweight bottleneck adapters between selected transformer layers of DINOv3, enabling task-specific learning while preserving pretrained knowledge. An attention-enhanced U-Net decoder with multi-scale feature fusion further refines the representations. Experiments were performed on two publicly available ischemic stroke lesion segmentation datasets: AISD (non-contrast CT) and ISLES22 (DWI). The proposed method achieved Dice scores of 0.516 on AISD and 0.824 on ISLES22, outperforming baseline models on both imaging modalities. These results indicate that adapter-based fine-tuning is a practical and computationally efficient strategy for leveraging large pretrained vision models in medical image segmentation.
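For readers unfamiliar with the metric reported above, the following is a minimal sketch of the standard Dice coefficient for binary masks, Dice = 2|P ∩ T| / (|P| + |T|). It is not the authors' evaluation code: the function name `dice_score`, the flat-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two flat binary masks (sequences of 0/1).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels labelled as lesion in both the prediction and the reference.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total lesion voxels in each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_score(prediction, reference))
```

A score of 1.0 means the predicted and reference lesion masks overlap perfectly and 0.0 means they do not overlap at all, so the reported 0.824 on ISLES22 reflects substantially closer agreement than the 0.516 on AISD.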
Qiong Chen
Dawei Zhang
Yiqun Chen
Bioengineering
Huazhong University of Science and Technology
Universidade Estadual de Campinas (UNICAMP)
Union Hospital
www.synapsesocial.com/papers/6975b26ffeba4585c2d6de3f — DOI: https://doi.org/10.3390/bioengineering13020133