What question did this study set out to answer?

The study asks whether a novel fine-tuning framework for the Segment Anything Model (SAM) can improve segmentation accuracy for ocular structures and lesions in multi-modal ophthalmic images.

April 4, 2026 | Open Access

SAM-FGF: SAM fine grained fine-tuning for ophthalmic image segmentation

Key Points

Developed the SAM-FGF framework for fine-tuning the Segment Anything Model (SAM).
Utilized multi-modal imaging techniques, including CFP and AS-OCT.
Incorporated Fine-Grained Fine-tuning (FGF) and cross-attention mechanisms for better feature alignment.
Applied Low-Rank Adaptation (LoRA) for efficient fine-tuning.
SAM-FGF achieved superior segmentation performance compared to existing models.
Demonstrated effective segmentation across diverse datasets and imaging modalities.
Showed improved accuracy in distinguishing ocular structures and lesions.
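
To illustrate the LoRA key point, here is a minimal sketch of the general Low-Rank Adaptation idea: a frozen weight matrix W is combined with a trainable low-rank update (alpha / r) * B A. It is not the authors' implementation. The class name `LoRALinear`, the helper `matmul`, and the parameters `rank`, `alpha`, and `seed` are hypothetical names chosen for illustration, and the sketch uses plain Python lists rather than a deep learning library.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable update (alpha / rank) * B @ A.

    Illustrative only: names and initialization follow the common LoRA
    convention (A small random, B zero), not the SAM-FGF code.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, kept frozen
        # A starts as small random values, B starts at zero, so the adapted
        # layer initially behaves exactly like the frozen layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scaling = alpha / rank

    def effective_weight(self):
        """Return W + scaling * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scaling * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """Apply the adapted layer to a single input vector x of length d_in."""
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def trainable_parameters(self):
        """Count only the entries of A and B; W is not trained."""
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)


layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
print(layer.forward([1.0, 1.0]))
print(layer.trainable_parameters())
```

The saving comes from the parameter count: for a d_out x d_in layer, only rank * (d_in + d_out) values are trained. For a hypothetical 768 x 768 layer with rank 4, that is 6,144 trainable values instead of 589,824.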

Abstract

With increasing reliance on multi-structural analysis in ophthalmic diagnosis and treatment, accurate segmentation of ocular structures and lesions is essential for effective clinical decision-making. Multi-modal imaging, such as color fundus photography (CFP) and anterior segment optical coherence tomography (AS-OCT), provides complementary views of the posterior and anterior segments, enabling comprehensive disease assessment and personalized treatment planning. However, significant modality differences hinder the generalization ability of existing segmentation models. Although the Transformer-based Segment Anything Model (SAM) demonstrates strong zero-shot performance on natural images, it struggles with medical images exhibiting inter-modal variations. To address this, we propose SAM Fine-Grained Fine-tuning (SAM-FGF), a framework for multi-modal, multi-target ophthalmic image segmentation. SAM-FGF incorporates a Fine-Grained Fine-tuning (FGF) module that employs cross-attention mechanisms to dynamically align and contrast input images with multi-modal feature representations, thereby extracting modality-adaptive features. These refined features serve as inputs to the HQ-Decoder, improving segmentation accuracy across diverse medical imaging tasks. In addition, we incorporate Low-Rank Adaptation (LoRA) to enable efficient fine-tuning while preserving structural details. Experiments on multiple datasets demonstrate that SAM-FGF achieves superior segmentation performance across diverse ophthalmic imaging modalities.
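
The cross-attention step mentioned in the abstract can be sketched with standard scaled dot-product attention, where image tokens act as queries and a small set of modality feature vectors act as keys and values. This is a simplified illustration under stated assumptions, not the FGF module itself: the abstract does not specify its internals, the learned query/key/value projections are omitted, and the function names `softmax` and `cross_attention` and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: image token features, shape (n_q, d)
    keys:    modality feature vectors, shape (n_k, d)
    values:  vectors paired with the keys, shape (n_k, d_v)
    Returns (outputs, weights), where each weights row sums to 1.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy example: one image token, two hypothetical modality vectors
# (for instance, one for CFP and one for AS-OCT).
image_tokens = [[1.0, 0.0]]
modality_keys = [[1.0, 0.0], [0.0, 1.0]]
modality_values = [[10.0, 0.0], [0.0, 10.0]]
outputs, weights = cross_attention(image_tokens, modality_keys, modality_values)
print(weights[0], outputs[0])
```

In this toy case the token attends more strongly to the first modality vector because it is more similar to that key, which is the mechanism by which attention can weight features according to the input.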

Cite This Study

Liang et al. studied this question.

synapsesocial.com/papers/69d0af9a659487ece0fa59b5 https://doi.org/10.1007/s44443-026-00633-6
