What question did this study set out to answer?

The study asks whether an RWKV-based architecture with linear computational complexity can overcome the quadratic cost that limits Transformers in dense medical image segmentation.

March 10, 2026

MS-RWKV-UNet: Multi-Head Scan Receptance Weighted Key Value UNet for Medical Image Segmentation

Key Points

Proposed a multi-head scan strategy to simulate spatial continuity in 2D images.
Combined the scan strategy with padding methods to better preserve spatial continuity when 2D images are processed as sequences.
Designed asymmetric convolutions in the Feature Aggregation Attention module to aggregate features effectively.
Developed panoramic token shift to model local dependencies in a wide receptive field.
Demonstrated superior performance in dense medical image segmentation tasks on the ISIC17/18 and ACDC datasets compared to existing methods.
Achieved better efficiency with lower computational complexity across the tested datasets.
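
To make the idea of a scan pattern concrete, the sketch below flattens a small 2D grid into 1D sequences in four different orders. It is only an illustration of why a single fixed order can bias a sequence model: neighbors in the grid end up at different distances depending on the direction chosen. The function names `scan_orders` and `flatten` and the four directions are assumptions made for this example; the summary does not specify the paper's actual multi-head scan or its padding scheme.

```python
def scan_orders(height, width):
    """Return four (row, col) orderings for flattening a height x width grid.

    Hypothetical helper for illustration; not the paper's scan strategy.
    """
    row_major = [(r, c) for r in range(height) for c in range(width)]
    col_major = [(r, c) for c in range(width) for r in range(height)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


def flatten(grid, order):
    """Read grid values in the given (row, col) order to form a 1D sequence."""
    return [grid[r][c] for r, c in order]


# A 2 x 3 toy "image" whose values mark the original positions.
grid = [
    [1, 2, 3],
    [4, 5, 6],
]

orders = scan_orders(2, 3)
sequences = {name: flatten(grid, order) for name, order in orders.items()}

for name, seq in sequences.items():
    print(name, seq)
```

In the row-wise sequences, pixels 1 and 4 (vertical neighbors) are three steps apart, while in the column-wise sequences they are adjacent. Using several scan directions together is one common way to reduce this kind of directional bias.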

Abstract

The Transformer has achieved great success in the field of medical image segmentation, but its quadratic computational complexity limits its application in dense medical image prediction. Recently, the receptance weighted key value (RWKV) architecture has garnered widespread attention due to its linear computational complexity and its capability of parallel computation during training. Despite the RWKV model's proficiency in addressing long-range modeling tasks with linear computational complexity, most current RWKV-based approaches employ static scanning patterns. These patterns may inadvertently incorporate biased prior knowledge into the model's predictions. To address this challenge, we propose a multi-head scan strategy combined with padding methods to effectively simulate spatial continuity in 2D images. Within the Feature Aggregation Attention (FAA) module, asymmetric convolutions are designed to aggregate 1D sequence features along a single dimension, thereby expanding effective receptive fields while preserving structural sparsity. Additionally, panoramic token shift (P-Shift) effectively models local dependency relationships by moving tokens from a wide receptive field. Extensive experiments conducted on the ISIC17/18 and ACDC datasets demonstrate that our method exhibits superior performance in dense medical image prediction tasks.
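
The gap between quadratic and linear complexity mentioned in the abstract can be illustrated with simple arithmetic. The sketch below counts tokens for a square image and compares the number of token-pair interactions in full self-attention with the number of steps in a linear-time sequence model. The 512-pixel image size, the patch sizes, and the function names are assumptions chosen for illustration; they are not taken from the paper, and constant factors and channel widths are ignored.

```python
def token_count(image_size, patch_size):
    """Number of tokens when a square image is split into square patches."""
    side = image_size // patch_size
    return side * side


def pairwise_interactions(n_tokens):
    """Full self-attention relates every token to every token: N * N."""
    return n_tokens * n_tokens


def linear_steps(n_tokens):
    """A linear-complexity model touches each token a constant number of times: N."""
    return n_tokens


# Hypothetical setting: a 512 x 512 image at increasingly fine token granularity.
for patch in (16, 4, 1):
    n = token_count(512, patch)
    ratio = pairwise_interactions(n) // linear_steps(n)
    print(f"patch={patch:>2}  tokens={n:>6}  quadratic={pairwise_interactions(n):>14}  "
          f"linear={linear_steps(n):>6}  ratio={ratio}")
```

Because the ratio between the two costs equals the token count itself, the advantage of a linear model grows as prediction becomes denser, which is the setting the abstract targets.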

Cite This Study

synapsesocial.com/papers/69af954870916d39fea4ca7d https://doi.org/10.1051/wujns/2026311001/pdf
