Despite previous efforts to use Convolutional Neural Networks and Transformers as base networks for medical image analysis, both architectures have inherent limitations: CNNs cannot model long-range dependencies, and Transformers incur heavy computational cost from global self-attention. Recently, State Space Models (SSMs) have shown impressive capabilities in modeling long-term dependencies with linear computational complexity. Nevertheless, existing medical visual SSMs are constrained by their limited capacity to capture inter-patch relationships and by inefficient modeling caused by the additional depth convolutions they introduce to handle high-dimensional data. In this paper, we propose a novel Pure Visual State Space Model (PV-SSM) for high-dimensional medical data analysis. Unlike prior medical visual SSMs, our proposed framework does not involve any convolutional or global attention operations; instead, it leverages a series of Pure-SSM blocks that employ a novel parallel-SSM mechanism to simultaneously extract features across different dimensions. Furthermore, we propose a learnable Parameterized Positional Encoding, which incorporates absolute positional information into patch features and thereby strengthens the model's ability to reason about inter-patch relationships. We conducted extensive validation on various modalities of medical imaging data. Experimental results demonstrate the superior performance and efficacy of our model against existing models. Our code is available at https://github.com/chengwang96/PV-SSM
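To make the ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a scalar-state linear recurrence as a stand-in for an SSM, a per-patch additive offset as a stand-in for the learnable positional encoding, and a simple sum of row-wise and column-wise scans as a stand-in for the parallel-SSM mechanism. The function names `ssm_scan`, `add_positions`, and `scan_rows_and_cols` and the parameters `a`, `b`, `c` are hypothetical and do not come from the paper or its repository.

```python
def ssm_scan(xs, a, b, c):
    """Scalar linear state space recurrence (illustrative assumption).

    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t
    One pass over the sequence, so the cost is linear in its length.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # state update carries information forward
        ys.append(c * h)    # readout from the hidden state
    return ys


def add_positions(patches, pos):
    """Add one positional value per patch (hypothetical stand-in for a
    learnable absolute positional encoding; `pos` would be trained)."""
    return [p + e for p, e in zip(patches, pos)]


def scan_rows_and_cols(grid, a, b, c):
    """Scan a 2-D grid along rows and along columns, then sum the results.

    This is only one assumed way to combine scans over different
    dimensions; the paper's parallel-SSM mechanism may differ.
    """
    rows = [ssm_scan(r, a, b, c) for r in grid]
    cols_t = [ssm_scan(col, a, b, c) for col in zip(*grid)]
    cols = [list(r) for r in zip(*cols_t)]  # transpose back to row-major
    return [[rv + cv for rv, cv in zip(r, cr)] for r, cr in zip(rows, cols)]


if __name__ == "__main__":
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=1.0))
    print(scan_rows_and_cols([[1.0, 0.0], [0.0, 0.0]], 0.5, 1.0, 1.0))
```

In this sketch, an impulse at the first position decays geometrically along the scan, which is how a recurrence of this kind propagates information between distant patches without any attention matrix.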
Wang et al. (Tue,) studied this question.