Although convolutional neural networks (CNNs) have achieved remarkable success in image classification, their fixed receptive fields limit their ability to model long-range semantic dependencies. To address this, we propose the Adaptive Multi-Scale Residual Attention Network (AMSRA-Net), an architecture that integrates multi-scale local features with global self-attention. AMSRA-Net consists of four cascaded hierarchical multimodal residual attention blocks (HMRABs), each incorporating a multi-scale feature decoupler (MSFD) and a lightweight gated self-attention engine (LGSA-Engine). The MSFD uses a channel-splitting strategy to extract features at different granularities in parallel. Building on these features, the LGSA-Engine establishes long-range dependencies across spatial locations via nonlinear transformations, dynamically suppressing redundant background information while enhancing critical semantic features. The result is a mechanism that combines cross-scale feature interaction with dynamic feature calibration. Experiments on the CIFAR-10 dataset show that AMSRA-Net achieves a classification accuracy of 95.89%, surpassing baseline models such as ResNet-18 (95.55%) and Compact Convolutional Transformers (CCT, 95.04%), while maintaining lower model complexity. Ablation studies show significant accuracy drops when the gated self-attention engine is removed (down to 89.25%) or the multi-scale feature decoupler is degraded to single-scale convolution (down to 88.80%), validating the proposed dual mechanism of “feature decoupling and dynamic fusion.” This study highlights the efficacy of combining self-attention with multi-scale convolutions and offers a new paradigm for integrating CNNs with global attention mechanisms.
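The abstract does not specify the internals of the MSFD or the LGSA-Engine, so the following is only a minimal, dependency-free sketch of the two ideas it names: channel splitting with a different kernel size per group, followed by gated self-attention across spatial positions. Everything concrete here is an assumption made for illustration, not the authors' design: the 1-D signals, the mean filters standing in for learned convolutions, the kernel sizes `(1, 3, 5)`, the identity query/key/value projections, the sigmoid gate, and all function names.

```python
import math

def split_channels(x, groups):
    """Split a list of channels into `groups` contiguous, equal-sized chunks."""
    assert len(x) % groups == 0, "channel count must be divisible by groups"
    size = len(x) // groups
    return [x[i * size:(i + 1) * size] for i in range(groups)]

def box_filter(signal, k):
    """Same-length 1-D mean filter with odd kernel size k and zero padding.
    Stands in for a learned convolution (assumption for illustration)."""
    r, n = k // 2, len(signal)
    out = []
    for i in range(n):
        window = [signal[j] for j in range(i - r, i + r + 1) if 0 <= j < n]
        out.append(sum(window) / k)
    return out

def multi_scale_decouple(x, kernel_sizes=(1, 3, 5)):
    """Channel splitting: each channel group is filtered at its own scale,
    then the groups are concatenated back in the original order."""
    out = []
    for chans, k in zip(split_channels(x, len(kernel_sizes)), kernel_sizes):
        out.extend(box_filter(c, k) for c in chans)
    return out

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def gated_self_attention(x):
    """x is channels x positions. Every position attends to every other
    position (identity Q/K/V projections, an assumption); a sigmoid gate
    scales the attended features before they are added back residually."""
    C, N = len(x), len(x[0])
    tokens = [[x[c][p] for c in range(C)] for p in range(N)]  # N x C
    scale = math.sqrt(C)
    out_tokens = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        w = softmax(scores)
        attended = [sum(w[p] * tokens[p][c] for p in range(N)) for c in range(C)]
        gate = [1.0 / (1.0 + math.exp(-a)) for a in attended]
        out_tokens.append([qi + g * a for qi, g, a in zip(q, gate, attended)])
    return [[out_tokens[p][c] for p in range(N)] for c in range(C)]

# Toy input: 6 channels, 8 spatial positions.
x = [[math.sin(0.5 * p + c) for p in range(8)] for c in range(6)]
y = gated_self_attention(multi_scale_decouple(x))
```

The output `y` has the same channels-by-positions shape as the input, which is what allows blocks of this kind to be cascaded and wrapped in residual connections.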
Shangyun Jiang
Weihai Science and Technology Bureau
Ruixuan Yu
Weihai Science and Technology Bureau
International Journal of Pattern Recognition and Artificial Intelligence
Twitter (United States)
Jiang et al. studied this question.
synapsesocial.com/papers/69b5ff5c83145bc643d1bc7d — DOI: https://doi.org/10.1142/s0218001426540042