Convolutional neural networks (CNNs) excel at modeling local information thanks to their inductive bias but fall short in capturing long-range dependencies. To address this limitation, previous approaches have integrated the Transformer into CNNs in a serial fashion, which fails to continuously and simultaneously model both local and global information while ignoring the different importance of the two types of information at each stage. To simultaneously inherit the advantages of both approaches and avoid their respective limitations, we propose an innovative hybrid framework for medical image segmentation, called Dual-UNeXt, that generates both local and global features at different levels by combining CNN and Transformer in parallel in the encoder. Taking into account the varying importance of different branches at different stages, we propose a fusion module to reconcile four distinct forms of information and reduce semantic gaps. Our approach was tested on two publicly available datasets, and we achieved better results compared to other state-of-the-art techniques.
Wang et al. (Sat,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: