Introduction
Colorectal cancer (CRC) diagnosis from whole-slide histopathology images remains challenging due to pronounced tissue heterogeneity, multi-scale morphological variations, and the subtle nature of early neoplastic changes. While deep learning models have shown promise, conventional architectures struggle to simultaneously capture fine-grained texture cues and global architectural context, often overlooking diagnostically critical frequency-domain signatures.

Methods
To address these limitations, we propose CRC-Former, a novel hybrid architecture that synergistically integrates frequency-aware representation learning with efficient cross-scale sequence modeling. Specifically, CRC-Former introduces two key components: (i) a Frequency-aware Global-Local Transformer Block (FGT), which decomposes features via Haar wavelet transform and applies orientation-specific sliding-window attention in distinct subbands to enhance sensitivity to multi-directional pathological textures; and (ii) a Cross-Scale Mamba Block (CSM), which leverages selective state-space modeling to fuse hierarchical features across resolutions with linear complexity.

Results
Evaluated on the large-scale Chaoyang CRC dataset, CRC-Former achieves state-of-the-art performance, outperforming strong baselines.

Discussion
Our work demonstrates that explicit integration of signal processing priors with modern sequence modeling offers a powerful paradigm for robust, interpretable, and scalable computational pathology.
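
To make the Haar wavelet decomposition mentioned in the Methods concrete, the following is a minimal sketch of a single-level 2D Haar transform in plain Python. It is not the CRC-Former implementation: the function name `haar_dwt2`, the orthonormal 1/2 scaling, and the subband labels (LL, LH, HL, HH) are assumptions chosen for illustration, and subband naming conventions differ between libraries. The sketch only shows why separate subbands carry orientation-specific information, which is the property the FGT block is described as exploiting.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of a 2D list with even height and width.

    Hypothetical helper for illustration only. Returns four half-resolution
    subbands (ll, lh, hl, hh) as nested lists.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")

    ll, lh, hl, hh = [], [], [], []
    for i in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, cols, 2):
            # The four pixels of one non-overlapping 2x2 block.
            a, b = image[i][j], image[i][j + 1]          # top-left, top-right
            c, d = image[i + 1][j], image[i + 1][j + 1]  # bottom-left, bottom-right
            ll_row.append((a + b + c + d) / 2)  # local average: coarse structure
            lh_row.append((a + b - c - d) / 2)  # top minus bottom: horizontal edges
            hl_row.append((a - b + c - d) / 2)  # left minus right: vertical edges
            hh_row.append((a - b - c + d) / 2)  # diagonal detail
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


# Toy input (an assumption, not pathology data): horizontal stripes.
stripes = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
ll, lh, hl, hh = haar_dwt2(stripes)
print(lh)  # [[1.0, 1.0], [1.0, 1.0]]  (horizontal-edge subband responds)
print(hl)  # [[0.0, 0.0], [0.0, 0.0]]  (vertical-edge subband stays silent)
```

In this toy case the striped pattern appears only in the subband sensitive to horizontal edges, so a model that attends within each subband separately can treat differently oriented textures as distinct signals.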
Chen et al. studied this question.