ABSTRACT The Mamba model performs well in natural image processing but faces limitations when analyzing whole slide images (WSIs) for cancer prediction and subtype classification in digital pathology. Lesions in pathological images have highly irregular spatial distributions, with particularly complex associations among small lesions, and Mamba's inherent unidirectional or limited-direction scanning cannot effectively model such multi-dimensional spatial dependencies, so it fails to capture key pathological structural features. To address this, we propose HCSMIL, a Mamba-based framework tailored to the clinical analysis of pathological images. HCSMIL captures local lesion spatial topology through multi-directional contextual modeling and integrates a multi-scale pyramid structure to extract global lesion distribution features; together, these improve diagnostic accuracy. Validation on the Camelyon16, TCGA-LUNG, and TCGA-Kidney datasets shows that HCSMIL outperforms existing mainstream methods. On TCGA-LUNG, accuracy (ACC), F1 score, and AUC are 0.66%, 1.42%, and 1.25% higher than those of the second-best method; on TCGA-Kidney, these metrics are higher by 1.47%, 0.09%, and 1.00%; on Camelyon16, ACC is 0.77% higher. Notably, HCSMIL achieves an 84% small-lesion recognition rate, substantially exceeding TransMIL (70.59%) and MambaMIL (64.71%), which demonstrates its strength in capturing lesions with complex distributions and provides reliable technical support for cancer diagnosis.
Qiu et al. (Thu,) studied this question.