What question did this study set out to answer?

The study aims to develop a framework, StableMIL, that aggregates patch features from whole slide images reliably despite morphological variability across slides.

April 11, 2026

StableMIL: Entropy-Stabilized Attention-based Multiple Instance Learning for Morphologically Variable Whole Slide Images

Key Points

Proposed StableMIL framework incorporating an entropy-stabilized attention mechanism.
Utilized Randomly Projected 2D rotary position embedding for spatial representation.
Conducted theoretical and experimental analyses on nine diverse WSI datasets.
Evaluated performance in both classification and survival prediction tasks.
StableMIL consistently outperforms baseline methods, particularly in survival prediction tasks.
Demonstrated improvements across all evaluated cancer types and morphological conditions.
Achieved robust aggregation performance despite variability in patch distribution and sequence length.

Abstract

Aggregating features of tens of thousands of patches into Whole Slide Image (WSI) representations via aggregators is a crucial step in computational pathology. However, existing aggregation strategies overlook the morphological variability of tissue regions in WSIs stemming from differences in clinical procedures and tumor characteristics, leading to two critical limitations: 1) attention collapse in long sequences caused by significant variation in patch numbers across WSIs (ranging from thousands to tens of thousands per WSI); 2) attention misallocation due to under-trained positional embeddings resulting from the non-uniform spatial coordinates introduced by irregular patch distributions. Consequently, current attention-based methods struggle to generalize across this morphological variability, resulting in inconsistent aggregation performance and compromised model reliability in clinical settings. To address these issues, we propose an Entropy-Stabilized Attention-based Multiple Instance Learning (StableMIL) framework, which incorporates an entropy-stabilized attention mechanism to ensure consistent aggregation across WSIs with varying patch numbers and a Randomly Projected 2D rotary position embedding to enhance spatial representation robustness across irregular patch distributions. Extensive theoretical and experimental analyses on nine WSI datasets spanning diverse cancer types, across both classification and survival prediction tasks, demonstrate that StableMIL effectively overcomes the challenges of handling long instance sequences and out-of-distribution spatial coordinates. Our framework consistently outperforms representative baselines, particularly in survival prediction, with stable improvements observed across all evaluated cancer types and morphological scenarios, highlighting its potential for real-world clinical applications. Our source code is available at https://github.com/theeeqi/stableMIL.
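
The first limitation named in the abstract, attention becoming diffuse as the number of patches grows, can be illustrated with a toy calculation. The sketch below is not the StableMIL mechanism (the abstract does not specify it); it only assumes a plain softmax over one hypothetical "salient" patch with a fixed logit and background patches with logit zero. The function names and the logit value 5.0 are illustrative choices, not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def attention_stats(num_patches, salient_logit=5.0):
    """Return (weight on the salient patch, attention entropy).

    Assumption: one salient patch with logit `salient_logit`,
    and `num_patches - 1` background patches with logit 0.
    """
    logits = [salient_logit] + [0.0] * (num_patches - 1)
    weights = softmax(logits)
    return weights[0], entropy(weights)


if __name__ == "__main__":
    # Patch counts spanning the range mentioned in the abstract
    # (thousands to tens of thousands per WSI).
    for n in (1_000, 10_000, 50_000):
        w, h = attention_stats(n)
        print(f"patches={n:>6}  salient weight={w:.4f}  entropy={h:.3f}")
```

With the logits held fixed, the weight on the salient patch shrinks and the attention entropy rises as the patch count increases, which is the kind of length-dependent behavior an entropy-stabilizing mechanism would aim to counteract.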
