Whole Slide Image (WSI) classification often relies on weakly supervised Multiple Instance Learning (MIL) methods to handle gigapixel-resolution images. Among MIL methods, attention-based approaches have shown great potential in modern medicine for cancer diagnosis and treatment. These approaches model the interrelationships among instances to build an enhanced bag representation from instance scores, and thereby improve bag-level classification performance. However, existing attention-based MIL methods face two challenges: (1) attention-based instance scores do not accurately represent the contribution of each instance to the bag-level classification, which makes it difficult to identify the discriminative regions in WSIs; (2) whole-slide pathological image analysis frequently suffers from model overfitting and from an insufficient number of positive samples for training. To address the first challenge, we design a module that obtains accurate contribution weights for instances by introducing a Class Activation Map adapted to WSIs (WSICAM). To address the second challenge, we adopt a Cross-Slide Augmentation (CSA) module that constructs new training samples with mixed labels from discriminative instances. Our framework is composed of two WSICAM modules and one CSA module. Experimental results and visualizations demonstrate that our method achieves state-of-the-art performance in WSI classification on widely used datasets and exhibits robust tumor lesion localization.
Chen et al. (Wed,) studied this question.
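To make the ideas in the abstract concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' WSICAM or CSA implementation. It assumes a generic attention-pooling MIL head with a linear classifier, for which the bag logit decomposes exactly into per-instance terms (a CAM-style contribution), and a simple cross-bag mix that keeps the top-k contributing instances of two bags and mixes their labels by instance count. All function names, shapes, and the mixing rule are hypothetical and chosen only for illustration.

```python
import numpy as np

def attention_pool(H, V, w):
    """Generic attention MIL pooling. H: (n_instances, d) instance features."""
    scores = np.tanh(H @ V) @ w               # unnormalized attention, shape (n,)
    scores = scores - scores.max()            # stabilize the softmax
    a = np.exp(scores) / np.exp(scores).sum() # attention weights, sum to 1
    return a, a @ H                           # weights (n,), bag embedding (d,)

def instance_contributions(H, a, W_cls, cls):
    """CAM-style share of each instance in the (bias-free) logit of class `cls`."""
    return a * (H @ W_cls[cls])               # shape (n,)

def mix_bags(H1, c1, y1, H2, c2, y2, k):
    """Hypothetical cross-bag mix: keep the k highest-contribution instances
    of each bag, concatenate them, and mix labels by instance count."""
    top1 = H1[np.argsort(c1)[-k:]]
    top2 = H2[np.argsort(c2)[-k:]]
    lam = len(top1) / (len(top1) + len(top2))
    return np.vstack([top1, top2]), lam * y1 + (1 - lam) * y2

# Toy data: two bags with random features (assumed dimensions).
rng = np.random.default_rng(0)
d, n_cls = 8, 2
V, w = rng.normal(size=(d, 4)), rng.normal(size=4)
W_cls = rng.normal(size=(n_cls, d))
H1, H2 = rng.normal(size=(20, d)), rng.normal(size=(30, d))
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

a1, z1 = attention_pool(H1, V, w)
a2, z2 = attention_pool(H2, V, w)
c1 = instance_contributions(H1, a1, W_cls, cls=0)
c2 = instance_contributions(H2, a2, W_cls, cls=1)
H_mix, y_mix = mix_bags(H1, c1, y1, H2, c2, y2, k=5)
```

In this simplified setting the contributions sum to the bag logit, which is what distinguishes them from raw attention weights: an instance with high attention can still contribute little, or negatively, to a given class.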