Breast cancer remains the most commonly diagnosed malignancy among women worldwide. Histopathological image analysis is the clinical gold standard for diagnosis; however, the high resolution and complexity of these images, together with limited annotated data, pose significant challenges for traditional deep learning methods. This study aims to develop a robust classification framework capable of effectively analyzing high-resolution histopathological images.

We propose ResViT-GANNet, a novel dual-branch deep learning architecture that integrates a residual convolutional network with channel attention and a vision transformer with multi-layer token fusion. This design is specifically intended to capture both fine-grained local pathological features and long-range global semantic representations. A key novelty of our framework is the Token-Aligned Multimodal Attention (TAMA) module, which combines heterogeneous features from both branches through multi-head attention and token-wise alignment. To address limited and imbalanced data, we incorporated synthetic histopathology images generated with StyleGAN2-ADA into the training set.

Extensive experiments on the BACH and BreakHis datasets demonstrate superior performance, with statistical significance confirmed through rigorous evaluation. On the BACH dataset (4-class classification), ResViT-GANNet achieved an accuracy of 96.40%, precision of 96.34%, recall of 96.36%, and an F1-score of 96.35%. These results significantly outperformed baseline methods including TransMIL (85.83%), CTransPath (88.75%), and SwinCNN (92.89%), with p-values 1.0). Incorporating synthetic data yielded an average accuracy improvement of 3.3%. On the BreakHis dataset (8-class classification across four magnification levels), the model attained an average accuracy of 98.22%, with per-class accuracies ranging from 97.25% to 99.50%.
Grad-CAM visualizations further confirmed enhanced interpretability and highlighted critical histological features relevant for classification. ResViT-GANNet substantially improves classification performance on complex, high-resolution histopathology images. The major contributions of this work include a parallel dual-branch architecture enabling synergistic local–global feature learning, a token-aligned multimodal fusion mechanism, and the integration of generative augmentation with explainable AI. Together, these innovations enhance model generalization and robustness, underscoring the potential of ResViT-GANNet as a clinically useful decision-support system for breast cancer diagnosis.
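
The abstract reports accuracy, precision, recall, and F1-score for multi-class classification but does not say how the per-class values were averaged. As a minimal sketch, assuming macro averaging (an assumption, not confirmed by the source), these four metrics could be derived from a confusion matrix as follows. The function name `macro_metrics` and the toy matrix are hypothetical and purely illustrative; the numbers are not the paper's results.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j. Macro averaging is an assumption here; the
    abstract does not state which averaging scheme was used.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up 4-class confusion matrix (25 samples per class), for illustration only.
toy_confusion = [
    [24, 1, 0, 0],
    [1, 23, 1, 0],
    [0, 1, 24, 0],
    [0, 0, 0, 25],
]
metrics = macro_metrics(toy_confusion)
```

With balanced classes, as in this toy matrix, macro-averaged recall equals overall accuracy; on imbalanced data the two can diverge, which is one reason to report all four values.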
Zhou et al. (Mon,) studied this question.