Accurate brain tumor diagnosis depends on reliable MRI classification. Manual interpretation is time-consuming and depends on the reader’s experience, which motivates automated methods. CNNs perform well in medical imaging but struggle to capture long-range context, whereas transformers use self-attention to model global dependencies and have shown success in neuroimaging. This study compares two custom CNNs with a Swin-Tiny Transformer for classifying brain tumors from contrast-enhanced T1 (T1-CE) MRI slices, using internal and external test sets to evaluate generalization. We used the publicly available Brain Tumor MRI dataset (7,023 axial T1-CE slices; four classes: glioma, meningioma, pituitary tumor, and no tumor) as the internal dataset and the BRISC-2025 dataset (6,000 slices with the same four labels) for external validation. All images were converted to 3-channel 224 × 224 inputs, normalized, and augmented with spatial transformations; class imbalance was addressed using inverse-frequency class-weighted cross-entropy loss. A custom Pure CNN, a residual CNN (Res-CNN), and a Swin-Tiny Transformer were trained with the AdamW optimizer and evaluated using accuracy, precision, recall, F1-score, macro-AUC, and confusion matrices, with five-fold stratified cross-validation on the internal dataset and external testing on the BRISC-2025 dataset. On the internal test split, the Pure CNN achieved 58.73% accuracy (F1-score 57.11%, macro-AUC 0.9092), and the Res-CNN achieved 60.95% accuracy (F1-score 57.02%, macro-AUC 0.8988), with both models frequently misclassifying glioma and pituitary slices. In contrast, the Swin-Tiny Transformer achieved 99.01% test accuracy, an overall F1-score of 99.01%, and a macro-AUC of 0.9997, with only minimal residual errors, predominantly glioma–meningioma confusion.
Five-fold cross-validation yielded a mean Swin-Tiny accuracy of 99.33% ± 0.17% (macro-AUC 0.9997), and external evaluation on BRISC-2025 without fine-tuning yielded 97.55% accuracy and a macro-AUC of 0.9993, indicating robust generalization across datasets acquired with different imaging characteristics. The Swin-Tiny Transformer substantially outperformed the custom CNN baselines in four-class brain tumor MRI classification, delivering very high internal accuracy and strong external accuracy with well-balanced per-class metrics. These findings suggest that Swin-based architectures have considerable potential as robust decision-support tools for brain tumor imaging. However, the slice-level nature of the public datasets, the absence of patient-level stratification, and the higher computational cost of transformer models highlight the need for patient-level, multimodal, and prospective validation before routine clinical deployment.
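The inverse-frequency class-weighted cross-entropy loss mentioned above can be illustrated with a minimal sketch. It assumes the common "balanced" convention w_c = N / (K · n_c) and a weight-normalized mean over samples; the study's exact weighting formula is not stated here, and the function names and label counts below are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Return w_c = N / (K * n_c), so rarer classes get larger weights.

    Assumes every class appears at least once in `labels`.
    """
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) for c in range(num_classes)]


def weighted_cross_entropy(logits, targets, weights):
    """Class-weighted cross-entropy, averaged by the sum of sample weights."""
    loss_sum = 0.0
    weight_sum = 0.0
    for row, y in zip(logits, targets):
        # Numerically stable log-sum-exp over the class logits.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        nll = log_z - row[y]  # negative log-likelihood of the true class
        loss_sum += weights[y] * nll
        weight_sum += weights[y]
    return loss_sum / weight_sum


# Illustrative label counts only; NOT the real class distribution of either dataset.
labels = [0] * 40 + [1] * 30 + [2] * 20 + [3] * 10
weights = inverse_frequency_weights(labels, num_classes=4)

# Two samples with uniform logits: each contributes log(4) before weighting.
loss = weighted_cross_entropy([[0.0] * 4, [0.0] * 4], [0, 3], weights)
```

With these weights, a misclassified sample from the rarest class contributes four times as much to the loss as one from the most frequent class, which is the intended counterweight to class imbalance.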
Zakavi et al. (Sun,) studied this question.