What question did this study set out to answer?

To develop and evaluate a deep learning framework for automated brain stroke classification from CT images.

March 28, 2026 | Open Access

Deep ensemble learning with convolutional and transformer models for brain stroke classification

Noel Turyamuhaki (Holy Cross College), Asiyath Sameena (Central University of Kerala), Emmanuel Ahishakiye (Kyambogo University)

Key Points

Combined a convolutional neural network (CNN) and a vision transformer (ViT) for feature extraction.
Utilized a lightweight feed-forward network for feature fusion.
Evaluated the model on a publicly available brain CT dataset with a fixed train-validation-test split.
Achieved 99.77% accuracy and 99.24% precision in classification.
Demonstrated 100% recall and an F1-score of 99.62%.
Confusion matrix showed zero false negatives and only one false positive.
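
The reported metrics all follow from the four confusion-matrix counts, and the sketch below shows that arithmetic. Only the zero false negatives and one false positive come from the source. The true-positive and true-negative counts (131 and 303) are assumptions, chosen because they reproduce the rounded percentages above. Other nearby counts would round to the same figures, so they should not be read as the actual test-set composition.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 (in percent) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    values = {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
    # Express as percentages rounded to two decimals, matching the reporting style.
    return {name: round(100 * value, 2) for name, value in values.items()}


# fp=1 and fn=0 are reported in the source; tp=131 and tn=303 are ASSUMED
# illustrative counts, not figures taken from the paper.
metrics = binary_metrics(tp=131, fp=1, fn=0, tn=303)
print(metrics)
```

With these assumed counts, the function returns 99.77 for accuracy, 99.24 for precision, 100.0 for recall, and 99.62 for F1. It also makes visible that a single misclassified image accounts for the entire gap from 100%.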

Abstract

Stroke is a major cause of mortality and long-term disability worldwide, and rapid diagnosis is critical for timely treatment. Computed tomography (CT) imaging is widely used for initial stroke assessment, yet manual interpretation can be time-consuming and dependent on specialist availability. This study proposes a hybrid deep learning framework that combines a convolutional neural network (CNN) and a vision transformer (ViT) for automated stroke classification from brain CT images. The CNN captures localized spatial features while the ViT models global contextual relationships, and their feature representations are fused using a lightweight feed-forward network. The model was evaluated on a publicly available brain CT dataset using a fixed train–validation–test split. The proposed ensemble achieved an accuracy of 99.77%, precision of 99.24%, recall of 100.00%, F1-score of 99.62%, and an AUC of 0.9999. Confusion matrix analysis showed zero false negatives and one false positive in the test set. Training curves demonstrated stable convergence, and interpretability methods (LIME and occlusion sensitivity) highlighted image regions influencing predictions. Although the results indicate strong performance, the dataset represents a controlled and curated environment and does not fully capture the variability of real clinical imaging. Therefore, the reported accuracy should be interpreted as benchmark performance rather than definitive clinical diagnostic capability. Future work will involve multi-centre validation using hospital-acquired data and expert clinical evaluation. The findings suggest that CNN–ViT feature fusion is a promising approach for computer-aided stroke screening and may support radiologists in prioritizing suspicious cases after appropriate clinical validation.
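
As a rough illustration of the fusion step described in the abstract, the following sketch concatenates a CNN feature vector with a ViT feature vector and passes the result through a small feed-forward network. Everything specific here is an assumption made for illustration: the feature sizes (512 and 768), the hidden width (128), fusion by concatenation, the ReLU activation, the single sigmoid output, and the random stand-in features and untrained weights. The source states only that the two representations are fused by a lightweight feed-forward network.

```python
import numpy as np


def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


class FusionHead:
    """Hypothetical feed-forward head over concatenated CNN and ViT features."""

    def __init__(self, cnn_dim: int, vit_dim: int, hidden_dim: int, rng: np.random.Generator):
        in_dim = cnn_dim + vit_dim
        # Randomly initialised (untrained) weights; a real model would learn these.
        self.w1 = rng.normal(0.0, np.sqrt(2.0 / in_dim), size=(in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, np.sqrt(1.0 / hidden_dim), size=(hidden_dim, 1))
        self.b2 = np.zeros(1)

    def __call__(self, cnn_feats: np.ndarray, vit_feats: np.ndarray) -> np.ndarray:
        # Fuse by concatenating along the feature axis (an assumed fusion rule).
        fused = np.concatenate([cnn_feats, vit_feats], axis=1)
        hidden = relu(fused @ self.w1 + self.b1)
        # One sigmoid unit gives a per-image score between 0 and 1.
        return sigmoid(hidden @ self.w2 + self.b2).ravel()


rng = np.random.default_rng(0)
# Stand-ins for backbone outputs on a batch of 4 CT images (assumed dimensions).
cnn_feats = rng.normal(size=(4, 512))
vit_feats = rng.normal(size=(4, 768))

head = FusionHead(cnn_dim=512, vit_dim=768, hidden_dim=128, rng=rng)
probs = head(cnn_feats, vit_feats)
print(probs.shape)
```

Because the weights are untrained, the scores carry no diagnostic meaning. The sketch shows only how locally focused CNN features and globally contextual ViT features could feed one classifier.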

Cite This Study

Turyamuhaki et al. studied this question.

synapsesocial.com/papers/69c771518bbfbc51511e141b https://doi.org/10.1007/s44163-026-01077-7
