The ViT-2DCNN spatio-temporal fusion model achieved an accuracy of 97.95%, sensitivity of 98.36%, and specificity of 97.55% for classifying interictal and preictal states in epilepsy.
Does a spatio-temporal fusion model (ViT-2DCNN) improve the accuracy of epileptic seizure prediction from EEG signals?
The ViT-2DCNN model demonstrates high accuracy and robustness for epileptic seizure prediction by fusing time-frequency and spatial EEG features.
OBJECTIVE: Epilepsy is a chronic neurological disorder characterized by recurrent and sudden seizures. Accurate prediction of epileptic seizures holds significant clinical value by enabling timely medical intervention. Due to the patient-specific nature of EEG signals, existing models often exhibit limited performance or fail to reliably predict seizures for certain individuals. This study aims to develop a seizure prediction model that integrates time-frequency and spatial information to improve prediction accuracy and robustness.
APPROACH: We propose a spatio-temporal fusion model (ViT-2DCNN) for seizure prediction. An entropy distribution map based on the EEG electrode layout is introduced as a spatial-modality input, which preserves the topological relationships among electrodes and reflects regional brain complexity. This map, together with time-frequency representations derived via short-time Fourier transform, serves as dual-branch input to the model. The architecture combines a Vision Transformer (ViT) branch to capture global time-frequency dependencies and a 2DCNN branch enhanced with multi-scale spatial attention to extract local spatial-dynamic patterns. A gated fusion module interactively integrates features from both branches for final classification between interictal and preictal states.
MAIN RESULTS: Evaluated on the public CHB-MIT dataset, the proposed ViT-2DCNN model achieves an accuracy of 97.95%, sensitivity of 98.36%, specificity of 97.55%, and F1-score of 97.98%. Six subjects attained 100% accuracy, and the lowest accuracy across all subjects remained above 92.37%, demonstrating high overall performance and reliable lower-bound efficacy.
SIGNIFICANCE: By fusing complementary time-frequency and spatially structured entropy features, the model overcomes limitations of single-modality approaches and captures richer spatio-temporal characteristics of pre-seizure EEG. The results indicate strong potential for seizure prediction in clinical practice.
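To make the spatial-modality input more concrete, the following is a minimal sketch of how an entropy map laid out by electrode position could be built. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which entropy measure is used, so Shannon entropy of an amplitude histogram is assumed here, and the function names `shannon_entropy` and `entropy_map`, the bin count, the 2x2 grid, and the toy signals are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(samples, n_bins=8):
    """Shannon entropy (in bits) of a signal's amplitude histogram.

    Assumption: the study's entropy measure is not specified in the
    abstract; histogram-based Shannon entropy is used as a stand-in.
    """
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0  # a constant signal has no amplitude uncertainty
    width = (hi - lo) / n_bins
    # Assign each sample to a bin; the maximum value is clamped into the last bin.
    bins = Counter(min(int((x - lo) / width), n_bins - 1) for x in samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


def entropy_map(signals, layout, shape):
    """Place each channel's entropy at its grid cell; unused cells stay 0.0."""
    rows, cols = shape
    grid = [[0.0] * cols for _ in range(rows)]
    for name, samples in signals.items():
        r, c = layout[name]
        grid[r][c] = shannon_entropy(samples)
    return grid


# Hypothetical 2x2 electrode layout and toy signals, for illustration only.
layout = {"F3": (0, 0), "F4": (0, 1), "P3": (1, 0), "P4": (1, 1)}
signals = {
    "F3": [0.0] * 16,                         # flat signal
    "F4": [0.0, 1.0] * 8,                     # two amplitude levels
    "P3": [float(i % 4) for i in range(16)],  # four amplitude levels
    "P4": [float(i % 8) for i in range(16)],  # eight amplitude levels
}
grid = entropy_map(signals, layout, shape=(2, 2))
```

Because neighbouring electrodes occupy neighbouring grid cells, a 2D convolution over such a map can respond to spatially local patterns, which is the property the abstract attributes to the entropy distribution map.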
Yao et al. evaluated the ViT-2DCNN spatio-temporal fusion model on classification between interictal and preictal states in epilepsy, reporting an accuracy of 97.95%, sensitivity of 98.36%, and specificity of 97.55%.
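For readers less familiar with these metrics, the sketch below shows the standard definitions of accuracy, sensitivity, specificity, and F1-score for a binary classifier. It assumes the preictal state is treated as the positive class, which the source does not state explicitly. The function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary metrics from confusion-matrix counts.

    Assumption: "positive" means preictal, "negative" means interictal.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # share of preictal segments detected
    specificity = tn / (tn + fp)   # share of interictal segments kept negative
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration; not the study's confusion matrix.
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

Sensitivity and specificity are reported separately because a missed preictal segment and a false alarm carry different costs in a seizure-warning setting.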