What is the clinical evidence from this study?

Study design: Other (benchmark analysis). Population: Epileptic Seizure Recognition dataset (n=11,500 samples). Intervention: LightGBM vs. other machine learning models (Logistic Regression, Random Forest, XGBoost, CatBoost). Primary outcome: Accuracy.

April 5, 2026 · Open Access

Benchmarking Lightweight Machine Learning Models for Epileptic Seizure Recognition: Accuracy, Calibration, and Robustness Analysis

What does this research mean for the field?

Lightweight machine learning models, LightGBM in particular, achieve accurate (98.04%) and well-calibrated performance for epileptic seizure recognition on precomputed EEG features, with results competitive with deep learning models at a fraction of the computational cost. Novelty: incremental. Consensus alignment: neutral.

Key Result

LightGBM achieved 98.04% accuracy and a ROC-AUC of 0.9971 for epileptic seizure recognition on precomputed EEG features, demonstrating competitive performance with deep learning models.

Key Points

Objective: evaluate lightweight machine learning models for epileptic seizure recognition in terms of accuracy, calibration, and robustness.
The benchmark covers five lightweight models: Logistic Regression, Random Forest, XGBoost, LightGBM, and CatBoost.
Discriminative performance is measured with accuracy, macro-F1, ROC-AUC, and PR-AUC.
Confidence calibration is assessed with Brier scores and reliability diagrams.
Robustness is tested under Gaussian feature perturbations.
LightGBM achieved 98.04% accuracy and a ROC-AUC of 0.9971.
LightGBM's Brier score was 0.0166, indicating well-calibrated probabilities.
Gradient boosting methods consistently outperformed Logistic Regression, indicating that nonlinear feature interactions matter for this task.
The lightweight models performed competitively with prior deep learning approaches on the same dataset at a fraction of the computational cost.
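
The calibration measures named in the points above can be illustrated with a minimal sketch. It assumes binary 0/1 labels and predicted seizure probabilities; the function names, the five-bin layout, and the toy data are illustrative assumptions, not the authors' code.

```python
from typing import List, Tuple


def brier_score(y_true: List[int], p_pred: List[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 label."""
    if len(y_true) != len(p_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def reliability_bins(
    y_true: List[int], p_pred: List[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Per-bin (mean predicted probability, observed positive rate, count).

    Equal-width bins over [0, 1]; empty bins are skipped.
    """
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]  # [sum_p, sum_y, count]
    for y, p in zip(y_true, p_pred):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(s[0] / s[2], s[1] / s[2], s[2]) for s in sums if s[2] > 0]


# Toy data (hypothetical) with the same 20% positive rate as the dataset.
y = [0, 0, 0, 0, 1]
p = [0.1, 0.1, 0.1, 0.1, 0.9]

brier = brier_score(y, p)
baseline = brier_score(y, [0.2] * len(y))  # constant forecast at the base rate
bins = reliability_bins(y, p)
```

At a 20% positive rate, a constant forecast of 0.2 has a Brier score of 0.2 × 0.8 = 0.16, so the reported 0.0166 is roughly a tenth of that uninformed baseline. The Brier score reflects both calibration and discrimination, which is presumably why the study also inspects reliability diagrams.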

Structured PICO

Population

11,500 samples from the Epileptic Seizure Recognition dataset (derived from Bonn University EEG corpus), with 178 precomputed EEG-derived features per sample, formulated as a binary classification task (20% seizure vs 80% non-seizure).
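
Given the 20/80 class split, a short arithmetic check shows why accuracy alone is a weak yardstick here. The sketch below scores a trivial classifier that labels every sample non-seizure. The class counts are derived from the stated 20% share of 11,500 samples, and the helper name `f1` is illustrative.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


n_total = 11_500
n_seizure = n_total * 20 // 100      # 2,300 samples, derived from the 20% share
n_non_seizure = n_total - n_seizure  # 9,200 samples

# Trivial classifier: predict "non-seizure" for every sample.
accuracy = n_non_seizure / n_total

# Treating non-seizure as the positive class: every seizure is a false positive.
f1_non_seizure = f1(tp=n_non_seizure, fp=n_seizure, fn=0)
# Treating seizure as the positive class: every seizure is missed.
f1_seizure = f1(tp=0, fp=0, fn=n_seizure)

macro_f1 = (f1_non_seizure + f1_seizure) / 2

print(f"accuracy={accuracy:.4f}  macro-F1={macro_f1:.4f}")
```

The trivial classifier reaches 80% accuracy but a macro-F1 of only about 0.44, which is one reason benchmarks on imbalanced data report macro-F1 and PR-AUC alongside accuracy.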

Intervention

Lightweight machine learning models (LightGBM, Random Forest, XGBoost, CatBoost, Logistic Regression)

Comparator

Logistic regression baseline and prior deep learning approaches

Outcome

Classification performance (Accuracy, Macro-F1, ROC-AUC, PR-AUC), confidence calibration (Brier score), and robustness under Gaussian feature perturbations. Outcome type: surrogate.
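
Of the outcome metrics listed, ROC-AUC has a direct probabilistic reading: it is the chance that a randomly chosen seizure sample receives a higher score than a randomly chosen non-seizure sample, with ties counted as half. The sketch below computes it by brute-force pair counting, which is fine for illustration but quadratic in sample size. The function name and toy scores are assumptions for this example.

```python
from typing import List


def roc_auc_pairwise(y_true: List[int], scores: List[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 3 of the 4 pairs are ordered correctly.
auc = roc_auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because ROC-AUC depends only on the ranking of scores, it says nothing about whether the probabilities themselves are trustworthy, which is why calibration is reported as a separate outcome.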

Lightweight gradient boosting models such as LightGBM achieve accuracy competitive with deep learning on precomputed EEG feature datasets at lower computational cost, with well-calibrated probabilities and stable performance under the tested noise levels.

Limitations

Evaluated on a single dataset; generalization to other EEG datasets with different feature extraction pipelines should be verified.
Robustness evaluation uses synthetic Gaussian noise, which may not fully capture the structure of real-world clinical artifacts.
Does not evaluate temporal sequence models on raw EEG signals.
Perturbation analysis shows zero degradation, which may indicate that the tested noise levels were too small relative to the feature scale.
Class imbalance (20% seizure vs. 80% non-seizure) may affect the generalizability of the calibration analysis.
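
The limitation about noise levels being small relative to the feature scale suggests one possible remedy: tie the noise standard deviation to each feature's own spread. The sketch below is an assumption about how such a scale-aware perturbation could look, not the procedure used in the study. The function name `perturb`, the parameter `alpha`, and the toy matrix are hypothetical.

```python
import random
import statistics
from typing import List


def perturb(X: List[List[float]], alpha: float, seed: int = 0) -> List[List[float]]:
    """Add zero-mean Gaussian noise with sigma_j = alpha * std(feature j).

    alpha = 0.0 returns an unchanged copy; alpha = 1.0 adds noise as large
    as the feature's own standard deviation.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    n_features = len(X[0])
    stds = [statistics.pstdev([row[j] for row in X]) for j in range(n_features)]
    return [
        [x + rng.gauss(0.0, alpha * stds[j]) for j, x in enumerate(row)]
        for row in X
    ]


# Toy matrix (hypothetical) with two features on very different scales.
X = [[0.1, 100.0], [0.2, 200.0], [0.3, 300.0], [0.4, 400.0]]

clean = perturb(X, alpha=0.0)
noisy = perturb(X, alpha=0.5, seed=1)
```

Sweeping `alpha` over a range such as 0.1 to 1.0 and re-scoring a fitted model at each level would show where performance starts to degrade, which a fixed absolute noise level can miss when feature magnitudes are large.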

Abstract

Epileptic seizure recognition is a critical task in clinical decision support systems, where both accuracy and reliability of predictions directly affect patient outcomes. While deep learning architectures such as CNNs and LSTMs are widely applied to EEG-based seizure detection, many publicly available seizure datasets consist of precomputed EEG-derived features, making the problem fundamentally tabular rather than raw-signal based. In such settings, the necessity and added value of complex deep learning pipelines remain unclear, and prior studies have largely emphasized classification accuracy while giving more limited attention to calibration, robustness, and deployment efficiency. In this work, we present a systematic benchmark of lightweight machine learning models—Logistic Regression, Random Forest, XGBoost, LightGBM, and CatBoost—on the Epileptic Seizure Recognition dataset. We evaluate performance across multiple dimensions: discriminative ability (accuracy, macro-F1, ROC-AUC, PR-AUC), confidence calibration (Brier score, calibration and reliability diagrams), and robustness under Gaussian feature perturbations. Our results show that LightGBM achieves 98.04% accuracy, a ROC-AUC of 0.9971, and a Brier score of 0.0166, while maintaining stable performance under the tested noise levels. Notably, all gradient boosting methods substantially outperform Logistic Regression, indicating that nonlinear feature interactions are critical for this task. Compared with prior deep learning approaches on the same dataset, these lightweight models achieve competitive performance at a fraction of the computational cost. These findings show that tabular machine learning methods deserve serious consideration for EEG-derived feature classification tasks, particularly in resource-constrained clinical settings where efficiency, calibration, and robustness are as important as raw accuracy.
