What question did this study set out to answer?

To develop an automatic framework for detecting infantile spasms using synchronized video and EEG data.

March 12, 2026 | Open Access

MST-HGCN: A multimodal spatio-temporal hypergraph convolutional network for infantile spasms detection

Key Points

Aimed to develop an automatic framework for detecting infantile spasms from synchronized video and EEG data.
Constructed a hypergraph framework combining video and EEG data.
Divided each 5-second segment into ten 0.5-second windows, grouping the video skeleton into five anatomical limb regions and the sixteen EEG electrodes into five cortical regions.
Applied Focal Loss and dynamic-margin triplet loss to address class imbalance.
Tested the model through five-fold cross-validation with a dataset of 1,358 segments.
Achieved 99.19% accuracy, 98.02% precision, and 98.82% recall in cross-validation.
In independent testing, the model reached 89.12% accuracy and 81.82% F1-score.
Substantially reduced both missed detections and false alarms compared with single-modality baselines.
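
The Focal Loss mentioned above can be illustrated with a minimal sketch. This is the standard binary focal loss, not the authors' implementation: the function name `binary_focal_loss` and the `gamma` and `alpha` defaults are assumptions chosen for illustration, and the summary does not report the values used in the study. The dynamic-margin triplet loss is left out because its margin schedule is not described here.

```python
import numpy as np

def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Mean binary focal loss.

    probs  : predicted probability of the positive (spasm) class, shape (N,)
    labels : 0/1 ground-truth labels, shape (N,)
    gamma, alpha : illustrative defaults, NOT values reported by the study
    """
    # Clip to avoid log(0) on saturated predictions.
    probs = np.clip(np.asarray(probs, dtype=float), 1e-7, 1.0 - 1e-7)
    labels = np.asarray(labels)
    # p_t is the probability the model assigned to the true class.
    p_t = np.where(labels == 1, probs, 1.0 - probs)
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the contribution of well-classified examples.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

# A confidently correct prediction contributes far less than a confidently wrong one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check when tuning the two parameters.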

Abstract

We propose the Multimodal Spatio-Temporal Hypergraph Convolutional Network (MST-HGCN), a unified framework for automatic infantile spasm detection using synchronized video and EEG recordings. Each 5-second segment is divided into ten 0.5-second windows, within which video and EEG nodes are constructed and fused through synchronous hyperedges. The video skeleton is partitioned into five anatomical limb regions, while sixteen EEG electrodes are grouped into five cortical regions to form aggregated nodes. Temporal hyperedges link adjacent windows. To address class imbalance, the training objective combines Focal Loss with a dynamic-margin triplet loss. The dataset consists of 1,358 five-second segments from synchronized video-EEG recordings of 30 infants, enabling accurate detection of spasms and non-spasms across modalities. Under five-fold cross-validation, the fusion model with the detector enabled achieves 99.19% accuracy, 98.02% precision, 98.82% recall, and 98.39% F1-score. In the independent test, the model attains 89.12% accuracy, 75.27% precision, 89.74% recall, and 81.82% F1-score, substantially reducing both missed detections and false alarms compared with single-modality baselines.
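
The hypergraph layout described in the abstract can be sketched as an incidence matrix. The following is a rough illustration under stated assumptions, not the paper's construction: it assumes one synchronous hyperedge per window joining all five video and five EEG region nodes, and one temporal hyperedge per region node per pair of adjacent windows. The propagation step is the widely used HGNN-style normalized operator with unit hyperedge weights, which may differ from the layer used in MST-HGCN. All function names and the feature width of 8 are hypothetical.

```python
import numpy as np

N_WINDOWS = 10                      # ten 0.5-second windows per 5-second segment
N_VIDEO = 5                         # anatomical limb regions
N_EEG = 5                           # cortical regions
NODES_PER_WINDOW = N_VIDEO + N_EEG

def node_index(window, local):
    """Flat index of region node `local` within window `window`."""
    return window * NODES_PER_WINDOW + local

def build_incidence():
    """Return the node-by-hyperedge incidence matrix H."""
    edges = []
    # Assumed: one synchronous hyperedge per window, fusing video and EEG nodes.
    for w in range(N_WINDOWS):
        edges.append([node_index(w, k) for k in range(NODES_PER_WINDOW)])
    # Assumed: a temporal hyperedge joins the same region node in adjacent windows.
    for w in range(N_WINDOWS - 1):
        for k in range(NODES_PER_WINDOW):
            edges.append([node_index(w, k), node_index(w + 1, k)])
    H = np.zeros((N_WINDOWS * NODES_PER_WINDOW, len(edges)))
    for e, members in enumerate(edges):
        H[members, e] = 1.0
    return H

def hypergraph_conv(X, H, theta):
    """One propagation step: Dv^-1/2 H De^-1 H^T Dv^-1/2 X theta."""
    dv_inv_sqrt = 1.0 / np.sqrt(H.sum(axis=1))   # node degrees
    de_inv = 1.0 / H.sum(axis=0)                 # hyperedge degrees
    left = (dv_inv_sqrt[:, None] * H) * de_inv[None, :]
    A = left @ (H.T * dv_inv_sqrt[None, :])
    return A @ X @ theta

H = build_incidence()
X = np.random.default_rng(0).normal(size=(H.shape[0], 8))  # placeholder node features
out = hypergraph_conv(X, H, np.eye(8))
```

Under these assumptions each segment yields 100 nodes, 10 synchronous hyperedges, and 90 temporal hyperedges. A different temporal design, such as a single hyperedge spanning all nodes of two adjacent windows, would change only the second loop.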

Cite This Study

Wang et al. studied this question.

synapsesocial.com/papers/69b2579096eeacc4fcec64d7 https://doi.org/10.1007/s44443-026-00524-w
