February 21, 2026 | Open Access

UGR-MINDVOICE: A multimodal EEG-audio dataset for overt and covert Iberian Spanish speech production

Abstract

We present UGR-MINDVOICE, the University of Granada (UGR) multimodal electroencephalography (EEG) and audio dataset for overt and covert speech in Iberian Spanish, intended for basic neuroscience and brain-computer interface (BCI) research. The dataset features EEG and audio recordings from 15 native Spanish speakers engaged in both overt and covert speech production tasks. This dataset is unique in its inclusion of all Spanish phonemes and a diverse set of words spanning various semantic categories and different usage frequencies. Validation of the dataset confirmed the presence of robust sensory event-related potentials, including the visual P100 and the auditory N1 (N100), indicating reliable early perceptual processing and sustained participant attention to both visual and auditory stimuli. Additionally, the EEG data were classified into rest, covert speech, and overt speech conditions with an accuracy of 81.40%, demonstrating active participant engagement in the tasks. By providing synchronised EEG and audio data for overt speech, along with EEG data for the same stimuli during covert speech, UGR-MINDVOICE constitutes a valuable resource for advancing research in basic neuroscience and BCIs, particularly in the domain of silent speech communication. The full dataset is openly available on the Open Science Framework (OSF) (https://osf.io/6sh5d), and all accompanying code and analysis scripts are provided in a public GitHub repository (https://github.com/owaismujtaba/mind-voice).

• UGR-MINDVOICE is the first open-access EEG-audio dataset designed specifically for Iberian Spanish, capturing both overt and covert speech across all Spanish phonemes and a wide range of lexical items.
• The dataset supports the development of brain-computer interfaces for silent speech decoding, offering a non-invasive alternative to intracranial methods and enabling communication for individuals with severe motor impairments.
• With high-density EEG, multimodal stimuli (text, audio, and images), and carefully controlled tasks, UGR-MINDVOICE enables robust modelling of speech production and perception.
