Music therapy has emerged as a promising, yet underutilized, treatment modality in various clinical settings. This paper presents Nightingale, a novel multimodal approach that integrates audio features with electroencephalogram (EEG) signals to predict a subject's emotional response to music. Prediction accuracy is improved by coupling audio features with EEG data extracted from the Dataset for Emotion Analysis using Physiological and Audiovisual Signals (DEAP). These combined modalities are used to develop a Multilayer Perceptron and Convolutional Neural Network architecture, achieving Mean Absolute Percentage Error (MAPE) values of 19.52 for valence and 22.16 for arousal, and an R squared of 0.67 for the two domains. This approach outperforms the most recent state-of-the-art in the field. Additionally, it requires significantly fewer computational resources and uses a simple network structure, providing an evidence-based prediction technique that could make music therapy more effective and accessible.
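For readers unfamiliar with the two reported metrics, the following minimal sketch shows how MAPE and R squared are conventionally computed for a regression task. It is not the Nightingale evaluation code: the function names `mape` and `r_squared` and the sample ratings are hypothetical. It assumes MAPE is expressed in percent, and it does not reproduce how the paper combines valence and arousal into a single R squared.

```python
def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, expressed in percent (assumed convention)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the true value.
    # True values must be non-zero, which holds for ratings on a 1 to 9 scale.
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical valence ratings and predictions, for illustration only.
true_valence = [5.0, 4.0, 8.0, 2.0]
pred_valence = [4.0, 5.0, 6.0, 2.5]

print(f"MAPE: {mape(true_valence, pred_valence):.2f}")
print(f"R squared: {r_squared(true_valence, pred_valence):.2f}")
```

Lower MAPE indicates smaller relative error, while an R squared closer to 1 indicates that the predictions explain more of the variance in the ratings.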
Chen et al. (Tue,) studied this question.