We propose a Dual-Channel assisted Music Emotion Perception and Visualization (DC-MEPV) framework for ambient lighting in intelligent vehicle cockpits, addressing the growing demand for advanced human–machine interaction in the automotive industry. The framework consists of three main components: the Multi-Scale Feature Extraction Block (MSFEB), the Global Sequence Modeling Block (GSMB), and the Emotional Color Visualization Algorithm (ECV-Algo). The MSFEB extracts valence and arousal (V-A) features from dual channels at multiple temporal scales, with each channel using a hybrid neural network architecture to capture multi-scale emotional representations. The GSMB combines positional encoding, bidirectional long short-term memory (BiLSTM) networks, and multi-head self-attention to model global emotional sequences dynamically. The ECV-Algo applies personalized emotion–color association rules to a continuous mapping from emotion space to color space, producing expressive emotion-driven lighting visualizations. We conducted comparison and ablation experiments to evaluate the model’s emotion perception performance, and designed three metrics to evaluate the quality of the generated visualizations. On the DEAM and PMEmo datasets, DC-MEPV outperformed the other networks in both the comparison and ablation experiments, and the generated lighting performed strongly in terms of CIEDE2000 variation rate, unique color ratio, and joint histogram entropy.
Shen et al. studied this question.
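To make the idea of a continuous mapping from emotion space to color space concrete, the following is a minimal Python sketch. It is not the ECV-Algo itself, whose personalized association rules are not specified here. The function names `va_to_rgb` and `unique_color_ratio`, the assumed V-A range of [-1, 1], and the rule itself (angle in the V-A plane sets hue, distance from the origin sets saturation, arousal sets brightness) are illustrative assumptions. The ratio helper is one plausible reading of a unique-color metric, not necessarily the definition used in the evaluation.

```python
import colorsys
import math


def va_to_rgb(valence: float, arousal: float) -> tuple[int, int, int]:
    """Map a valence-arousal point to an 8-bit RGB color.

    Hypothetical rule (an assumption, not the paper's ECV-Algo):
    inputs are assumed to lie in [-1, 1]; the angle in the V-A plane
    sets the hue, the distance from the origin sets the saturation,
    and arousal sets the brightness.
    """
    # Clamp inputs so out-of-range predictions still yield valid colors.
    v = max(-1.0, min(1.0, valence))
    a = max(-1.0, min(1.0, arousal))

    # Angle in the V-A plane, wrapped into [0, 1) for use as a hue.
    hue = (math.atan2(a, v) / (2.0 * math.pi)) % 1.0
    # Emotional intensity: neutral points near the origin look grey.
    saturation = min(1.0, math.hypot(v, a))
    # Brightness rises with arousal and stays within [0.5, 1.0].
    brightness = 0.5 + 0.25 * (a + 1.0)

    r, g, b = colorsys.hsv_to_rgb(hue, saturation, brightness)
    return (round(r * 255), round(g * 255), round(b * 255))


def unique_color_ratio(colors: list[tuple[int, int, int]]) -> float:
    """Fraction of distinct colors in a sequence (assumed definition)."""
    if not colors:
        return 0.0
    return len(set(colors)) / len(colors)
```

Because the mapping is continuous in both inputs (apart from the hue being undefined at the exact origin, where saturation is zero anyway), small changes in predicted emotion produce small changes in color. That property is what a smooth lighting transition would need under these assumptions.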