Music Emotion Recognition (MER) is a computational field at the intersection of affective computing and audio signal processing. Earlier work applied traditional machine learning algorithms, such as Support Vector Machines and k-Nearest Neighbors, to emotion classification in music, but these methods struggle with the intricate temporal and spectral structure of audio signals, which limits classification performance. To address this, this work proposes a new model, the Lion-Optimized CNN-BiGRU for Emotion Recognition (LOCBER), which combines Convolutional Neural Networks (CNNs) and Bidirectional Gated Recurrent Units (BiGRUs) for effective feature extraction and temporal sequence modeling. LOCBER is further tuned with the Lion Swarm Optimization (LSO) algorithm, which adjusts model parameters to minimize loss and improve accuracy and convergence. The resulting CNN–BiGRU model achieves 97.7% classification accuracy, outperforming conventional methods by 15.3%, and LSO-based optimization yields better convergence and stability than traditional training procedures. LOCBER can benefit emotion-sensitive music recommendation systems, intelligent user interfaces, and music therapy applications. The experiments demonstrate that fusing state-of-the-art neural architectures with bio-inspired optimization can substantially improve the performance and validity of MER systems for real-world use.
Mingming Wu (Fri,) studied this question.
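
The abstract describes tuning model parameters with a swarm-based optimizer so as to minimize loss. As a rough illustration of that idea only, the sketch below runs a simplified leader-following population search over two hypothetical hyperparameters. It does not reproduce the LSO update rules or the LOCBER training setup: the function names, the choice of hyperparameters (log10 learning rate and dropout), and the quadratic `toy_validation_loss` are all assumptions standing in for an actual CNN-BiGRU validation loss.

```python
import random


def swarm_minimize(loss, bounds, n_agents=20, n_iters=100, seed=0):
    """Simplified leader-following swarm search (illustrative; NOT the LSO rules)."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) search interval.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Random initial population inside the search box.
    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    scores = [loss(a) for a in agents]
    best_i = min(range(n_agents), key=scores.__getitem__)
    best, best_score = list(agents[best_i]), scores[best_i]

    for t in range(n_iters):
        # Step size shrinks over time: explore early, refine late.
        step = 1.0 - t / n_iters
        for i in range(n_agents):
            # Move to the midpoint between the agent and the leader, plus noise.
            cand = clip([
                (x + b) / 2 + rng.gauss(0, 1) * step * 0.1 * (hi - lo)
                for x, b, (lo, hi) in zip(agents[i], best, bounds)
            ])
            s = loss(cand)
            if s < scores[i]:  # greedy: keep only improvements
                agents[i], scores[i] = cand, s
                if s < best_score:
                    best, best_score = list(cand), s
    return best, best_score


def toy_validation_loss(params):
    # Hypothetical stand-in for a model's validation loss; minimum at (-3.0, 0.3).
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


# Assumed search ranges: log10(learning rate) in [-5, -1], dropout in [0, 0.8].
search_bounds = [(-5.0, -1.0), (0.0, 0.8)]
best, score = swarm_minimize(toy_validation_loss, search_bounds)
print(best, score)
```

In a real setting each call to the loss would train and evaluate the network, so the population size and iteration count would have to be far smaller than the defaults used here.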