What question did this study set out to answer?

This research aims to develop a machine learning framework for accurately classifying emotions in music for therapeutic use.

April 28, 2026Open Access

Emotion Classification in Music: Leveraging Machine Learning for Music Therapy and Emotional Response Analysis

Key Points

This research aims to develop a machine learning framework for accurately classifying emotions in music for therapeutic use.
Leveraged the Emotify dataset for training and testing the model.
Implemented six experimental modeling phases with Random Forest, MLP, and XGBoost alongside ensemble methods.
Final model utilized a stacking ensemble approach with a logistic regression meta-classifier.
Achieved a subset accuracy of 0.41, Hamming loss of 0.25, and a macro-averaged F1 score of 0.67.
The proposed model outperformed previous multi-label methods based on the Emotify dataset.
Highlighted the effectiveness of ensemble strategies and multi-modal feature fusion in emotion classification.

Abstract

Emotion classification in music is an evolving domain within affective computing that holds significant promise for applications in music therapy, personalized media experiences, and adaptive systems. This study proposes a comprehensive machine learning framework for multi-label emotion recognition in music, leveraging the publicly available Emotify dataset. The framework incorporates acoustic features, listener metadata (e.g., mood, age, gender), and genre classification to enhance predictive accuracy. We conducted six experimental modeling phases using Random Forest, Multi-Layer Perceptron (MLP), XGBoost, and their ensemble variants. The final model—a stacking ensemble combining the three base learners with a logistic regression meta-classifier—achieved superior results with a subset accuracy of 0.41, Hamming loss of 0.25, and macro-averaged F1 score of 0.67. Comparative analysis with recent studies indicates that our approach achieves higher macro-F1 performance than prior clip-level, multi-label methods evaluated on the Emotify dataset under comparable settings. These results highlight the critical role of ensemble strategies and multi-modal feature fusion in modeling complex emotional landscapes. The proposed model not only advances the state-of-the-art but also supports the development of emotionally adaptive systems and highlights promising potential for future music-therapy applications, subject to further usability and clinical-effectiveness evaluation.

Ask AI

Helpful

Bookmark

View Full Paper