August 5, 2025

Implementation of Music Emotion Classification using Deep Learning

Key Points

Main finding: The study introduced a deep learning system that classifies music into three emotional categories: Angry, Happy, and Sad.
Key evidence: Of the three models evaluated (CNN, CNN-LSTM, and CNN-GRU), CNN-GRU achieved the highest weighted accuracy, 99.10%.
Approach: The authors collected 22 audio files from YouTube, labelled them manually, segmented them into 30-second clips, augmented the clips with pitch shifting and time stretching, and extracted MFCC and spectral contrast features.
Significance: More accurate music emotion classification could support applications in therapy, recommendation systems, and affective computing.

Abstract

Music plays a crucial role in shaping emotions and experiences, making its classification an important area of research with applications in therapy, recommendation systems, and affective computing. This study develops a deep learning-based system to classify music into three emotional categories: "Angry," "Happy," and "Sad." The dataset, consisting of 22 audio files collected from YouTube, was manually labelled, segmented into 30-second clips, and augmented using pitch shifting and time stretching to enhance diversity. Features were extracted using Mel-Frequency Cepstral Coefficients (MFCC) and spectral contrast to analyse the harmonic and timbral characteristics of the audio. Three deep learning models, CNN, CNN-LSTM, and CNN-GRU, were evaluated. CNN-GRU achieved the highest weighted accuracy of 99.10%, demonstrating superior performance. Future work includes adding more emotion categories, diversifying the dataset, exploring advanced architectures like transformers, optimising hyperparameters, implementing real-time applications, and conducting user studies to assess effectiveness. This research successfully developed and evaluated a music emotion classification system, contributing to advancements in the field.
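
The segmentation step in the abstract (cutting each recording into 30-second clips) can be illustrated with a minimal sketch. This is not the authors' code: the function names `clip_bounds` and `segment`, the 22,050 Hz sample rate in the usage note, and the choice to drop a trailing partial clip are all assumptions made for illustration, since the abstract does not state how these details were handled. Augmentation and feature extraction are left out because they depend on an audio library the abstract does not name.

```python
def clip_bounds(n_samples, sample_rate, clip_seconds=30.0, keep_partial=False):
    """Return (start, end) sample-index pairs for fixed-length clips.

    Assumption: a trailing clip shorter than clip_seconds is dropped
    unless keep_partial is True. The abstract does not say which
    policy the authors used.
    """
    if sample_rate <= 0 or clip_seconds <= 0:
        raise ValueError("sample_rate and clip_seconds must be positive")
    clip_len = int(round(clip_seconds * sample_rate))
    bounds = []
    for start in range(0, n_samples, clip_len):
        end = start + clip_len
        if end > n_samples:
            if not keep_partial:
                break
            end = n_samples
        bounds.append((start, end))
    return bounds


def segment(samples, sample_rate, clip_seconds=30.0):
    """Slice a sequence of audio samples into full-length clips."""
    return [samples[s:e]
            for s, e in clip_bounds(len(samples), sample_rate, clip_seconds)]
```

As a usage example, a 100-second mono recording sampled at an assumed 22,050 Hz has 2,205,000 samples, so `clip_bounds(2_205_000, 22_050)` yields three full clips of 661,500 samples each and discards the final 10 seconds.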
