May 29, 2026 · Open Access

Deep Neural Network-Based Music Emotion Recognition and Generation System

Key Points

This study aims to enhance music emotion recognition and generation by addressing limitations of existing methods.
Developed the EMOGEN system using a bidirectional Long Short-Term Memory (Bi-LSTM) network with an attention mechanism.
Incorporated emotion conditioning in generative models for coherent music creation.
Tested the system against baseline models for accuracy and emotional coherence.
Achieved 94.9% accuracy in emotion recognition.
Generated music with 97.6% emotional coherence, more consistent with target emotions than baseline models.
Demonstrated 90% genre adaptability and 88.1% cultural sensitivity, with a 7.2% error rate.
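
The attention mechanism named in the key points can be illustrated with a small sketch. This is not the authors' implementation: it assumes the Bi-LSTM has already produced one hidden-state vector per audio frame, and it uses a generic additive attention form. The function names (`softmax`, `attention_pool`) and the toy `weight`, `bias`, and `query` values are hypothetical, chosen only to show how attention weights emphasize some frames over others.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, weight, bias, query):
    """Additive attention over a sequence of hidden states.

    score_t = query . tanh(weight @ h_t + bias)
    The context vector is the attention-weighted sum of the hidden states.
    """
    scores = []
    for h in hidden_states:
        projected = [
            math.tanh(sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(weight, bias)
        ]
        scores.append(sum(q * p for q, p in zip(query, projected)))
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(a * h[i] for a, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return context, weights


# Toy stand-in for Bi-LSTM outputs: 3 frames, 4 features each (assumed values).
hidden_states = [
    [0.1, -0.2, 0.0, 0.3],
    [0.9, 0.8, -0.7, 0.6],
    [0.0, 0.1, 0.1, -0.1],
]
weight = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # hypothetical parameters
bias = [0.0, 0.0]
query = [1.0, 1.0]

context, weights = attention_pool(hidden_states, weight, bias, query)
```

In a trained model the projection and query would be learned, and the context vector would feed a classifier over emotion labels. Here the second frame receives the largest weight simply because its first two features are the largest.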

Abstract

Music Emotion Recognition (MER) and generation are rapidly evolving areas in affective computing, enabling systems to understand and produce music aligned with human emotions. However, existing methods often struggle to capture long-term temporal dependencies and context-specific emotional nuances, especially when processing complex audio and lyrical content. They also often fail to generalize across diverse musical genres and emotional spectrums. To address these limitations, we propose EMOGEN: Emotion-aware Music recognition and Generation using Enhanced Neural networks. EMOGEN employs a bidirectional Long Short-Term Memory (Bi-LSTM) network combined with an attention mechanism to extract and emphasize key emotional features from musical audio signals. The framework also incorporates emotion conditioning for generative models, enabling the creation of emotionally coherent music. The proposed system is designed for use in emotion-aware music recommendation platforms and therapeutic music generation, adapting music outputs based on user moods or targeted emotional outcomes. Experimental results demonstrate that EMOGEN significantly improves emotion recognition accuracy and generates music that is more consistent with target emotions than baseline models. This highlights its potential to enhance user experience and emotional well-being through adaptive, intelligent music systems. EMOGEN achieves 94.9% accuracy, 97.6% emotional coherence, 90% genre adaptability, and 88.1% cultural sensitivity, together with the lowest error rate (7.2%) and the fastest processing time (4.0 s), demonstrating exceptional robustness and efficiency.
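
The abstract does not say how emotion conditioning is applied to the generative model. One common approach, shown here purely as an assumption, is to append a one-hot emotion code to every input frame so the generator sees the target emotion at each step. The label set `EMOTIONS` and the helper names `one_hot` and `condition_sequence` are hypothetical and do not come from the paper.

```python
# Hypothetical label set; the paper's actual emotion categories are not given here.
EMOTIONS = ["happy", "sad", "angry", "calm"]


def one_hot(emotion, labels=EMOTIONS):
    """Encode an emotion label as a one-hot vector over the label set."""
    if emotion not in labels:
        raise ValueError(f"unknown emotion: {emotion!r}")
    return [1.0 if label == emotion else 0.0 for label in labels]


def condition_sequence(frames, emotion, labels=EMOTIONS):
    """Append the emotion code to every frame of the input sequence."""
    code = one_hot(emotion, labels)
    return [list(frame) + code for frame in frames]


# Two toy feature frames (assumed values), conditioned on the target emotion "sad".
frames = [[0.2, 0.5], [0.1, 0.4]]
conditioned = condition_sequence(frames, "sad")
```

Each conditioned frame now has the original features followed by the emotion code, so a downstream generator can be trained to produce output that matches the requested emotion. Other conditioning schemes, such as learned emotion embeddings, would fit the same description.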
