What question did this study set out to answer?

The aim is to improve emotion recognition and generation in cello performances using generative AI techniques.

February 28, 2026

Emotion Recognition of Cello Performance Based on Generative Artificial Intelligence

Key Points

The aim is to improve emotion recognition and generation in cello performances using generative AI techniques.
Developed an emotion recognition and generation model using WavLM and performance dynamic features.
Implemented a three-stage generation module based on diffusion transformers.
Utilized joint training with a discriminator to improve accuracy and quality.
Achieved a 5.7% improvement in accuracy on the DEAM dataset.
Generation quality exceeded existing models on MOS (4.12), PESQ (3.48), and FAD (2.74).
Emotional expression accuracy reached 81.6%.
The discriminator module achieved a Macro-F1 of 0.762 and a mean squared error (MSE) of 0.0207.
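
The three generation metrics above do not all point the same way: MOS and PESQ improve as they rise, whereas FAD is a distance, so a lower value is better. As a reference, and assuming the study uses the standard Fréchet Audio Distance definition (the summary does not say which embedding model was used), FAD compares Gaussian fits to embeddings of reference audio (mean \mu_r, covariance \Sigma_r) and generated audio (\mu_g, \Sigma_g):

```latex
\mathrm{FAD} = \lVert \mu_r - \mu_g \rVert^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

The score is zero when the two embedding distributions share the same mean and covariance.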

Abstract

This study proposes an emotion recognition and generation model for cello performance, aimed at intelligent music education and emotional interaction. The model fuses the Waveform-based Language Model (WavLM) self-supervised model, performance dynamic features (volume, rhythm, glissando), and a three-stage generation module based on diffusion transformers to strengthen both emotion recognition and audio generation. Combined with a discriminator and a joint training mechanism, the model achieves a 5.7% improvement in accuracy and a 0.069 increase in Macro-F1 on the Database for Emotional Analysis in Music (DEAM) dataset. The generation module outperforms existing models on Mean Opinion Score (MOS) (4.12), Perceptual Evaluation of Speech Quality (PESQ) (3.48), and Fréchet Audio Distance (FAD) (2.74), with an emotional expression accuracy of 81.6%. The discriminator module achieves a Macro-F1 of 0.762 and a mean squared error (MSE) of 0.0207. The joint training strategy significantly improves generation quality, with a Generated Quality Index (GQI) of 0.39. The results indicate that multimodal fusion and diffusion modeling effectively enhance the understanding and generation quality of musical emotions.
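
The discriminator results are reported as Macro-F1 and MSE. The sketch below shows how these two metrics are conventionally computed. It is not the study's evaluation code. The emotion labels and the numeric targets are hypothetical placeholders, and the summary does not state how DEAM annotations were mapped to classes.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_squared_error(targets, predictions):
    """Average squared difference between continuous targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical labels and scores, for illustration only.
true_labels = ["happy", "sad", "calm", "happy"]
pred_labels = ["happy", "sad", "happy", "happy"]
f1 = macro_f1(true_labels, pred_labels)

mse = mean_squared_error([0.5, -0.2], [0.4, 0.0])
```

Because Macro-F1 weights every class equally, a class the model never predicts correctly (here "calm") pulls the score down regardless of how rare it is, which is why it is often reported alongside accuracy.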
