What question did this study set out to answer?

This paper aims to develop an adaptive music generation model that integrates improved variational autoencoders for enhanced control and emotion-based outputs.

May 11, 2026 · Open Access

Adaptive music generation by integrating improved VAE and improved GMVAE

Key Points

This paper aims to develop an adaptive music generation model that integrates improved variational autoencoders for enhanced control and emotion-based outputs.
Employs a controllable variational autoencoder (C-VAE) for decoupling structure and control latent variables.
Incorporates Transformer-XL to model long-term dependencies.
Utilizes a semantically guided modified variational autoencoder (S-GMVAE) for embedding mode-emotion relationships.
Achieves an F1 score of 93.76% and a style-matching score of 91.84% on the MAESTRO and LMD datasets.
Maintains a coherence score of 90.16% even with 30% of notes missing.
Subjective evaluations yield ratings above 4.6/5 for melody fluency, emotional authenticity, and semantic consistency.

Abstract

This paper proposes an adaptive target-music generation model. It employs a controllable variational autoencoder (C-VAE) to construct decoupled structure and control latent variables, incorporates Transformer-XL to model long-term dependencies, and uses a semantically guided modified variational autoencoder (S-GMVAE) to embed mode-emotion relationships into the latent space for controllable generation. On the MAESTRO and LMD datasets, the model achieves F1 = 93.76% and style matching = 91.84%. It maintains coherence = 90.16% even with 30% of notes missing, while exhibiting the lowest generation latency. In subjective evaluations, melody fluency, emotional authenticity, and semantic consistency all score above 4.6/5. Compared to PRNN, POP909-BART, MTR-VAE, and others, the model performs better in both accuracy and real-time performance. The experimental results indicate that the proposed framework offers significant advantages in emotion-controlled style transfer and in robust generation under missing information, providing effective support for intelligent composition, emotional soundtrack creation, and human-computer interaction music systems.
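The abstract does not give the C-VAE's architecture or training objective, so the following is only a minimal sketch of the general idea of a decoupled latent space, assuming a plain Gaussian VAE whose latent vector is partitioned into a structure part and a control part. The function names, the position of the split, and the swap operation are hypothetical illustrations, not the authors' method.

```python
import math
import random


def reparameterize(mean, log_var, rng):
    """Sample z = mean + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL(N(mean, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mean, log_var))


def split_latent(z, structure_dim):
    """Assumed layout: the first structure_dim entries encode structure,
    the remaining entries encode control (for example, emotion or style)."""
    return z[:structure_dim], z[structure_dim:]


def swap_control(z_source, z_reference, structure_dim):
    """Keep the source's structure part and take the control part
    from a reference latent vector."""
    structure, _ = split_latent(z_source, structure_dim)
    _, control = split_latent(z_reference, structure_dim)
    return structure + control


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical encoder outputs for a 4-dimensional latent vector.
    z = reparameterize([0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], rng)
    structure, control = split_latent(z, structure_dim=2)
    print(len(structure), len(control))
```

Under this assumed layout, a decoder that received the output of `swap_control` would see one piece's structure paired with another piece's control variables, which is the kind of operation that emotion-controlled style transfer relies on.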
