What question did this study set out to answer?

The paper aims to enhance music style transfer and audio synthesis using a novel framework based on a decoupled conditional variational autoencoder.

April 1, 2026

Research on music style transfer and audio synthesis technology based on variational autoencoder

Key points

The paper aims to enhance music style transfer and audio synthesis using a novel framework based on a decoupled conditional variational autoencoder.
Developed a dual-branch encoder structure for decoupling music content and style.
Implemented audio style transfer by applying the DC-VAE framework.
Utilized latent space sampling for audio synthesis aligned with target style conditions.
Introduced losses such as adversarial loss and style consistency loss to improve output quality.
DC-VAE outperformed standard VAE, CVAE, and CycleGAN in style accuracy and content similarity.
Generated audio demonstrated high quality and diversity in meeting specific style requirements.
The newly introduced loss functions further improved the naturalness and fidelity of the generated audio.
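
This summary does not give the form of the loss terms named above. The following is a minimal sketch of how such terms might be combined, treating the first of the named losses as an adversarial term and assuming a Gaussian KL term for each encoder branch, a non-saturating generator loss, and a squared-distance style consistency term. All function names and weights are hypothetical and are not taken from the paper.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def adversarial_generator_loss(d_fake):
    """Non-saturating generator loss (assumed form).

    d_fake is a discriminator's probability, in (0, 1], that the
    generated audio is real.
    """
    return -np.mean(np.log(d_fake + 1e-8))

def style_consistency_loss(style_of_output, target_style):
    """Mean squared distance between the style code re-extracted from the
    generated audio and the style code it was asked to match (assumed form)."""
    return np.mean((style_of_output - target_style) ** 2)

def total_loss(x, x_hat, mu_c, logvar_c, mu_s, logvar_s,
               d_fake, style_of_output, target_style,
               w_kl=1.0, w_adv=0.1, w_style=1.0):  # hypothetical weights
    """Weighted sum of reconstruction, KL, adversarial, and style terms."""
    recon = np.mean((x - x_hat) ** 2)
    # One KL term per branch: content latent and style latent.
    kl = (kl_to_standard_normal(mu_c, logvar_c)
          + kl_to_standard_normal(mu_s, logvar_s))
    return (recon
            + w_kl * kl
            + w_adv * adversarial_generator_loss(d_fake)
            + w_style * style_consistency_loss(style_of_output, target_style))
```

In this sketch the style term is the only one that compares the generated output against the requested style, so raising `w_style` trades reconstruction accuracy for closer adherence to the target style.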

Abstract

This paper addresses music style transfer and audio synthesis and proposes a framework based on an improved variational autoencoder (VAE): the decoupled conditional VAE (DC-VAE). The framework models music content and style separately through a dual-branch encoder, which allows the core content and the style characteristics of a piece to be controlled independently. For style transfer, DC-VAE can efficiently transfer the style of one audio clip to another while keeping the core content of the original music unchanged. For audio synthesis, sampling from the latent space and combining the sample with a target style condition generates new, diverse audio that meets the specified style requirements. Experiments show that DC-VAE outperforms baseline models, including the standard VAE, CVAE, and CycleGAN, in style accuracy, content similarity, and audio quality. In addition, introducing an adversarial loss, a style consistency loss, and a content-style mutual information minimization loss further improves the naturalness and fidelity of the generated audio, which supports the framework's effectiveness for music style transfer and audio synthesis.
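
The data flow described in the abstract can be illustrated with a minimal sketch: a two-branch encoder producing separate content and style codes, a decoder conditioned on a style label, style transfer by recombining codes from two inputs, and synthesis by sampling codes from the prior. The layer sizes, the linear layers with random untrained weights, and every function name below are assumptions made for illustration; the paper's actual architecture is not given in this summary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: feature frame, content code, style code, style labels.
N_FEATS, D_CONTENT, D_STYLE, N_STYLES = 16, 4, 3, 2

def linear(n_in, n_out):
    """Random, untrained weights standing in for a learned layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out))

# Two encoder branches; each outputs a mean and a log-variance.
W_content = linear(N_FEATS, 2 * D_CONTENT)
W_style = linear(N_FEATS, 2 * D_STYLE)
# The decoder sees the content code, the style code, and a one-hot condition.
W_dec = linear(D_CONTENT + D_STYLE + N_STYLES, N_FEATS)

def encode(x, W):
    """Map a feature frame to the (mu, logvar) of a diagonal Gaussian."""
    mu, logvar = np.split(x @ W, 2, axis=-1)
    return mu, logvar

def reparameterize(mu, logvar):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def decode(z_content, z_style, style_id):
    """Reconstruct a feature frame from both codes and the style label."""
    cond = np.eye(N_STYLES)[style_id]
    return np.concatenate([z_content, z_style, cond], axis=-1) @ W_dec

def style_transfer(x_source, x_reference, target_style_id):
    """Keep the content of x_source, take the style from x_reference."""
    z_c = reparameterize(*encode(x_source, W_content))
    z_s = reparameterize(*encode(x_reference, W_style))
    return decode(z_c, z_s, target_style_id)

def synthesize(target_style_id):
    """Generate a new frame by sampling both codes from the N(0, I) prior."""
    z_c = rng.standard_normal(D_CONTENT)
    z_s = rng.standard_normal(D_STYLE)
    return decode(z_c, z_s, target_style_id)

x_a = rng.standard_normal(N_FEATS)  # stand-in for a source audio frame
x_b = rng.standard_normal(N_FEATS)  # stand-in for a style reference frame
transferred = style_transfer(x_a, x_b, target_style_id=1)
generated = synthesize(target_style_id=0)
```

Because `synthesize` draws fresh codes on every call, repeated calls with the same style label give different outputs, which is the source of diversity in this sketch.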
