What question did this study set out to answer?

The aim is to improve the correlation between dance and music while ensuring style consistency in generated dance segments.

May 29, 2026 | Open Access

Music generation controllable dance based on improved transformer model and style consistency

Key Points

Developed a bidirectional attention mechanism for cross-modal generation of dance and music.
Implemented a planned sampling strategy to mitigate exposure bias in autoregressive models.
Evaluated the model against mainstream models using metrics like Frechet distance, beat coverage, and style accuracy.
The generative model achieved a Frechet distance of 25.7, beat coverage of 59.7%, and a hit rate of 52.4%.
Style classification accuracy reached 95.4% and style retention rate was 90.2% in dance completion tasks.
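
The sampling strategy in the key points can be illustrated with a small sketch. It assumes that "planned sampling" refers to the technique commonly called scheduled sampling, in which the decoder is fed its own prediction instead of the ground-truth frame with a probability that grows during training. The function names, the linear decay schedule, and the toy `predict_next` callable are illustrative assumptions, not details taken from the study.

```python
import random


def teacher_forcing_prob(step, total_steps, floor=0.0):
    """Probability of feeding the ground-truth frame at a given training step.

    Assumed schedule: linear decay from 1.0 down to `floor`.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return max(floor, 1.0 - frac)


def build_decoder_inputs(ground_truth, predict_next, p_truth, rng):
    """Build the input sequence for one autoregressive training pass.

    At each position the input is either the ground-truth frame (with
    probability p_truth) or the model's own prediction from the previous
    input, so the model is trained on the kind of inputs it sees at inference.
    """
    inputs = [ground_truth[0]]  # the first frame is always given
    for t in range(1, len(ground_truth)):
        predicted = predict_next(inputs[-1])  # hypothetical model call
        use_truth = rng.random() < p_truth
        inputs.append(ground_truth[t] if use_truth else predicted)
    return inputs


if __name__ == "__main__":
    rng = random.Random(0)
    frames = [0, 10, 20, 30]          # stand-in for pose frames
    toy_model = lambda x: x + 1       # stand-in for the dance decoder
    for step in (0, 50, 100):
        p = teacher_forcing_prob(step, total_steps=100)
        print(step, p, build_decoder_inputs(frames, toy_model, p, rng))
```

Early in training the inputs match the ground truth; late in training they are mostly the model's own outputs, which is the mismatch that exposure bias refers to.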

Abstract

Music-driven dance generation can make artistic creation more efficient and more widely accessible, but existing methods suffer from weak dance-music correlation, unstable long-sequence generation, and inconsistent style. This study therefore proposes a dance generation and completion framework that combines an improved transformer with style consistency control. The framework first builds a cross-modal generation model with a bidirectional attention mechanism, which strengthens the correlation between dance and music by letting the music and action modalities attend to each other, and it adopts a planned sampling strategy to alleviate exposure bias in autoregressive generation. For completion, the framework extracts and integrates music features, key action features, and global dance style features so that completed dance segments stay synchronised with the music and consistent in overall style. In experiments, the generative model significantly outperformed mainstream comparison models on Frechet distance (25.7), beat coverage (59.7%), hit rate (52.4%), and diversity metrics. The completion model achieved a style classification accuracy of 95.4% and a style retention rate of 90.2% in dance completion tasks. These results indicate that the proposed model can improve the music correlation and style consistency of generated dance and help make artistic creation more accessible.
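
As a rough illustration of the bidirectional interaction described in the abstract, the sketch below lets music frames attend to action frames and action frames attend to music frames, then adds each result back to its own sequence. It is a simplified assumption of how such a mechanism could look: it uses plain scaled dot-product attention with no learned projections or multiple heads, and the names `cross_attend` and `bidirectional_cross_attention` are hypothetical rather than taken from the study.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention from one modality onto the other.

    Each query vector is replaced by a softmax-weighted average of the
    other modality's vectors (used here as both keys and values).
    """
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        out.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return out


def bidirectional_cross_attention(music, action):
    """Run attention in both directions and add a residual connection."""
    music_ctx = cross_attend(music, action)    # music queries, action keys
    action_ctx = cross_attend(action, music)   # action queries, music keys

    def fuse(xs, ctx):
        return [[a + b for a, b in zip(x, c)] for x, c in zip(xs, ctx)]

    return fuse(music, music_ctx), fuse(action, action_ctx)


if __name__ == "__main__":
    music = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # 3 toy music frames
    action = [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]    # 3 toy action frames
    music_out, action_out = bidirectional_cross_attention(music, action)
    print(music_out)
    print(action_out)
```

In a trained model each direction would normally have its own learned query, key, and value projections; they are omitted here to keep the data flow between the two modalities visible.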
