To support pop music creation, this study proposes an automatic accompaniment generation method that combines a sliding window technique with the MuseFlow model. The sliding window segments long music sequences into short, overlapping frames, balancing time and frequency resolution to capture local signal characteristics. MuseFlow employs an enhanced bidirectional mapping architecture and training objectives to model the complex relationships in multi-track music data. Experimental results show that MuseFlow achieves Fréchet Inception Distance (FID) scores of 26.3 on the POP909 dataset and 25.4 on the FreeMidi dataset, significantly outperforming the baseline models. These findings indicate that the proposed method generates high-quality, diverse accompaniments that are compatible with the main melody, providing an efficient tool for music creators.
Yue Wang (Thu,) studied this question.
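
The sliding window segmentation described in the abstract can be illustrated with a minimal sketch. This is not the implementation used in the study: the function name `sliding_windows`, the parameters `window` and `hop`, the tail-handling rule, and the example pitch sequence are all assumptions chosen for illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(seq: Sequence[T], window: int, hop: int) -> List[List[T]]:
    """Split seq into frames of length `window`, advancing by `hop` items.

    Choosing hop < window makes consecutive frames overlap.
    (Hypothetical helper; names and behavior are illustrative assumptions.)
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    if len(seq) <= window:
        # Sequence is shorter than one frame: return it as a single frame.
        return [list(seq)]
    last_start = len(seq) - window
    frames = [list(seq[s:s + window]) for s in range(0, last_start + 1, hop)]
    # Assumed tail rule: if the regular hops do not reach the end of the
    # sequence, add one extra frame aligned to the end so nothing is dropped.
    if last_start % hop != 0:
        frames.append(list(seq[-window:]))
    return frames


# Example: a short melody given as MIDI pitch numbers (illustrative data).
notes = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76]
frames = sliding_windows(notes, window=4, hop=2)
for frame in frames:
    print(frame)
```

With `window=4` and `hop=2`, each frame shares half of its content with its neighbor, so local context is preserved across frame boundaries. A larger window gives each frame more context at the cost of coarser localization in time.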