The EMelodyGen system focuses on generating melodies in ABC notation under emotional conditioning. To overcome the scarcity of emotion-labeled sheet music, we use statistical correlations derived from small-scale, emotion-labeled symbolic music datasets, together with findings from music psychology, to guide subsequent feature extraction, emotional control, and automatic annotation. We then automatically annotate a large, well-structured sheet music collection with rough emotional labels, convert the annotated dataset into ABC notation, and apply data augmentation to address label imbalance, yielding a dataset named Rough4Q. We demonstrate that our system backbone, pre-trained on Rough4Q, achieves a music21 parsing rate of up to 99%. Our emotional control parameters, categorized into directly modifiable, embedding, dual-stage, and guidance features, can be selected and combined into customized emotional control templates, which can yield 91% alignment in emotional expression in blind listening tests. Ablation studies further validate the impact of these control conditions on emotional accuracy.
Zhou et al. studied this question.
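A parsing rate such as the music21 figure reported above is, in essence, the share of generated tunes that a parser accepts without error. The following is a minimal sketch of that measurement, not the authors' evaluation code. The names `parse_rate`, `toy_abc_check`, and `samples` are hypothetical, and the toy checker is only a stand-in (it tests for `X:` and `K:` header fields and a non-empty body) so that the example runs without third-party packages.

```python
from typing import Callable, Iterable


def parse_rate(tunes: Iterable[str], parser: Callable[[str], object]) -> float:
    """Return the fraction of tunes that `parser` accepts without raising."""
    total = 0
    parsed = 0
    for tune in tunes:
        total += 1
        try:
            parser(tune)
        except Exception:
            # Any parser error counts as a failed parse.
            continue
        parsed += 1
    return parsed / total if total else 0.0


def toy_abc_check(tune: str) -> None:
    """Hypothetical stand-in parser: require X: and K: fields plus a tune body."""
    lines = [ln.strip() for ln in tune.strip().splitlines() if ln.strip()]
    # Treat lines of the form "<letter>:..." as ABC header fields (a simplification).
    headers = {ln[:2] for ln in lines if len(ln) > 1 and ln[1] == ":"}
    if not {"X:", "K:"} <= headers:
        raise ValueError("missing X: or K: header")
    body = [ln for ln in lines if not (len(ln) > 1 and ln[1] == ":")]
    if not body:
        raise ValueError("no tune body")


# Made-up sample tunes for illustration only.
samples = [
    "X:1\nL:1/8\nM:4/4\nK:C\nCDEF GABc|",
    "X:2\nK:Am\n",                  # header only, no notes
    "CDEF GABc|",                   # no header fields
    "X:3\nM:3/4\nK:G\nG2 A2 B2|",
]
rate = parse_rate(samples, toy_abc_check)
print(f"{rate:.0%}")  # prints "50%"
```

With music21 installed, the same function could presumably be driven by the real parser, for example by passing `lambda s: music21.converter.parse(s, format="abc")` as `parser`. Whether this matches the exact criterion behind the reported 99% figure is an assumption.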