When users want to enter their musical ideas into an automatic composition or composition support system, they need a method for doing so. However, entering note names or chord names is typically challenging. Thus, this study proposes singing as an intuitive way for users to input their intended melodies. The inherent instability of sung pitch, however, makes it difficult to recover a user's intended melody from fundamental-frequency estimation alone. To develop a system that estimates users' intended melodies, this study first analyzes the dynamic properties of pitch trajectories in amateur singing. Singing data from 98 individuals in the JVS-Music corpus are compared with the ideal pitches indicated on musical scores, and the pitch instability of the singing is investigated. Techniques for pitch smoothing and for quantization into musical pitch are also examined. The results show that pitch is particularly unstable at vocal onset and that overshoot is suppressed following large melodic leaps. Gender-based differences in pitch stability are also observed, suggesting that models should account for user-specific vocal characteristics. We believe that incorporating these dynamic properties will enable the development of more robust models for inferring a user's intended melody from their singing.
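As a rough illustration of what pitch smoothing and quantization into musical pitch can look like, the sketch below converts a fundamental-frequency track to MIDI note numbers, applies a median filter, and rounds to the nearest semitone. It is not the method evaluated in the study: the A4 = 440 Hz reference, the window length, the function names, and the sample frequencies are all assumptions chosen for illustration.

```python
import math
from statistics import median

A4_HZ = 440.0   # assumed reference tuning, not taken from the study
A4_MIDI = 69    # MIDI note number of A4


def hz_to_midi(f0_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(f0_hz / A4_HZ)


def median_smooth(values, window=5):
    """Median-filter a sequence; the window is truncated at both ends."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def quantize_to_semitones(f0_track_hz, window=5):
    """Smooth an F0 track in the semitone domain, then round to semitones."""
    midi = [hz_to_midi(f) for f in f0_track_hz]
    return [round(m) for m in median_smooth(midi, window)]


# Hypothetical frames: a singer scooping up to A4 at vocal onset.
track = [400.0, 425.0, 438.0, 441.0, 440.0, 439.0, 442.0]
notes = quantize_to_semitones(track)
naive = [round(hz_to_midi(f)) for f in track]
```

With this made-up track, frame-by-frame rounding (`naive`) labels the first frame two semitones below the sustained note, whereas the smoothed result (`notes`) stays within one semitone of it. A median filter is only one possible choice; it shortens onset deviations but does not model overshoot or singer-specific behavior.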
Nakayama et al. studied this question.