Language models based on deep learning have shown promising results for artistic generation, including music generation. However, the evaluation of symbolic music generation models relies mostly on low-level mathematical metrics (e.g., the value of the loss function), because the subjective nature of music makes it inherently difficult to measure the musical quality of a given performance. This work sought to measure and evaluate musical excerpts generated by deep learning models from a human perspective, limited to the scope of classical piano music generation. In this assessment, 117 participants performed blind tests with excerpts of human composition and excerpts generated by artificial intelligence models, namely PerformanceRNN, Music Transformer, MuseNet, and a custom model based on GRUs. The experiments demonstrated that excerpts generated by models based on the Transformer neural network architecture obtained the greatest receptivity within the tested population, surpassing the results of human compositions. The experiments also demonstrated that people with greater musical sensitivity and musical experience were better able to identify the compositional origin of the excerpts they heard.
Ferreira et al. studied this question.