The term "multimodal music dataset" is often used to describe music-related data that represent music as a multimedia art form and multimodal experience. However, the term "multimodality" may mean different things in related disciplines, such as musicology, music psychology, and music technology. This paper proposes a definition of multimodality that works across different music disciplines. Constructing, evaluating, and using multimodal music datasets poses many challenges. We provide a task-based categorization of these datasets and explore theoretical methodologies for improving their future construction. We survey diverse data preprocessing methods and highlight their contributions to transparent music analysis. We also examine evaluation metrics, methods, and benchmarks tailored to multimodal music processing tasks, to help researchers make informed decisions and to facilitate cross-study comparisons.
Christodoulou et al. studied this question.