What question did this study set out to answer?

This research aims to develop a dataset for Arab music improvisation and explore its machine translation capabilities.

March 29, 2026 · Open Access

Arab music improvisation corpus for research (AMICOR): development and machine translation experiments

Key Points

Developed the AMICOR dataset with vocal improvisation phrases and instrumental responses.
Integrated musicological insights to evaluate music-theoretical differences between sub-datasets (size, sentence length, performance quality, and shared musical identity).
Experimented with neural and statistical machine translation methods for generating instrumental responses.
Compared an individual model for each maqam against a unified model for all maqamat.
Merging related sub-datasets does not guarantee improved performance.
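The degradation from merging affects both approaches but is greater for Neural MT than for Statistical MT.
In such a small dataset, the gap between Statistical and Neural MT performance widens further when the styles or skills of potential users differ from those of the performers who built the training dataset.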

Abstract

Under-resourced languages (and musics) pose a challenge to machine translation (MT). The challenge is greater when the content of the collected dataset is a varied sample taken from a data population that is even more diverse and dynamic. This is the challenge of Arab music vocal improvisation (mawwal). Here, we present the development of AMICOR, a parallel dataset consisting of vocal improvisatory phrases and their corresponding instrumental responses (or tarjamat in Arabic, which literally means “translations”) in the mawwal tradition. These melodic phrases are handled as “sentences” from the viewpoint of natural language. When developing the dataset, we integrated musicological insights in order to evaluate music-theoretical differences between sub-datasets, primarily regarding their size, sentence length, performance quality, and shared musical identity. We then experimented with MT to generate instrumental responses to new vocal sentences, comparing several translation modeling configurations that differ (1) in translation approach (Neural MT (NMT) versus Statistical MT (SMT)), and (2) in the dataset handling approach with respect to the maqam (an Arabic musical term referring roughly to a melodic mode), comparing an individual model for each maqam versus a unified model for all maqamat. We found that merging related sub-datasets does not necessarily lead to better results, and may even favor simpler and shorter sentences with lower performance quality and less sophisticated patterns. This issue applies to both NMT and SMT; however, it is greater for NMT. A comparison of confusion matrices of individual-maqam models suggested that, in such a small dataset, the gap between SMT and NMT performance increases further if the styles, or skills, of potential users differ from those of the performers who built the dataset used in training.
Our discussion asserts that key factors in system design are the musical background and performance decisions of vocalists who may use such responsive generative models, as well as dataset size and performance quality.
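
The two dataset-handling configurations described in the abstract (one model per maqam versus one unified model) can be illustrated with a minimal sketch of how a parallel corpus might be partitioned before training. This is not the authors' pipeline: the record layout, the maqam labels, the placeholder tokens, and the function names `split_by_maqam` and `unify` are all assumptions made for illustration, and no AMICOR data is reproduced.

```python
from collections import defaultdict

# Hypothetical parallel corpus: (maqam label, vocal "sentence", instrumental response).
# Labels and tokens are illustrative placeholders, not AMICOR data.
corpus = [
    ("bayati", ["v1", "v2", "v3", "v4"], ["r1", "r2", "r3"]),
    ("bayati", ["v2", "v5"], ["r2", "r4"]),
    ("rast", ["v6", "v7", "v8"], ["r5", "r6"]),
]

def split_by_maqam(pairs):
    """Group (vocal, response) pairs by maqam: one training set per individual model."""
    groups = defaultdict(list)
    for maqam, vocal, response in pairs:
        groups[maqam].append((vocal, response))
    return dict(groups)

def unify(pairs):
    """Pool all pairs, ignoring the maqam label: one training set for a unified model."""
    return [(vocal, response) for _, vocal, response in pairs]

per_maqam = split_by_maqam(corpus)  # individual-maqam configuration
unified = unify(corpus)             # unified configuration

for maqam, pairs in per_maqam.items():
    print(maqam, len(pairs))
print("unified", len(unified))
```

Each resulting training set would then be passed to whichever SMT or NMT toolkit is being compared; that step is omitted here because the summary does not name the tools used.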
