Recent advances in machine learning and deep learning have demonstrated the utility of cross-lingual transfer learning methods in low- and zero-resource scenarios. We explore the applicability of transfer learning from pre-trained models in zero-shot and few-shot scenarios for part-of-speech tagging. We report the results of an ablation study that examines how the size of the training data in low-resource languages affects the system’s performance. Since building or augmenting datasets for low-resource languages is difficult, costly, and often not feasible, the study provides insights into the relative data requirements for both the high-resource language (the source language for transfer) and the low-resource language, and into the kind of performance boost one can expect when planning to use transfer learning for low-resource languages. The study is conducted with Hindi as the high-resource language and three related languages (Magahi, Bhojpuri and Braj) as extremely low-resource languages. Overall, the study addresses four broad research questions: (a) How much data in the low-resource and the high-resource language is “sufficient” for attaining optimum performance in a downstream task like part-of-speech annotation, and is there any specific advantage for the low-resource language if we use multilingual data during fine-tuning? (b) Do different multilingual pre-trained models, specifically multilingual-BERT, multilingual-DistilBERT, XLM-RoBERTa, and MuRIL, offer any significant advantage in terms of dataset requirements for attaining optimum performance in Indian languages? (c) In the case of multiple closely related low-resource languages, does distributing the dataset across multiple languages result in performance comparable to that of a system trained on a single language? (d) What is the impact of the typological similarity of the languages on the dataset requirement for successful transfer learning?
Raj et al. (Fri,) studied this question.