February 22, 2024 · Open Access

Dependency Annotation of Ottoman Turkish with Multilingual BERT

Abstract

This study introduces an annotation methodology, based on a pretrained large language model, for the first dependency treebank of Ottoman Turkish. Our experimental results show that iterating over three steps, i) pseudo-annotating data with a multilingual BERT-based parsing model, ii) manually correcting the pseudo-annotations, and iii) fine-tuning the parsing model on the corrected annotations, speeds up and simplifies the challenging dependency annotation process. The resulting treebank, which will be part of the Universal Dependencies (UD) project, will facilitate automated analysis of Ottoman Turkish documents, unlocking the linguistic richness embedded in this historical heritage.
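
The three-step cycle described in the abstract can be sketched as a simple loop. This is only an illustrative outline, not the authors' implementation: the function name `annotate_iteratively`, the `parse`, `correct`, and `fine_tune` callables, and the tree representation are all hypothetical placeholders. It also assumes that each round fine-tunes on every corrected tree collected so far, which the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Sentence = str
# Hypothetical tree encoding: one (head index, relation label) pair per token.
Tree = List[Tuple[int, str]]
Parser = Callable[[Sentence], Tree]


def annotate_iteratively(
    batches: Sequence[Sequence[Sentence]],
    parse: Parser,
    correct: Callable[[Sentence, Tree], Tree],
    fine_tune: Callable[[List[Tuple[Sentence, Tree]]], Parser],
) -> List[Tuple[Sentence, Tree]]:
    """Build a treebank batch by batch with a parser in the loop.

    parse     -- stands in for the multilingual BERT-based parsing model
    correct   -- stands in for the human annotator fixing a pseudo-annotation
    fine_tune -- stands in for retraining; returns the updated parser
    """
    treebank: List[Tuple[Sentence, Tree]] = []
    for batch in batches:
        # i) pseudo-annotate the new batch with the current parser
        pseudo = [(sentence, parse(sentence)) for sentence in batch]
        # ii) manually correct each pseudo-annotation
        corrected = [(sentence, correct(sentence, tree)) for sentence, tree in pseudo]
        treebank.extend(corrected)
        # iii) fine-tune on the corrected annotations (assumption: all so far)
        parse = fine_tune(treebank)
    return treebank
```

Passing the model, the annotator, and the training step as callables keeps the sketch independent of any particular parsing library; in practice `correct` would be an annotation tool session rather than a function call.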

Cite This Study

Özateş et al. (2024). Dependency Annotation of Ottoman Turkish with Multilingual BERT.

synapsesocial.com/papers/68e781fab6db6435876f55a8 https://doi.org/10.48550/arxiv.2402.14743