What type of study is this?

This is an experimental study.

October 8, 2025 | Open Access

Advanced Implementation of a Multilevel Model for Text Summarization in Kazakh Using Pretrained Models

Key Points

The proposed multilevel model significantly enhances text summarization quality in the Kazakh language.
mBART outperformed other models, achieving the highest scores in ROUGE and BERTScore metrics.
The model integrates features across linguistic levels, capturing both lexical variation and contextual dependencies.
This work presents the first implementation of a multilevel architecture for hybrid summarization in a low-resource language.
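
For readers unfamiliar with the ROUGE scores mentioned above, the following is a minimal sketch of ROUGE-N F1 as clipped n-gram overlap between a reference summary and a generated one. It is an illustration only, not the evaluation code used in the study: the function name `rouge_n_f1` and the whitespace tokenization are assumptions, and published results normally rely on an established ROUGE package with language-appropriate tokenization.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Illustrative ROUGE-N F1 using whitespace tokens (an assumption)."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count of each shared n-gram,
    # which is the "clipped" overlap used by ROUGE.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat sat"
    print(rouge_n_f1(reference, candidate, n=1))
    print(rouge_n_f1(reference, candidate, n=2))
```

ROUGE-1 and ROUGE-2 correspond to `n=1` and `n=2`; ROUGE-L instead uses the longest common subsequence and is not covered by this sketch.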

Abstract

This study investigates transformer models for the task of hybrid text summarization in the Kazakh language. Using mBART, mT5, and XLM-RoBERTa models, a multilevel architecture was developed that processes text at the character, subword, word, and contextual levels. The proposed system performs feature fusion across multiple linguistic layers, enabling the model to capture both fine-grained lexical variation and broader contextual dependencies. The architecture also allows flexible integration with various transformer models, supporting both encoder-decoder and hybrid configurations. This approach significantly improved the quality of generated summaries by effectively accounting for the morphological and semantic features of the Kazakh language. The experimental results showed that mBART achieved the best performance in terms of ROUGE-1, ROUGE-2, ROUGE-L, and BERTScore-F1 metrics, confirming the high effectiveness of the proposed multilevel transformer architecture. This is the first implementation of such an architecture for hybrid summarization in Kazakh, which is a low-resource and morphologically rich language.
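
The idea of fusing features from several linguistic levels can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the architecture from the paper: character trigrams stand in for a learned subword tokenizer, each level is reduced to a small hashed bag-of-units vector, fusion is done by plain concatenation, and the contextual level (which in the study comes from a transformer such as mBART, mT5, or XLM-RoBERTa) is omitted. All names (`DIM`, `char_units`, `subword_units`, `word_units`, `hashed_bag`, `fuse_levels`) are hypothetical.

```python
import zlib

DIM = 8  # hypothetical feature size per level


def char_units(text: str) -> list[str]:
    """Character level: every non-whitespace character."""
    return [c for c in text if not c.isspace()]


def subword_units(text: str, n: int = 3) -> list[str]:
    """Subword level: character n-grams per word (a stand-in for a
    learned subword tokenizer, which is an assumption of this sketch)."""
    units = []
    for word in text.split():
        if len(word) <= n:
            units.append(word)
        else:
            units.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return units


def word_units(text: str) -> list[str]:
    """Word level: whitespace-separated tokens."""
    return text.split()


def hashed_bag(units: list[str], dim: int = DIM) -> list[float]:
    """Map units to a fixed-size, normalized count vector via hashing."""
    vec = [0.0] * dim
    for unit in units:
        vec[zlib.crc32(unit.encode("utf-8")) % dim] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def fuse_levels(text: str) -> list[float]:
    """Concatenate the per-level vectors (fusion by concatenation is assumed)."""
    fused = []
    for units in (char_units(text), subword_units(text), word_units(text)):
        fused.extend(hashed_bag(units))
    return fused


if __name__ == "__main__":
    features = fuse_levels("қазақ тілі")
    print(len(features))
```

In a trained model, the hashed vectors would be replaced by learned embeddings, and the fused representation would be combined with the contextual encoder output before decoding. For a morphologically rich language such as Kazakh, the character and subword levels are where variation in suffixes becomes visible to the model.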
