This study investigates transformer models for hybrid text summarization in the Kazakh language. A multilevel architecture built on the mBART, mT5, and XLM-RoBERTa models was developed that processes text at the character, subword, word, and contextual levels. The proposed system fuses features across these linguistic levels, enabling the model to capture both fine-grained lexical variation and broader contextual dependencies. The architecture can also be integrated with different transformer models, supporting both encoder-decoder and hybrid configurations. By accounting for the morphological and semantic features of Kazakh, this approach significantly improved the quality of the generated summaries. In the experiments, mBART achieved the best performance on the ROUGE-1, ROUGE-2, ROUGE-L, and BERTScore-F1 metrics, confirming the effectiveness of the proposed multilevel transformer architecture. This is the first implementation of such an architecture for hybrid summarization in Kazakh, a low-resource, morphologically rich language.
Оралбекова et al. studied this question.
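For readers unfamiliar with the ROUGE metrics named in the abstract, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed. It is not the evaluation code used in the study: it assumes plain whitespace tokenization with lowercasing and applies no stemming or Kazakh-specific normalization, and the function names `rouge_n` and `rouge_l` are illustrative choices rather than any library's API.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (assumption: whitespace tokens)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    return f1(lcs_length(ref, cand), len(cand), len(ref))


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print("ROUGE-1:", round(rouge_n(reference, candidate, 1), 3))
    print("ROUGE-2:", round(rouge_n(reference, candidate, 2), 3))
    print("ROUGE-L:", round(rouge_l(reference, candidate), 3))
```

Because Kazakh is agglutinative, exact word-level matching of this kind penalizes summaries that use a different inflected form of the same stem, which is one reason an embedding-based metric such as BERTScore is commonly reported alongside ROUGE.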