What question did this study set out to answer?

To explore the effectiveness of stylometric analysis in detecting plagiarism in Tatar language texts.

April 7, 2026

Stylometric Analysis in the Task of Plagiarism Detection in Texts in the Tatar Language

Key Points

Aim: to explore how effectively stylometric analysis detects plagiarism in Tatar-language texts.
Tools were developed using machine learning algorithms: k-means clustering, random forest, support vector machine, and naïve Bayes classifier.
A hybrid approach combines the FastText model with logistic regression.
Linguistic metrics were adapted specifically to the Tatar language.
Stylometric methods, applied through these machine learning techniques, can detect plagiarism in Tatar texts.
The same methods are shown to be applicable to authorship attribution and to determining the style and emotional tone of a text.
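
To make the idea of language-adapted metrics concrete, the following is a minimal sketch of a stylometric feature extractor. The feature set and the name `stylometric_features` are illustrative assumptions, not the metrics used in the study; the only Tatar-specific element is the share of the six letters that the Tatar Cyrillic alphabet adds to the Russian one.

```python
import re

# Letters the Tatar Cyrillic alphabet adds to the Russian alphabet.
TATAR_LETTERS = set("әөүҗңһ")


def stylometric_features(text):
    """Return a few simple style features for one text.

    Hypothetical feature set for illustration only; the study's
    actual metrics are not listed in the summary.
    """
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w matches Unicode letters, so Cyrillic words are captured.
    words = re.findall(r"\w+", text.lower())
    letters = [ch for ch in text.lower() if ch.isalpha()]
    return {
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "avg_word_len": sum(map(len, words)) / max(len(words), 1),
        "type_token_ratio": len(set(words)) / max(len(words), 1),
        "tatar_letter_share": sum(ch in TATAR_LETTERS for ch in letters)
        / max(len(letters), 1),
    }
```

Vectors like this could feed any of the clustering or classification algorithms named above; a real system would likely need many more features and a Tatar-aware tokenizer.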

Abstract

The article discusses the use of stylometric analysis for plagiarism detection in texts in the Tatar language. Relevant tools have been developed using machine learning algorithms, including clustering (k-means), classification (random forest, support vector machine, naïve Bayes classifier), and a hybrid approach (FastText model + logistic regression). Special attention is paid to adapting linguistic metrics to the Tatar language. The article demonstrates that stylometric analysis methods can be used for authorship attribution and for determining the style and emotional tone of texts in the Tatar language.
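
As one illustration of the classification step, here is a minimal sketch of a naïve Bayes author classifier over character trigrams, written with the standard library only. The class name `NaiveBayesAuthor`, the choice of character trigrams, and the add-one smoothing are assumptions made for this example; the summary does not say which features or settings the authors used.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Overlapping character n-grams of the lowercased text."""
    t = text.lower()
    return [t[i:i + n] for i in range(len(t) - n + 1)]


class NaiveBayesAuthor:
    """Multinomial naive Bayes with add-one smoothing (illustrative sketch)."""

    def fit(self, texts, authors):
        self.counts = defaultdict(Counter)  # author -> n-gram counts
        self.docs = Counter(authors)        # author -> number of texts
        for text, author in zip(texts, authors):
            self.counts[author].update(char_ngrams(text))
        self.vocab = {g for c in self.counts.values() for g in c}
        return self

    def log_scores(self, text):
        """Log prior plus summed log likelihoods for each known author."""
        total_docs = sum(self.docs.values())
        scores = {}
        for author, counts in self.counts.items():
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.docs[author] / total_docs)
            for g in char_ngrams(text):
                if g in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[g] + 1) / denom)
            scores[author] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)
```

In a plagiarism setting, a passage whose highest-scoring author differs from the claimed author of the surrounding document might be flagged for closer review. That decision rule is a simplification for illustration, not the procedure described in the article.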

Cite This Study

Khayaleeva et al. studied this question.

synapsesocial.com/papers/69d49ecbb33cc4c35a227867
https://doi.org/10.3103/s0005105525701444
