February 5, 2026 | Open Access

Assessing the Prevalence of AI-Generated Content in Leading Runet News Media

Key Points

The research aims to assess the prevalence of AI-generated content in Russian-language online news media and address detection challenges.
Systematic review of binary text classification approaches for distinguishing human-written from AI-generated texts.
Application of a supervised detection framework to news texts from 20 leading Russian online media outlets.
Fine-tuning of a neural network based on the RuRoBERTa architecture for Russian-language processing.
Fragment-level classification analysis with aggregated decision rules.
Only 18% of the analyzed platforms exceed the statistically significant threshold of 17.1% for AI-generated content.
The remaining 82% of platforms fall below the threshold, suggesting that traditional editorial practices still dominate.
Findings contribute to discussions in digital journalism and information security.

Abstract

This article investigates the phenomenon of unmarked automatically generated content in Russian-language online news media and addresses the methodological challenges of its reliable detection. With the rapid adoption of generative artificial intelligence—particularly large language models (LLMs)—in journalistic workflows, questions of transparency, authorship, and editorial responsibility have become increasingly salient. The study reviews and systematizes existing approaches to binary text classification aimed at distinguishing between human-written and AI-generated content, with particular attention to their applicability in the Russian linguistic and media context. Empirically, the research applies a supervised detection framework to a large corpus of news texts collected from 20 leading Russian online media outlets with the highest national traffic shares between March and May 2025, selected using the SimilarWeb analytical service. The detection methodology is based on fine-tuning a neural network built on the RuRoBERTa architecture, adapted for Russian-language processing and trained on a combination of annotated corpora and controlled synthetic paraphrases of real news articles. To account for document-level heterogeneity, the analysis employs fragment-level classification followed by aggregated decision rules. The scientific novelty of the study lies in its comprehensive and reproducible approach to the quantitative assessment of unmarked AI-generated content in Russian news media. The findings indicate that only 18% of the analyzed platforms exceed the statistically significant threshold of 17.1% for AI-generated material, while the remaining 82% stay below this level, suggesting the continued dominance of traditional editorial practices. These results contribute to ongoing discussions in digital journalism, media ethics, and information security, and provide an empirical foundation for future research on AI transparency and content governance.
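
The fragment-level classification and aggregated decision rule described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not state the fragment length, the per-fragment cutoff, or the aggregation rule, so `max_words`, `fragment_cutoff`, and `min_flagged_share` are hypothetical parameters, and `toy_score` is a placeholder standing in for the fine-tuned RuRoBERTa-based classifier. The sketch also assumes that the 17.1% threshold refers to the share of a platform's articles labelled as AI-generated, which is one possible reading of the abstract.

```python
from typing import Callable, List

# Threshold reported in the abstract; its interpretation as a share of
# a platform's articles is an assumption made for this sketch.
REPORTED_THRESHOLD = 0.171


def split_into_fragments(text: str, max_words: int = 100) -> List[str]:
    """Split a document into consecutive fragments of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def document_is_ai(
    text: str,
    score_fragment: Callable[[str], float],
    fragment_cutoff: float = 0.5,    # hypothetical per-fragment cutoff
    min_flagged_share: float = 0.5,  # hypothetical aggregation rule
    max_words: int = 100,            # hypothetical fragment length
) -> bool:
    """Label a document as AI-generated if enough of its fragments are flagged."""
    fragments = split_into_fragments(text, max_words)
    if not fragments:
        return False
    flagged = sum(1 for f in fragments if score_fragment(f) >= fragment_cutoff)
    return flagged / len(fragments) >= min_flagged_share


def platform_ai_share(
    documents: List[str],
    score_fragment: Callable[[str], float],
    **rule_params,
) -> float:
    """Share of a platform's documents labelled as AI-generated."""
    if not documents:
        return 0.0
    labelled = sum(document_is_ai(d, score_fragment, **rule_params) for d in documents)
    return labelled / len(documents)


def toy_score(fragment: str) -> float:
    """Placeholder scorer, NOT a real detector: keys on a marker word."""
    return 0.9 if "synthetic" in fragment else 0.1


if __name__ == "__main__":
    docs = [
        "synthetic text here",
        "plain human text",
        "another plain text",
        "more plain text",
    ]
    share = platform_ai_share(docs, toy_score)
    print(f"AI share: {share:.3f}, exceeds threshold: {share > REPORTED_THRESHOLD}")
```

In practice `score_fragment` would wrap the fine-tuned model's predicted probability for the AI-generated class. Scoring fragments rather than whole articles allows for documents in which only some passages are machine-generated, which is the document-level heterogeneity the abstract mentions.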
