
June 13, 2026 | Open Access

Does the Thesis Still Make Sense? A Comparative Analysis of Scientific Essays Generated by Humans and Generative Artificial Intelligence

Key Points

The study analyzes differences between human-authored and AI-generated academic essays in Hungarian.
The authors conducted a quantitative text analysis of essays written by human authors and generated by AI models.
A blind evaluation was performed with 391 human reviewers of varying expertise.
Lexical, syntactic, and stylistic features were assessed quantitatively and qualitatively.
AI-generated essays had lower lexical diversity and lacked epistemic markers.
Reviewers active in academic practice acknowledged the formal and logical precision of the AI texts but noted a lack of originality and critical depth.
With moderate prompting, AI essays were rated superior to human texts in some aspects, such as formal and logical precision.

Abstract

Although prior research indicates that expert reviewers identify AI-generated academic texts with low accuracy, the quantitative analysis presented in this paper reveals marked, measurable differences between human-authored and AI-generated works. We investigate this duality in the context of Hungarian, an under-represented training language, in two ways. First, we perform a quantitative text analysis of the lexical, syntactic, and stylistic features of Hungarian-language academic essays written by human authors (doctoral candidates) and generated by Google Gemini, OpenAI GPT, and Anthropic Claude models. Second, using a blind experimental design, we analyze how human reviewers (N = 391) with varying levels of expertise perceive and assess the quality of the texts. The quantitative analysis showed that AI-generated essays are characterized by lower lexical diversity and an absence of epistemic markers. The human evaluation yielded complex results: reviewers active in academic practice (members of the academically active and academically passive clusters) acknowledged the formal and logical precision of the AI-generated texts, yet they noted a lack of originality and critical depth. Reviewers less engaged with academic practice (members of the non-academic and inactive clusters), in contrast, were primarily persuaded by the more natural style and originality of the human-authored texts. The findings suggest that, with moderate-level prompting and supplied source literature, an AI-generated essay that reviewers deem superior to human work in certain aspects, such as formal and logical precision, can be produced in a few hours. Furthermore, our findings suggest that with targeted, more sophisticated prompt engineering, the quality gap between AI-generated and human-authored texts could narrow further. These findings have significant implications for assessment methods in higher education and for the regulation of academic publishing.
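To make the two quantitative measures named in the abstract concrete, the sketch below computes a simple type-token ratio (one common proxy for lexical diversity) and the rate of epistemic markers per 1,000 tokens. It is an illustration only: the abstract does not say which diversity metric or which marker list the authors used, so the function names, the tokenizer, and the three Hungarian hedging words are assumptions introduced here.

```python
import re

# Hypothetical, illustrative hedging words (perhaps, probably, presumably).
# This is NOT the marker list used in the study.
EPISTEMIC_MARKERS = {"talán", "valószínűleg", "feltehetően"}


def tokenize(text):
    """Lowercase the text and return runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens):
    """Distinct word forms divided by total tokens; 0.0 for an empty text."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def marker_rate(tokens, markers=EPISTEMIC_MARKERS):
    """Occurrences of the given markers per 1,000 tokens; 0.0 for an empty text."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in markers)
    return 1000.0 * hits / len(tokens)


if __name__ == "__main__":
    sample = "A macska talán alszik. A macska alszik."
    tokens = tokenize(sample)
    print(type_token_ratio(tokens), marker_rate(tokens))
```

The plain type-token ratio falls as texts get longer, so a comparison like the one described here would only be meaningful across essays of similar length or with a length-normalized variant.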
