What does this research mean for the field?

GPT-4.1 can effectively evaluate revision patterns in young students' argument writing, and Chain-of-Thought prompting improves its accuracy on explanation-focused revisions. Novelty: novel finding. Consensus alignment: neutral.

What question did this study set out to answer?

The aim is to explore how AI can evaluate revision patterns in students' argument writing using various prompting strategies.

March 12, 2026 | Open Access

AI-assisted evaluation of revision patterns in young students’ argument writing

Key Points

Explored how AI can evaluate revision patterns in students' argument writing under different prompting strategies
Examined first and second drafts of students' text-based argument writing through an AI-assisted formative assessment
Used GPT-4.1 to predict revision patterns
Compared few-shot prompting with few-shot Chain-of-Thought (CoT) prompting
GPT-4.1 showed strong potential for evaluating the revision process for formative purposes
GPT-4.1 demonstrated excellent intra-rater reliability across multiple runs
CoT prompting improved accuracy in predicting explanation-focused revision patterns, a task more cognitively demanding than assessing evidence-focused revisions

Abstract

Revision is a crucial component of the writing process, yet few formative assessments focus on young students’ revision processes. This study explored an AI-assisted formative assessment that identifies revision patterns across drafts (i.e., first and second drafts) of students’ text-based argument writing. In particular, we examined the performance of GPT-4.1 in predicting revision patterns using two prompting strategies: few-shot prompting and few-shot Chain-of-Thought (CoT) prompting. The results show that GPT-4.1 exhibits strong potential for evaluating the revision process for formative purposes. It demonstrates excellent intra-rater reliability in predicting revision patterns across multiple runs. We also find that using CoT prompting that incorporates intermediate evaluation steps improves the accuracy of predicting explanation-focused revision patterns, a task that requires a more cognitively demanding evaluative process than assessing evidence-focused revisions. Implications for the conditions under which CoT prompting yields added value for enhancing prediction accuracy in writing evaluation are discussed.
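To make the two prompting strategies and the idea of consistency across runs more concrete, here is a minimal Python sketch. It is illustrative only and not the study's method: the function names, the prompt wording, the example fields (`draft1`, `draft2`, `reasoning`, `label`), and the label strings are all hypothetical. The agreement function is a simple stand-in for consistency across repeated runs and may differ from the reliability statistic the authors report.

```python
from itertools import combinations


def build_prompt(examples, draft1, draft2, use_cot=False):
    """Assemble a few-shot prompt (hypothetical wording).

    With use_cot=True, each worked example also shows its intermediate
    reasoning, and the model is asked to reason before giving a label.
    """
    parts = ["Classify the revision pattern between the two drafts."]
    for ex in examples:
        parts.append(f"Draft 1: {ex['draft1']}\nDraft 2: {ex['draft2']}")
        if use_cot:
            parts.append(f"Reasoning: {ex['reasoning']}")
        parts.append(f"Label: {ex['label']}")
    # The new item to be evaluated comes last.
    parts.append(f"Draft 1: {draft1}\nDraft 2: {draft2}")
    parts.append("Reasoning:" if use_cot else "Label:")
    return "\n\n".join(parts)


def mean_pairwise_agreement(runs):
    """Average, over all pairs of runs, of the share of items labeled identically.

    Assumes every run labels the same items in the same order.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs")
    total = 0.0
    for a, b in pairs:
        total += sum(x == y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


# Hypothetical worked example and labels, for illustration only.
examples = [{
    "draft1": "Dogs are good pets.",
    "draft2": "Dogs are good pets because the article says they protect homes.",
    "reasoning": "The second draft adds a detail taken from the source text.",
    "label": "added evidence",
}]
plain_prompt = build_prompt(examples, "First draft text", "Second draft text")
cot_prompt = build_prompt(examples, "First draft text", "Second draft text", use_cot=True)

# Three repeated runs over the same three items (made-up labels).
runs = [["a", "b", "a"], ["a", "b", "b"], ["a", "b", "a"]]
agreement = mean_pairwise_agreement(runs)
```

The only difference between the two prompts is whether the examples expose intermediate evaluation steps, which is the contrast the abstract draws between few-shot and few-shot CoT prompting.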
