Revision is a crucial component of the writing process, yet few formative assessments focus on young students’ revision processes. This study explored an AI-assisted formative assessment that identifies revision patterns between the first and second drafts of students’ text-based argument writing. In particular, we examined the performance of GPT-4.1 in predicting revision patterns using two prompting strategies: few-shot prompting and few-shot Chain-of-Thought (CoT) prompting. The results show that GPT-4.1 exhibits strong potential for evaluating the revision process for formative purposes. It demonstrates excellent intra-rater reliability in predicting revision patterns across multiple runs. We also find that CoT prompting, which incorporates intermediate evaluation steps, improves the accuracy of predicting explanation-focused revision patterns, a task that requires a more cognitively demanding evaluative process than assessing evidence-focused revisions. We discuss the conditions under which CoT prompting adds value for prediction accuracy in writing evaluation.
Li et al. (Mon,) studied this question.
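To make the notion of intra-rater reliability across multiple runs concrete, the following is a minimal sketch of one simple consistency measure: the mean pairwise agreement between repeated runs of the same model on the same essays. The abstract does not state which reliability statistic the study used, so this measure, the function name `pairwise_agreement`, and the label names are all illustrative assumptions rather than the authors' method.

```python
from itertools import combinations


def pairwise_agreement(runs):
    """Mean fraction of items labeled identically, averaged over all pairs of runs.

    `runs` is a list of label sequences, one per repeated run, aligned by essay.
    This is raw agreement and is not corrected for chance.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs")
    n_items = len(runs[0])
    if n_items == 0 or any(len(r) != n_items for r in runs):
        raise ValueError("runs must be non-empty and equally long")

    scores = []
    for a, b in combinations(runs, 2):
        matches = sum(x == y for x, y in zip(a, b))
        scores.append(matches / n_items)
    return sum(scores) / len(scores)


# Hypothetical revision-pattern labels for five essays over three runs.
# The label names are placeholders, not the study's coding scheme.
runs = [
    ["evidence", "explanation", "none", "evidence", "explanation"],
    ["evidence", "explanation", "none", "evidence", "evidence"],
    ["evidence", "explanation", "none", "evidence", "explanation"],
]
agreement = pairwise_agreement(runs)
```

Raw agreement is easy to interpret but can overstate consistency when one label dominates, so a chance-corrected statistic would usually be reported alongside it.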