July 9, 2024 | Open Access

Is GPT-4 Alone Sufficient for Automated Essay Scoring?: A Comparative Judgment Approach Based on Rater Cognition

Abstract

Large Language Models (LLMs) have shown promise in Automated Essay Scoring (AES), but their zero-shot and few-shot performance often falls short compared to state-of-the-art models and human raters. However, fine-tuning LLMs for each specific task is impractical due to the variety of essay prompts and rubrics used in real-world educational contexts. This study proposes a novel approach combining LLMs and Comparative Judgment (CJ) for AES, using zero-shot prompting to choose between two essays. We demonstrate that a CJ method surpasses traditional rubric-based scoring in essay scoring using LLMs.
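The abstract does not say how the pairwise choices are turned into essay scores. As an illustration only, the sketch below assumes a Bradley-Terry model fitted with a simple iterative update, which is one common way to scale comparative judgments; it is not claimed to be the authors' method. The `judge` callable, the function names, and the `smoothing` pseudo-count are all hypothetical. In practice `judge` would wrap a zero-shot LLM prompt that picks the better of two essays.

```python
from collections import defaultdict
from itertools import combinations


def collect_comparisons(essays, judge):
    """Run `judge` on every pair and return (winner_id, loser_id) tuples.

    `essays` maps an essay id to its text. `judge(text_a, text_b)` is a
    hypothetical callable (for example a wrapper around a zero-shot LLM
    prompt) that returns True when the first essay is judged better.
    """
    results = []
    for id_a, id_b in combinations(sorted(essays), 2):
        if judge(essays[id_a], essays[id_b]):
            results.append((id_a, id_b))
        else:
            results.append((id_b, id_a))
    return results


def bradley_terry(comparisons, n_iter=100, smoothing=0.01):
    """Estimate a strength per essay from (winner, loser) pairs.

    Uses the classic iterative update for the Bradley-Terry model.
    `smoothing` is an assumed small pseudo-count that keeps essays
    with zero wins at a strictly positive strength.
    """
    items = sorted({e for pair in comparisons for e in pair})
    wins = defaultdict(float)
    pair_counts = defaultdict(float)
    for winner, loser in comparisons:
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0

    strength = {i: 1.0 for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = (wins[i] + smoothing) / denom
        # Rescale so strengths average to 1 and stay comparable across iterations.
        total = sum(updated.values())
        strength = {i: v * len(items) / total for i, v in updated.items()}
    return strength


if __name__ == "__main__":
    # Toy stand-in for an LLM judge: prefers the longer text.
    toy_essays = {"A": "a long essay text", "B": "shorter text", "C": "tiny"}
    pairs = collect_comparisons(toy_essays, lambda a, b: len(a) > len(b))
    scores = bradley_terry(pairs)
    print(sorted(scores, key=scores.get, reverse=True))
```

Comparing every pair grows quadratically with the number of essays, so a real pipeline would likely sample pairs rather than enumerate them all. The resulting strengths are relative, and mapping them onto a rubric scale would need an additional calibration step that this sketch leaves out.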

Cite This Study

Kim et al. (2024) studied this question.

synapsesocial.com/papers/68e60e42b6db6435875a0f97 https://doi.org/10.1145/3657604.3664703