Student-generated questions (SGQs) have proven to be a meaningful learning tool: they foster advanced thinking skills in students and help teachers understand student learning progress. However, grading the quality of SGQs demands significant effort from teachers. In this study, we explore the suitability of large language models (LLMs) for evaluating SGQs and identify which models can effectively stand in for expert evaluation in practical teaching settings. We devised a five-dimension scale, took expert ratings as the gold standard, and used Kendall's W consistency analysis to systematically compare the evaluations of different LLMs against the expert ratings on six aspects of the scale. The results confirm that LLMs are applicable to the evaluation of SGQs and show that ChatGPT 4.0 performs exceptionally well and can assist experts in evaluating SGQs. This study aims to facilitate the adoption of artificial intelligence generated content (AIGC) in education and reinforces the belief that LLMs hold substantial potential for future applications and research in the field.
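For readers unfamiliar with the statistic named in the abstract, the following is a minimal sketch of how Kendall's W (coefficient of concordance) can be computed for a set of raters scoring the same items. It is not the authors' implementation: the function names `average_ranks` and `kendalls_w` and the example scores are hypothetical, and the standard tie-corrected formula is assumed, since rating scales with few levels usually produce ties.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Tie-corrected Kendall's W.

    ratings: list of m raters, each a list of n scores for the same n items.
    Returns a value in [0, 1]; 1 means the raters order the items identically.
    """
    m = len(ratings)
    n = len(ratings[0])
    if any(len(r) != n for r in ratings):
        raise ValueError("every rater must score the same number of items")

    ranked = [average_ranks(r) for r in ratings]
    totals = [sum(r[i] for r in ranked) for i in range(n)]  # rank sum per item
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)

    # Tie correction: sum of (t^3 - t) over each group of tied scores per rater.
    tie_term = sum(c**3 - c for r in ratings for c in Counter(r).values())
    denom = m**2 * (n**3 - n) - m * tie_term
    if denom == 0:
        raise ValueError("W is undefined when every rater ties all items")
    return 12 * s / denom


if __name__ == "__main__":
    # Hypothetical scores for five questions on one dimension of a scale.
    expert_scores = [4, 3, 5, 2, 1]
    model_scores = [4, 2, 5, 3, 1]
    print(kendalls_w([expert_scores, model_scores]))
```

In a setup like the one the abstract describes, a function of this kind would be applied once per aspect of the scale, with the expert ratings and one model's ratings as the raters; values closer to 1 indicate closer agreement with the experts.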
Mi et al. (Fri,) studied this question.
www.synapsesocial.com/papers/68e72cd4b6db6435876a6258 — DOI: https://doi.org/10.1109/iceit61397.2024.10540914
Zejia Mi
Kangkang Li
Jiangsu Normal University