March 28, 2026 | Open Access

Hybrid Semantic–Syntactic NLP Framework for Intelligent Grading of Short Answers and Cloze Questions

Key Points

The aim is to develop an intelligent grading system that combines syntactic and semantic analysis for short answers and cloze questions.
Developed a hybrid NLP framework integrating syntactic and semantic components.
Utilized MPNet embeddings for semantic similarity evaluation.
Employed a fine-tuned DeBERTa regressor for continuous score prediction.
Provided feedback generation using a T5-small model.
Evaluated on benchmark corpora, synthetic cloze datasets, and a domain-specific short-answer corpus.
Achieved 91% accuracy in grading.
Reported a 0.89 F1 score and mean absolute error of 0.36.
Demonstrated strong inter-rater agreement (κ = 0.87).
Effectively recognized paraphrased responses and assigned partial credit.
Ablation studies highlighted the importance of each framework component.
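
The agreement and error metrics listed above can be illustrated with a small sketch. It is not the study's evaluation code: the grade lists and the 0-2 scale are hypothetical, and the functions simply apply the standard definitions of accuracy, mean absolute error, and Cohen's kappa.

```python
from collections import Counter


def accuracy(human, model):
    """Fraction of responses where the model grade equals the human grade."""
    return sum(h == m for h, m in zip(human, model)) / len(human)


def mean_absolute_error(human, model):
    """Average absolute gap between human and model scores."""
    return sum(abs(h - m) for h, m in zip(human, model)) / len(human)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both raters used one identical label
    return (observed - expected) / (1 - expected)


# Hypothetical grades on a 0-2 scale (illustrative data, not from the study).
human = [2, 1, 0, 2, 1, 0, 2, 2]
model = [2, 1, 0, 1, 1, 0, 2, 2]

print(f"accuracy = {accuracy(human, model):.3f}")
print(f"MAE      = {mean_absolute_error(human, model):.3f}")
print(f"kappa    = {cohen_kappa(human, model):.3f}")
```

Kappa is reported alongside accuracy because it discounts the agreement two graders would reach by chance, which matters when one grade dominates the distribution.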

Abstract

The increasing demand for scalable and fair assessment of open-form responses in digital education underscores the need for intelligent grading systems capable of balancing syntactic precision with semantic understanding. This study proposes a hybrid semantic–syntactic NLP framework for automated grading of short-answer and cloze-type questions. The framework integrates a rule-based matcher for syntactic accuracy, MPNet (Masked and Permuted Pre-trained Network) embeddings for semantic similarity, and a fine-tuned DeBERTa (Decoding-enhanced Bidirectional Encoder Representations from Transformers with Disentangled Attention) regressor for continuous score prediction, while a T5-small model generates pedagogically aligned feedback. Evaluations were conducted using benchmark corpora, synthetic cloze datasets, and a domain-specific short-answer corpus. Results demonstrate that the hybrid system outperforms traditional baselines, achieving 91% accuracy, a 0.89 F1 score, a mean absolute error of 0.36, and strong inter-rater agreement (κ = 0.87), aligning closely with human graders. Qualitative analyses show that the framework successfully recognizes paraphrased responses, assigns partial credit, and generates meaningful feedback. Ablation studies further validate the necessity of each subsystem, with performance declining significantly when components were removed. The findings confirm that the proposed framework is both computationally robust and pedagogically valuable, establishing a foundation for scalable, interpretable, and fair automated grading in contemporary educational environments.
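
A minimal sketch of how a rule-based syntactic score and an embedding-based semantic score might be blended into one continuous grade is given here. It rests on several stated assumptions and is not the authors' implementation: token-count vectors stand in for MPNet embeddings, a fixed weighted sum stands in for the fine-tuned DeBERTa regressor, the weights 0.4 and 0.6 and all function names are hypothetical, and feedback generation is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def syntactic_score(answer, required_terms):
    """Rule-based check: fraction of required terms present in the answer."""
    if not required_terms:
        return 0.0
    tokens = set(tokenize(answer))
    hits = sum(term.lower() in tokens for term in required_terms)
    return hits / len(required_terms)


def embed(text):
    """Placeholder embedding (token counts). This is NOT MPNet; a real
    system would call a sentence-embedding model here."""
    return Counter(tokenize(text))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hybrid_score(answer, reference, required_terms, w_syntax=0.4, w_semantic=0.6):
    """Weighted blend of both signals (hypothetical weights; the study
    instead predicts the score with a learned regressor)."""
    syn = syntactic_score(answer, required_terms)
    sem = cosine(embed(answer), embed(reference))
    return w_syntax * syn + w_semantic * sem


# Hypothetical question data for illustration only.
reference = "Photosynthesis converts light energy into chemical energy"
required = ["light", "chemical"]

full = hybrid_score(reference, reference, required)
partial = hybrid_score("Plants use light to grow", reference, required)
off_topic = hybrid_score("The mitochondria is the powerhouse", reference, required)

print(f"full={full:.2f} partial={partial:.2f} off_topic={off_topic:.2f}")
```

Because the blended score is continuous, an answer that covers only some of the required ideas lands between the full-credit and zero-credit cases, which is the partial-credit behavior the abstract describes. With count vectors, a paraphrase that shares no words with the reference would score zero on the semantic term; recognizing such paraphrases is the job of dense embeddings such as MPNet.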
