What question did this study set out to answer?

This research aims to develop a language model-based system for assessing the correctness of patches generated by automated program repair tools.

June 12, 2026 | Open Access

LLM4PatchCorrect: A Language Model-Based Patch Correctness Assessment Tool for Automated Program Repair

Key Points

LLM4PatchCorrect employs an explain-execute-suggest framework for patch assessment.
Extensive evaluation was conducted using the QuixBugs dataset with a focus on accuracy, explanation quality, and suggestion effectiveness.
A parallel processing architecture was implemented to reduce analysis time by 50% compared to sequential processing.
LLM4PatchCorrect achieved 87.3% accuracy in patch classification on the QuixBugs dataset.
Experts rated the generated explanations 4.2/5 for clarity and utility.
The system provided successful patch suggestions in 70% of evaluated cases.

Abstract

Automated Program Repair (APR) tools have demonstrated significant potential in generating patches for software bugs, yet they suffer from the critical overfitting problem, where generated patches pass existing test cases without addressing the underlying issues. Current patch correctness evaluation methods depend heavily on labeled data from specific APR tools, severely limiting their generalizability and requiring substantial manual verification effort. We present LLM4PatchCorrect, a novel language model-based system that automates patch correctness assessment without requiring tool-specific training data or fine-tuning. Our approach employs an innovative explain-execute-suggest framework that provides comprehensive patch classification, detailed explanations for incorrect patches, and actionable improvement suggestions. Through extensive experimental evaluation on the QuixBugs dataset, LLM4PatchCorrect achieves 87.3% accuracy in patch classification, generates high-quality explanations rated 4.2/5 by experts, and provides successful patch suggestions in 70% of cases. The integrated parallel processing architecture reduces analysis time by 50% compared to sequential processing, while the web-based interface ensures practical accessibility for real-world deployment.
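The abstract does not spell out how the explain-execute-suggest stages or the parallel architecture are implemented, so the following is only a minimal sketch of one plausible reading, not the authors' code. Every name in it (`explain`, `execute`, `suggest`, `assess`, `assess_all`, `Assessment`) is hypothetical, the language-model call is replaced by a placeholder string, and a patch is modeled as a plain Python callable. The toy data also illustrates overfitting: a patch for an absolute-value function that passes the original tests but fails a held-out one.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a test case is an (input, expected output) pair and a
# candidate patch is modeled as a callable. Real patches are code diffs.
TestCase = Tuple[int, int]
Patch = Callable[[int], int]


@dataclass
class Assessment:
    name: str
    correct: bool
    explanation: str
    suggestion: Optional[str]


def explain(name: str) -> str:
    # Hypothetical stand-in for a language-model call describing the patch.
    return f"(model-generated explanation of {name} would go here)"


def execute(patch: Patch, tests: List[TestCase]) -> List[TestCase]:
    # Run the patch on every test and return the cases it fails.
    return [(x, want) for x, want in tests if patch(x) != want]


def suggest(name: str, failures: List[TestCase]) -> Optional[str]:
    # Only patches judged incorrect receive an improvement suggestion.
    if not failures:
        return None
    return f"Revisit {name}: fails on inputs {[x for x, _ in failures]}"


def assess(name: str, patch: Patch, tests: List[TestCase]) -> Assessment:
    failures = execute(patch, tests)
    return Assessment(name, not failures, explain(name), suggest(name, failures))


def assess_all(patches: Dict[str, Patch], tests: List[TestCase],
               workers: int = 4) -> List[Assessment]:
    # Patches are independent, so they can be assessed concurrently.
    # Results come back in the same order the patches were submitted.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(assess, n, p, tests) for n, p in patches.items()]
        return [f.result() for f in futures]


# Toy scenario: the intended behavior is abs(x).
ORIGINAL_TESTS: List[TestCase] = [(3, 3), (0, 0)]
HELD_OUT_TESTS: List[TestCase] = ORIGINAL_TESTS + [(-2, 2)]

if __name__ == "__main__":
    candidates = {"overfit": lambda x: x, "correct": abs}
    for result in assess_all(candidates, HELD_OUT_TESTS):
        print(result.name, result.correct, result.suggestion)
```

In this sketch the identity patch passes both original tests yet is flagged once the held-out case is included, which is the overfitting situation the abstract describes. A thread pool is used on the assumption that the dominant cost is waiting on model or test-runner calls; the 50% reduction reported in the abstract is the paper's own measurement and is not something this sketch reproduces.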
