What question did this study set out to answer?

The study aims to systematically leverage different forms of context to improve the effectiveness of AI-assisted software development.

May 8, 2026 | Open Access

On the role of context in AI-driven program repair and test generation

Key Points

Developed a graph-to-sequence learning approach using program analysis for capturing context.
Created a retrieval-based technique for selecting relevant demonstration examples during few-shot prompting.
Automated the generation of issue-reproducing tests from natural language bug reports.
Achieved higher repair accuracy with the graph representation approach, and improved test assertion generation and program repair with retrieval-based few-shot prompting.
Successfully generated reproducing tests for real-world issues, including cases that all prior techniques missed.
Demonstrated substantial variation in coding agents' localization capability and repair accuracy.

Abstract

Learning-based techniques show promise for automating software development tasks, but current approaches treat context in an ad hoc manner. Existing techniques select context through arbitrary heuristics, such as fixed token windows, enclosing methods, or entire files, without systematically analyzing which contextual information is relevant for a given task. The goal of the work presented in this dissertation is to systematically leverage different forms of context to improve the effectiveness of AI-assisted software development. First, we present a graph-to-sequence learning approach that captures semantic context through program analysis. By encoding control-flow and data-flow dependencies into a fine-grained graph representation, our approach outperforms state-of-the-art baselines for program repair. Second, we develop a retrieval-based technique for selecting demonstration examples during few-shot prompting. By automatically retrieving relevant examples, our approach outperforms task-specific and fine-tuned models on test assertion generation and program repair. Third, we develop an automated technique for generating issue-reproducing tests from natural language bug reports. Our technique successfully generates reproducing tests for real-world issues, including cases uniquely solved by our approach that were missed by all prior work. Fourth, we characterize the complexity of multi-hunk patches through empirical analysis of real-world bugs. We introduce hunk divergence and spatial proximity metrics that quantify variation among hunks and dispersion across code. Our evaluation reveals that repair accuracy declines sharply with increased divergence, exposing fundamental limitations in how current models reason over dispersed code. Finally, we conduct the first automated systematic study of coding agents on multi-hunk repair. Our findings reveal substantial variation in localization capability and repair accuracy across agents. Collectively, these contributions demonstrate that the choice of contextual information plays a significant role in the effectiveness of AI-assisted software development. The results show that our techniques accomplish the stated research goals.
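
To make the second contribution more concrete, the following is a minimal sketch of retrieval-based demonstration selection for few-shot prompting. It is not the dissertation's implementation: the abstract does not say how relevance is measured, so the cosine similarity over token counts used here is an assumption, and the names `tokenize`, `cosine`, `select_demonstrations`, and `build_prompt`, along with the prompt layout, are hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(code: str) -> Counter:
    """Count identifier and number tokens in a code snippet."""
    return Counter(re.findall(r"[A-Za-z_]\w*|\d+", code))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_demonstrations(query: str, pool: list[tuple[str, str]], k: int = 2):
    """Return the k (input, output) pairs whose input is most similar to the query.

    Assumption: similarity is measured on the input side only.
    """
    q = tokenize(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, tokenize(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(query: str, demos: list[tuple[str, str]]) -> str:
    """Assemble a few-shot prompt: retrieved demonstrations first, then the query."""
    parts = [f"### Input\n{src}\n### Output\n{tgt}\n" for src, tgt in demos]
    parts.append(f"### Input\n{query}\n### Output\n")
    return "\n".join(parts)


if __name__ == "__main__":
    # Toy pool of (buggy, fixed) pairs, invented purely for illustration.
    pool = [
        ("def add(a, b): return a - b", "def add(a, b): return a + b"),
        ("def is_empty(xs): return len(xs) == 1", "def is_empty(xs): return len(xs) == 0"),
        ("def area(w, h): return w + h", "def area(w, h): return w * h"),
    ]
    query = "def sub(a, b): return a + b"
    print(build_prompt(query, select_demonstrations(query, pool, k=1)))
```

A lexical measure keeps the sketch dependency-free. An embedding-based retriever could replace `cosine` without changing the rest of the pipeline.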
