April 14, 2024 | Open Access

Flakiness Repair in the Era of Large Language Models

Yang Chen, University of Illinois Urbana-Champaign

Abstract

Flaky tests can non-deterministically pass or fail without any change to the code, which undermines the effectiveness of regression testing. Prior repair techniques for flaky tests mainly rely on program analysis and address only Order-Dependent (OD) and Implementation-Dependent (ID) flakiness with known flakiness patterns and root causes. In this paper, we propose an approach to repair flaky tests with the power of Large Language Models (LLMs). Our approach successfully repaired 79% of OD tests and 58% of ID tests in an extensive evaluation using 666 flaky tests from 222 projects. We submitted pull requests to fix 61 flaky tests; at the time of submission, fixes for 19 tests had already been accepted. However, by analyzing 118 Non-Order-Dependent (NOD) flaky tests from 11 projects, we observed that current LLMs are ineffective at adequately repairing such tests.

Yang Chen studied this question.

synapsesocial.com/papers/68e6f3b2b6db64358766e933 https://doi.org/10.1145/3639478.3641227
