What question did this study set out to answer?

The research aims to improve software fault localization by developing a method that balances defect knowledge and reduces noise in test cases.

April 4, 2026 · Open Access

Augmenting Automated Spectrum-Based Multi-Fault Localization via Borderline Confident Learning

Key Points

Developed the SA-BCL method with four key components: data processing, defect knowledge balancing, confident learning denoising, and spectrum reduction.
Constructed program spectrum through analysis of program behaviors and test results.
Employed boundary identification and oversampling techniques to address defect knowledge imbalance.
Utilized confident learning techniques to identify and eliminate noise in the spectrum.
SA-BCL outperforms baseline approaches, improving average wasted effort by up to 37.5%.
Improved precision by up to 29.2% and recall by up to 22.5% compared with baseline methods.
Most improvements are statistically significant, and time costs stay within the same order of magnitude as the baseline methods.
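
The spectrum construction and suspiciousness scoring mentioned in the key points can be illustrated with a minimal sketch. It assumes statement-level coverage and uses the Ochiai formula as the suspiciousness metric; the summary does not say which formula SA-BCL uses, and the function names `build_spectrum`, `ochiai`, and `rank_statements` are hypothetical.

```python
import math


def build_spectrum(coverage, results):
    """Build a program spectrum from per-test coverage and outcomes.

    coverage: list of sets, the statement ids executed by each test.
    results:  list of bools, True if the test passed.
    Returns {statement: (ef, ep, nf, np)}: how many failing/passing tests
    executed (e) or did not execute (n) the statement.
    """
    statements = set().union(*coverage)
    total_fail = sum(1 for ok in results if not ok)
    total_pass = len(results) - total_fail
    spectrum = {}
    for stmt in statements:
        ef = sum(1 for cov, ok in zip(coverage, results) if stmt in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, results) if stmt in cov and ok)
        spectrum[stmt] = (ef, ep, total_fail - ef, total_pass - ep)
    return spectrum


def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness (an assumed metric, not confirmed by the source)."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def rank_statements(spectrum):
    """Statements sorted from most to least suspicious (ties broken by id)."""
    scores = {s: ochiai(*counts) for s, counts in spectrum.items()}
    return sorted(scores, key=lambda s: (-scores[s], s))


if __name__ == "__main__":
    # Four tests over statements 1..4; the last two tests fail.
    coverage = [{1, 2, 3}, {1, 3}, {1, 2}, {1, 2, 4}]
    results = [True, True, False, False]
    print(rank_statements(build_spectrum(coverage, results)))
```

In this toy suite, statement 2 ranks first because both failing tests execute it while only one passing test does.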

Abstract

Software fault localization aims to identify faulty elements in a program by analyzing program information and the execution data of test cases. It plays a crucial role in improving development efficiency, reducing debugging costs, and ensuring reliable software operation. In practice, however, programs are large and faulty statements make up only a small proportion of them, so a test suite typically contains fewer failing test cases than passing ones, which leads to an imbalance in defect knowledge. In addition, mutual interference and masking among multiple faults can introduce characterizing noise into the test suite, making fault localization harder still. To address these limitations, we propose SA-BCL, a novel method that targets both defect knowledge imbalance and characterizing noise. SA-BCL comprises four components: data processing, defect knowledge balancing, confident learning denoising, and program spectrum reducing. The data processing component constructs the program spectrum by analyzing program behaviors and test results during test execution. The defect knowledge balancing component mitigates the imbalance by using boundary identification and oversampling to augment the spectrum of failing test cases located at the boundary of the test suite. The confident learning denoising component uses confident learning to identify and eliminate characterizing noise in the program spectrum. The program spectrum reducing component iteratively computes the suspiciousness scores of program elements and reduces the spectrum to facilitate fault localization.
We conduct a series of experiments on both a synthetic and a real multi-fault dataset, comparing SA-BCL with baseline approaches. The results show that SA-BCL outperforms state-of-the-art methods, with improvements of up to 37.5% in average wasted effort, 29.2% in precision, and 22.5% in recall. SA-BCL also keeps time costs within the same order of magnitude as the baseline methods, and most of the improvements are statistically significant, demonstrating its robustness.
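
To make the defect knowledge balancing idea from the abstract more concrete, the sketch below shows one possible reading of "boundary identification and oversampling". It assumes binary coverage vectors, Hamming distance, the "danger" rule from Borderline-SMOTE (at least half, but not all, of the k nearest neighbours are passing tests), and plain duplication as the oversampling step. None of these choices is confirmed by the abstract, and the names `hamming`, `borderline_failing`, and `oversample` are hypothetical.

```python
def hamming(a, b):
    """Number of positions where two equal-length coverage vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def borderline_failing(vectors, passed, k=3):
    """Indices of failing tests whose k nearest neighbours are mostly,
    but not entirely, passing tests (assumed Borderline-SMOTE-style rule)."""
    borderline = []
    for i, vec in enumerate(vectors):
        if passed[i]:
            continue  # only failing (minority) tests are candidates
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: (hamming(vec, vectors[j]), j))
        neighbours = others[:k]
        n_pass = sum(1 for j in neighbours if passed[j])
        if neighbours and len(neighbours) / 2 <= n_pass < len(neighbours):
            borderline.append(i)
    return borderline


def oversample(vectors, passed, k=3, copies=1):
    """Return new (vectors, passed) lists in which every borderline failing
    test is duplicated `copies` times. Duplication is an assumption here;
    the abstract only says the spectrum is augmented."""
    new_vectors = [list(v) for v in vectors]
    new_passed = list(passed)
    for i in borderline_failing(vectors, passed, k):
        for _ in range(copies):
            new_vectors.append(list(vectors[i]))
            new_passed.append(False)
    return new_vectors, new_passed


if __name__ == "__main__":
    # Rows are tests, columns are statements (1 = executed).
    vectors = [
        [1, 1, 1, 0, 0, 0],  # 0 pass
        [1, 1, 0, 0, 0, 0],  # 1 pass
        [1, 0, 1, 0, 0, 0],  # 2 pass
        [0, 1, 1, 0, 0, 0],  # 3 pass
        [1, 1, 1, 1, 0, 0],  # 4 pass
        [1, 1, 1, 1, 1, 0],  # 5 fail, close to the passing tests
        [0, 0, 0, 1, 1, 1],  # 6 fail, far from the passing tests
        [0, 1, 1, 1, 1, 0],  # 7 fail, close to the passing tests
    ]
    passed = [True, True, True, True, True, False, False, False]
    print(borderline_failing(vectors, passed))
    balanced_vectors, balanced_passed = oversample(vectors, passed)
    print(len(balanced_vectors), balanced_passed.count(False))
```

Duplicating only the failing tests near the pass/fail boundary adds weight where the two classes are hardest to separate, instead of inflating every failing test equally.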
