What does this research mean for the field?

The multi-agent-driven framework VulFinder improves the accuracy of automated vulnerability reachability validation by up to 21% and increases efficiency by over 1.5 times compared to existing state-of-the-art tools. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

This research aims to automate vulnerability reachability validation in software supply chains using the VulFinder framework.

June 10, 2026 | Open Access

VulFinder: A Multi‐Agent‐Driven Test Generation Framework for Guiding Vulnerability Reachability Analysis

Key Points

This research aims to automate vulnerability reachability validation in software supply chains using the VulFinder framework.
Utilized static code analysis tools to identify function call paths between applications and vulnerability APIs.
Implemented a multi-agent mechanism including distillator, discriminator, generator, and validator to generate exploit tests.
Evaluated VulFinder’s performance on Java and Python datasets derived from vulnerability data.
Achieved a 21% accuracy improvement over the state-of-the-art tool VESTA and a 7% improvement over the baseline tool TRANSFER on the Java dataset.
Showed robust generalizability on the Python dataset, significantly reducing false positives and false negatives.
Delivered an average efficiency improvement of over 1.5 times in vulnerability validation.
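
The call-path step in the key points can be illustrated with a minimal sketch. It assumes the static analysis output is already available as a caller-to-callees mapping. The toy graph, the function names in it, and the helper `find_call_paths` are hypothetical and not part of VulFinder, whose actual tooling is not described here.

```python
from collections import deque


def find_call_paths(call_graph, entry, target, max_depth=10):
    """Return all acyclic call paths from entry to target, shortest first."""
    paths = []
    queue = deque([[entry]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) > max_depth:
            continue  # bound the search on very deep graphs
        for callee in call_graph.get(node, []):
            if callee not in path:  # skip recursion cycles
                queue.append(path + [callee])
    return paths


# Hypothetical call graph: a downstream app calling into a dependency.
call_graph = {
    "app.main": ["app.load_config", "app.handle_request"],
    "app.handle_request": ["lib.parse", "app.log"],
    "app.load_config": ["lib.parse"],
    "lib.parse": ["lib.unsafe_deserialize"],
    "app.log": [],
}

paths = find_call_paths(call_graph, "app.main", "lib.unsafe_deserialize")
for p in paths:
    print(" -> ".join(p))
```

Each returned path is a candidate route from application code to the vulnerable API. An empty result suggests the API is statically unreachable from that entry point, subject to the precision of the underlying call graph.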

Abstract

Software engineering promotes the reuse of software components to accelerate development; however, reusing third‐party components in AI4SE’s software supply chain (SSC) can introduce vulnerability risks. When a new vulnerability emerges in a third‐party component, developers must determine whether their project is affected, that is, assess vulnerability reachability in the SSC, a task that demands significant manpower and resources. To address this issue, we propose VulFinder, a multi‐agent‐driven framework for automated vulnerability reachability validation. VulFinder first uses static code analysis tools to construct function call paths between downstream applications and the vulnerable APIs of their dependencies. Leveraging a multi‐agent mechanism comprising a distillator, discriminator, generator, and validator, VulFinder then iteratively generates exploit tests for methods along the call graph and validates vulnerability reachability by executing these tests on the downstream applications. We evaluate VulFinder across two programming language ecosystems, using datasets constructed from vulnerability data for Java and Python. Experiments show that VulFinder achieves a 21% accuracy improvement over the state‐of‐the‐art tool VESTA and a 7% accuracy improvement over the popular baseline tool TRANSFER on the Java dataset. It also demonstrates robust generalizability on the Python dataset, significantly reducing false positives and false negatives and delivering an average efficiency improvement of over 1.5 times. We recommend VulFinder to researchers and practitioners: it integrates the code comprehension capabilities of large language models (LLMs) with the multi‐agent framework to reduce false alarms and missed alarms in SSC vulnerability reachability analysis.
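
The iterative test generation loop described in the abstract might be sketched as follows. The abstract names the four agents but does not specify their interfaces, so the division of labor here is an assumption: the distillator summarizes the call path, the generator proposes a test, the validator runs it, and the discriminator judges the result. The LLM-backed agents are replaced by plain stub functions so the sketch runs on its own. `validate_reachability`, `Verdict`, and the `VULN_TRIGGERED` marker are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    reachable: bool
    rounds: int
    test: Optional[str]


def validate_reachability(
    call_path: List[str],
    distill: Callable[[List[str]], str],
    generate: Callable[[str, str], str],
    validate: Callable[[str], str],
    discriminate: Callable[[str], bool],
    max_rounds: int = 3,
) -> Verdict:
    """Retry test generation, feeding each failed run's log into the next round."""
    context = distill(call_path)          # assumed role: condense path context
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        test = generate(context, feedback)  # assumed role: propose an exploit test
        log = validate(test)                # assumed role: execute the test
        if discriminate(log):               # assumed role: decide if the sink was hit
            return Verdict(True, round_no, test)
        feedback = log
    return Verdict(False, max_rounds, None)


# Stub agents standing in for LLM calls and a real test runner.
attempts: List[str] = []


def fake_generate(context: str, feedback: str) -> str:
    attempts.append(feedback)
    return f"test_v{len(attempts)} for {context}"


def fake_validate(test: str) -> str:
    # Pretend the second attempt is the first one that triggers the vulnerability.
    return "VULN_TRIGGERED" if "test_v2" in test else "AssertionError: sink not hit"


verdict = validate_reachability(
    ["app.main", "lib.parse", "lib.unsafe_deserialize"],
    distill=lambda path: " -> ".join(path),
    generate=fake_generate,
    validate=fake_validate,
    discriminate=lambda log: "VULN_TRIGGERED" in log,
)
print(verdict)
```

Passing the agents in as callables keeps the loop independent of any particular model or test runner. A path is reported reachable only when an executed test actually triggers the vulnerable code, which is the property the abstract credits with reducing false positives.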
