Software engineering promotes the reuse of software components to accelerate development; however, reusing third-party components in AI4SE's software supply chain (SSC) can introduce vulnerability risks. When a new vulnerability emerges in a third-party component, developers must determine whether their project is affected, that is, assess vulnerability reachability in the SSC, a task that demands significant manpower and resources. To address this issue, we propose VulFinder, a multi-agent-driven framework for automated vulnerability reachability validation. VulFinder first uses static code analysis tools to construct function call paths between downstream applications and vulnerable dependency APIs. Leveraging a multi-agent mechanism comprising a distillator, discriminator, generator, and validator, VulFinder then iteratively generates exploit tests for methods along the call graph and validates vulnerability reachability by executing these tests on the downstream applications. We evaluate VulFinder across programming language ecosystems, using datasets constructed from Java and Python vulnerabilities. Experiments show that, on the Java dataset, VulFinder achieves a 21% accuracy improvement over the state-of-the-art tool VESTA and a 7% accuracy improvement over the popular baseline tool TRANSFER, and that it generalizes robustly to the Python dataset. VulFinder significantly reduces false positives and false negatives and delivers an average efficiency improvement of over 1.5 times. We recommend VulFinder to researchers and practitioners: by integrating the code comprehension capabilities of large language models (LLMs) with a multi-agent framework, it reduces false alarms and missed alarms in vulnerability reachability analysis in the SSC.
Zhao et al. studied this question.
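The call-path construction step described in the abstract can be illustrated with a minimal sketch. It is not VulFinder's implementation: the abstract does not name the static analysis tools used or the data structures they produce, so the dictionary-based call graph, the `find_call_paths` function, the `max_depth` cutoff, and all method names below are hypothetical and chosen only for illustration. The sketch uses a breadth-first search to enumerate cycle-free call paths from a downstream application's entry points to a vulnerable dependency API.

```python
from collections import deque


def find_call_paths(call_graph, entry_points, vulnerable_api, max_depth=10):
    """Enumerate simple (cycle-free) call paths that end at vulnerable_api.

    call_graph maps a caller name to the list of callees it invokes.
    This structure is an assumption; a real static analyzer would produce it.
    """
    paths = []
    # Each queue item is a partial path, starting at an entry point.
    queue = deque([entry] for entry in entry_points)
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == vulnerable_api:
            paths.append(path)
            continue
        if len(path) > max_depth:
            # Hypothetical cutoff that bounds the search on large graphs.
            continue
        for callee in call_graph.get(current, ()):
            if callee not in path:  # skip recursive calls to avoid cycles
                queue.append(path + [callee])
    return paths


# Hypothetical downstream application that depends on a YAML library.
call_graph = {
    "app.Main.handle": ["app.Parser.parse", "app.Logger.log"],
    "app.Parser.parse": ["lib.Yaml.load"],
    "app.Logger.log": [],
    "lib.Yaml.load": ["lib.Yaml.construct"],
}

paths = find_call_paths(call_graph, ["app.Main.handle"], "lib.Yaml.construct")
for path in paths:
    print(" -> ".join(path))
```

An empty result would suggest that the vulnerable API is statically unreachable from the given entry points. A non-empty result only identifies candidate paths; in the framework described above, reachability is then validated by generating and executing exploit tests for the methods along them.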