Accurate vulnerability detection is essential for preventing security breaches and protecting software systems from malicious attacks. Recently, vulnerability detection approaches that leverage deep learning and large language models (LLMs) have attracted increasing attention. However, existing approaches often analyze individual files or functions in isolation, which limits their ability to detect complex, inter-procedural vulnerabilities. Analyzing entire repositories to gather context, on the other hand, introduces significant noise and computational overhead. To address these challenges, we propose a context-enhanced approach to vulnerability detection that combines program analysis with LLMs. The approach extracts contextual information at multiple abstraction levels to filter out noise, and feeds both the abstracted context and the source code into an LLM. Our goal is to strike a balance between providing enough detail to capture vulnerabilities accurately and minimizing unnecessary complexity that could hinder model performance. Based on an extensive study using GPT, DeepSeek, and CodeLLaMA with various prompting strategies, our key findings are as follows. First, incorporating abstracted context significantly enhances vulnerability detection effectiveness. Second, different models benefit from different levels of abstraction, depending on their code understanding capabilities. Third, capturing program behavior through program analysis for general LLM-based code analysis tasks is a direction that deserves further attention.
Yang et al. studied this question.