Blockchain technology has emerged as a transformative innovation, redefining industries through its decentralized and secure framework. Smart contracts, self-executing code deployed on blockchain platforms such as Ethereum, enable decentralized applications (dApps) to automate processes across finance, healthcare, and supply chain management. However, their programmability introduces significant security risks, leaving them open to vulnerabilities that malicious actors can exploit. Various detection methods have been developed to address these concerns, but they often remain inadequate. Traditional approaches to smart contract vulnerability detection, such as static and dynamic analysis, rely on predefined rules, which makes them ineffective against complex, domain-specific vulnerabilities in rapidly evolving decentralized ecosystems. This thesis addresses these challenges by leveraging Large Language Models (LLMs), which have demonstrated strong capabilities in contextual understanding and reasoning. Parameter-efficient fine-tuning techniques, including Low-Rank Adaptation (LoRA) and Quantized LoRA (QLoRA), are used to make LLMs more scalable and accessible for vulnerability detection. The study also examines Retrieval-Augmented Generation (RAG) frameworks, which dynamically retrieve and process relevant information. Two distinct approaches are developed and evaluated: fine-tuning LLMs and RAG. The CodeGemma 7B model achieved 94.78% accuracy on the DeFi Hacks & Top200 dataset and 92.52% on the TrustLLM dataset, surpassing previously reported benchmark results obtained with larger and proprietary models. The best-performing RAG model, Gemma 2, achieved 79.1% accuracy, demonstrating the effectiveness of retrieval-based augmentation.
These contributions lay the groundwork for more scalable, efficient, and democratized tools for securing blockchain ecosystems, addressing the limitations of traditional methods while offering cost-effective solutions for safer decentralized systems.
Ελένη Φ. Μανδάνα studied this question.