May 8, 2026 · Open Access

Bridging the gap between NL2SQL models and users: A VSCode plugin with verified SQL generation

Key Points

This thesis investigates the capabilities of large language models in NL2SQL tasks and explores methods to optimize their performance.
The experiments evaluate the NL2SQL performance of large language models under a 3-shot DAIL-SQL setting.
Benchmark errors are corrected and rule-based validation is applied to improve measured accuracy.
A VS Code plugin integrates these optimization strategies.
Execution accuracy rises from 80.9% to 87.5% after the corrections and validation are applied.
This gain suggests that LLMs are more capable at NL2SQL than earlier evaluations indicated.
The thesis identifies a gap between LLM-based models and real-world deployment, which the VS Code plugin is designed to address.
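
This page does not list the validation rules used in the thesis, so the following is only a minimal sketch of what rule-based validation of generated SQL could look like. The function name `validate_sql` and all three rules are illustrative assumptions, not the thesis's rule set. The sketch uses Python's standard `sqlite3` module and SQLite's `EXPLAIN` to check that a query compiles against the schema without running it.

```python
import sqlite3


def validate_sql(conn, sql):
    """Return a list of rule violations; an empty list means the query passed.

    All rules here are hypothetical examples, not the rules from the thesis.
    """
    problems = []
    stripped = sql.strip().rstrip(";").strip()

    # Rule 1 (assumed): accept only read-only SELECT / WITH queries.
    if not stripped.lower().startswith(("select", "with")):
        problems.append("not a read-only query")
        return problems

    # Rule 2 (assumed): accept a single statement only. This is a crude
    # check that would also reject a semicolon inside a string literal.
    if ";" in stripped:
        problems.append("multiple statements")
        return problems

    # Rule 3 (assumed): the query must compile against the schema.
    # EXPLAIN makes SQLite prepare the statement, so unknown tables or
    # columns raise an error, but the query itself is not executed.
    try:
        conn.execute("EXPLAIN " + stripped)
    except sqlite3.Error as exc:
        problems.append(f"does not compile: {exc}")
    return problems
```

A query that fails such checks could be rejected or sent back to the model for regeneration before it reaches the user; which of these the plugin does is not stated here.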

Abstract

This thesis explores the Natural Language to SQL (NL2SQL) task from three perspectives: few-shot selection, database optimization, and benchmark error detection with rule-based validation. A key focus is to investigate whether the performance of large language models (LLMs) has been underestimated in NL2SQL tasks. We argue that this underestimation mainly stems from three factors: (1) LLM hallucination, (2) errors in benchmark gold answers, and (3) limitations in evaluation systems. Experimental results show that under the original 3-shot DAIL-SQL setting, the model achieves an execution accuracy of 80.9%. After applying a series of corrections, including benchmark error fixing and rule-based validation, the accuracy improves to 87.5%, a gain of 6.6 percentage points. These findings suggest that LLMs are substantially more capable in NL2SQL tasks than previously perceived. Furthermore, we identify a gap between LLM-based models and real-world deployment scenarios. To bridge this gap, we extend DAIL-SQL by developing a practical VS Code plugin that integrates our optimization strategies, enabling more robust and user-friendly NL2SQL applications in real-world settings.
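
For readers unfamiliar with the metric, execution accuracy is commonly computed by running the predicted and gold queries against the same database and comparing their results. The sketch below illustrates that idea with Python's standard `sqlite3` module. The helper names, the demo table, and the choice to compare results as order-insensitive multisets are assumptions made for illustration, not the thesis's evaluation code.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return Counter(rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose result multisets match."""
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold = run_query(conn, gold_sql)
        pred = run_query(conn, predicted_sql)
        # A failing gold query is never counted as correct.
        if gold is not None and pred == gold:
            correct += 1
    return correct / len(pairs) if pairs else 0.0


def build_demo_db():
    """Create a tiny in-memory database (hypothetical example data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cy', 27);
    """)
    return conn


# (predicted, gold) pairs: two match, one fails to run, one returns other rows.
DEMO_PAIRS = [
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    ("SELECT COUNT(*) FROM singer", "SELECT COUNT(id) FROM singer"),
    ("SELECT name FROM singr", "SELECT name FROM singer"),
    ("SELECT name FROM singer WHERE age < 30",
     "SELECT name FROM singer WHERE age > 30"),
]

if __name__ == "__main__":
    print(execution_accuracy(build_demo_db(), DEMO_PAIRS))
```

Because the score depends on the gold query's result, an incorrect gold answer marks a correct prediction as wrong, which is one way benchmark errors can depress measured accuracy.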
