Computational notebooks are increasingly used in data science, computer science, classrooms, the software industry, and other fields. However, users often encounter errors, bugs, and vulnerabilities related to modularized code, unexecuted cells, and outdated library versions. This paper presents Bugspyter, a tool designed to detect and repair code bugs in Jupyter Notebooks using large language models (LLMs). We develop an agent-based model to test how well the LLM identifies bug types and the root causes of bugs in notebooks, and we also evaluate the effect of augmenting the model with static analysis results. Our results show that Bugspyter can identify buggy notebooks and achieves high accuracy in identifying implementation bug types. It can also identify coding errors as the root cause of bugs in a notebook, but it performs poorly on other root causes. Furthermore, LLM performance improves when identifying bugs in executed notebooks, but including static analysis results yields no change in performance. This study contributes insights into improving the reliability of computational notebooks and helps reduce the need for manual evaluation to fix these issues.
Oluwadabira Omotoso studied this question.