September 10, 2025

From Assistants to Agents: The Evolution of Large Language Models in Data Science Workflows

Key Points

Autonomous LLM agents can plan, execute, and optimize entire data science workflows, improving end-to-end data analysis and scientific research.
Key innovations in large language models address limitations like workflow rigidity and lack of cross-domain adaptability.
Frameworks like R&D-Agent and Agent Laboratory illustrate the transition from assistive tools to intelligent agents in data science.
Future priorities include domain-specific customization, standardized agent evaluation, and improved interpretability for next-generation autonomous data science systems.

Abstract

This paper presents a comprehensive overview of the evolution of data science from a statistics-centric discipline to a machine learning–driven field, culminating in the current integration of large language models (LLMs). It identifies key limitations in traditional LLM applications—such as limited cross-domain adaptability, lack of interpretability, and workflow rigidity—and explores recent innovations addressing these challenges. Three representative frameworks—R&D-Agent, SPIO, and Agent Laboratory—illustrate LLMs’ transition from assistive tools to autonomous agents capable of planning, executing, and optimizing entire data science workflows. These systems leverage dual-agent cooperation, modular architectures, and self-correcting capabilities to improve performance in end-to-end data analysis and scientific research. The paper concludes by outlining future priorities, including domain-specific customization, standardized agent evaluation, and improved interpretability, all of which are essential for the next generation of intelligent, autonomous data science systems.

Authors

Xinyou Yin

Journals

Advances in Engineering Technology Research
