November 30, 2025 | Open Access

Tool learning with language models: a comprehensive survey of methods, pipelines, and benchmarks

Key Points

Tool learning strengthens the reasoning and decision-making of language models by letting them call external tools such as APIs, search engines, and calculators.
Reinforcement learning is one of the main methodologies covered, alongside tuning-free and supervised fine-tuning methods.
The unified framework has four stages (task planning, tool selection, task execution, and response generation), and the survey discusses how each learning approach contributes to them.
Open challenges remain in safety, efficiency, and generalization, and the survey outlines directions for future research on tool-augmented AI systems.

Abstract

Tool learning has emerged as a key capability for enhancing the reasoning and decision-making abilities of large language models (LLMs) by enabling them to interface with external tools such as application programming interfaces (APIs), search engines, and calculators. This survey provides a systematic overview of the tool learning paradigm, focusing on how LLMs can decompose complex tasks, select appropriate tools, invoke them correctly, and generate coherent responses. We summarize a unified four-stage framework comprising task planning, tool selection, task execution, and response generation, which captures the core processes underlying tool-augmented language modeling. We further analyze major learning methodologies, including tuning-free methods, supervised fine-tuning methods, and reinforcement learning, and discuss how they contribute to different stages of the tool-use pipeline. The survey also reviews recent benchmarks designed to evaluate tool-use competence across both general and domain-specific scenarios. Finally, we highlight open challenges in safety, efficiency, and generalization, and outline promising directions for future research. This work aims to serve as a conceptual roadmap and practical reference for researchers and developers working on tool-augmented artificial intelligence (AI) systems.
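
As a rough illustration of the four-stage framework named in the abstract, the sketch below wires task planning, tool selection, task execution, and response generation into one toy pipeline. It is a minimal sketch under stated assumptions, not the survey's method: the function names (plan, select_tool, execute, respond, run_pipeline), the two toy tools, and the rule-based logic are all hypothetical stand-ins for what would be LLM calls and real APIs in an actual system.

```python
import ast
import operator
from typing import Callable, Dict, List

# Hypothetical toy tools; a real system would call external APIs or a search engine.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expression: str) -> str:
    """Evaluate a simple arithmetic expression by walking its syntax tree."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")
    return str(_eval(ast.parse(expression, mode="eval").body))

def lookup(term: str) -> str:
    """Stand-in for a search engine: a fixed in-memory dictionary (assumption)."""
    facts = {"llm": "large language model"}
    return facts.get(term.strip().lower(), "no result")

TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator, "lookup": lookup}

def plan(query: str) -> List[str]:
    """Stage 1, task planning: split the query into subtasks on ';'."""
    return [part.strip() for part in query.split(";") if part.strip()]

def select_tool(subtask: str) -> str:
    """Stage 2, tool selection: a keyword rule in place of a learned selector."""
    return "calculator" if any(ch.isdigit() for ch in subtask) else "lookup"

def execute(tool_name: str, subtask: str) -> str:
    """Stage 3, task execution: invoke the chosen tool on the subtask."""
    return TOOLS[tool_name](subtask)

def respond(subtasks: List[str], results: List[str]) -> str:
    """Stage 4, response generation: merge tool outputs into one answer."""
    return "; ".join(f"{s} -> {r}" for s, r in zip(subtasks, results))

def run_pipeline(query: str) -> str:
    """Run the four stages in order on a single query."""
    subtasks = plan(query)
    results = [execute(select_tool(s), s) for s in subtasks]
    return respond(subtasks, results)

if __name__ == "__main__":
    print(run_pipeline("2 + 3 * 4; LLM"))
```

In this sketch each stage is a separate function so that any one of them could be swapped for a learned component, for example a prompted model for planning or a fine-tuned selector, without touching the others.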
