Agentic large language models (LLMs) have emerged as powerful tools for autonomously interacting with external environments and performing multi-step reasoning. Most existing approaches rely on in-context learning with multi-turn few-shot prompts, which require long inputs and therefore incur high computational cost and latency. Agent fine-tuning offers a resource-aware alternative: models internalize procedural reasoning patterns and domain-specific knowledge from demonstrations and curated training data. However, its effectiveness in highly specialized technical microdomains remains underexplored. This work investigates agent fine-tuning with knowledge distillation for adapting LLMs to Hitachi's JP1 middleware, a complex microdomain centered on IT operations management. We fine-tune models on JP1-specific corpora extracted from manuals and textbooks, together with distilled reasoning trajectories (ReAct and CoT) generated by larger LLMs (GPT-4). At inference time, we incorporate retrieval-augmented generation with an agentic prompt and introduce a context-answer extractor to improve grounding and relevance. On JP1 certification examinations, our model that was continually pre-trained on JP1 manuals and further fine-tuned with ReAct trajectories achieves substantial improvements over the base model, namely 13% (Engineer), 12% (Professional), and 10% (Consultant), while delivering up to 9.78 times higher cost efficiency than strong general-purpose LLMs such as GPT-4. These results indicate that agent fine-tuning combined with knowledge distillation is an effective and economically scalable strategy for building high-fidelity LLM agents tailored to specialized technical domains.
Xue et al. (Fri,) studied this question.