What question did this study set out to answer?

This work aims to analyze the effectiveness of agent fine-tuning through knowledge distillation in specialized microdomains.

February 22, 2026

Agent Fine-tuning through Distillation for Domain-specific LLMs in Microdomains

Key Points

This work aims to analyze the effectiveness of agent fine-tuning through knowledge distillation in specialized microdomains.
Fine-tuned LLMs on JP1-specific training corpora.
Incorporated distilled reasoning trajectories such as ReAct and CoT.
Utilized retrieval-augmented generation with an agentic prompt.
Introduced a context-answer extractor for improved relevance.
Achieved improvements over the base model of 13% (Engineer), 12% (Professional), and 10% (Consultant) on JP1 certification examinations.
Delivered up to 9.78 times higher cost efficiency than general-purpose LLMs such as GPT-4.
Demonstrated the scalability and effectiveness of agent fine-tuning for technical microdomains.

Abstract

Agentic large language models (LLMs) have emerged as powerful tools for autonomously interacting with external environments and performing multi-step reasoning. While most existing approaches rely on in-context learning with multi-turn few-shot prompts, these methods often require long inputs and consequently incur high computational cost and latency. Agent fine-tuning offers a resource-aware alternative by enabling models to internalize procedural reasoning patterns and domain-specific knowledge through demonstrations and curated training data. However, its effectiveness in highly specialized technical microdomains remains underexplored. This work investigates agent fine-tuning with knowledge distillation for adapting LLMs to Hitachi's JP1 middleware, a complex microdomain centered on IT operations management. We fine-tune models using JP1-specific corpora extracted from manuals and textbooks, together with distilled reasoning trajectories (ReAct and CoT) generated by a larger LLM (GPT-4). At inference time, we incorporate retrieval-augmented generation with an agentic prompt and introduce a context-answer extractor to improve grounding and relevance. On JP1 certification examinations, our model, continually pre-trained on JP1 manuals and further fine-tuned with ReAct trajectories, achieves substantial improvements over the base model: 13% (Engineer), 12% (Professional), and 10% (Consultant). It also delivers up to 9.78 times higher cost efficiency than strong general-purpose LLMs such as GPT-4. These results demonstrate that agent fine-tuning combined with knowledge distillation is an effective and economically scalable strategy for building high-fidelity LLM agents tailored to specialized technical domains.
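The inference-time flow described in the abstract (retrieve context, wrap it in an agentic prompt, then extract the answer) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how retrieval, the prompt, or the context-answer extractor work, so every name below (`Passage`, `retrieve`, `build_agentic_prompt`, `extract_answer`) is hypothetical. Retrieval is assumed to be simple token overlap, the prompt is assumed to follow a ReAct-style Thought/Action/Observation layout, and the extractor is assumed to be a parser for a `Final Answer:` line.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Passage:
    """A retrievable chunk of domain text (hypothetical structure)."""
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    # Lowercase alphanumeric tokens; a stand-in for a real tokenizer.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    # Assumption: rank passages by token overlap with the query.
    # A real system would more likely use a dense or BM25 retriever.
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda p: len(q & tokenize(p.text)), reverse=True)
    return ranked[:k]


def build_agentic_prompt(question: str, passages: list[Passage]) -> str:
    # Assumption: a ReAct-style instruction that ends with a parseable answer line.
    context = "\n".join(f"[{p.doc_id}] {p.text}" for p in passages)
    return (
        "Use the context to answer the question. Reason in steps of the form\n"
        "Thought: ... / Action: ... / Observation: ... and finish with\n"
        "'Final Answer: <answer>'.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\n"
    )


def extract_answer(model_output: str) -> str | None:
    # Assumption: the extractor only pulls the text after 'Final Answer:'.
    match = re.search(r"Final Answer:\s*(.+)", model_output)
    return match.group(1).strip() if match else None
```

In use, the prompt returned by `build_agentic_prompt` would be sent to the fine-tuned model, and `extract_answer` would be applied to the generated text. Returning `None` when no answer line is found lets the caller decide whether to retry or fall back.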
