What question did this study set out to answer?

This research aims to improve large language models for coal mining equipment operation and maintenance (O&M) using limited data.

May 6, 2026 (Open Access)

Vertical LLM for Coal Mining Equipment O&M Under Limited Fine-Tuning Data

Key Points

Developed a safety-guided evolutionary self-instruction method (SafeEvol-Instruct) for O&M data generation.
Proposed a hybrid fine-tuning strategy (SynergyLoRA) for specialized training of vertical-domain models.
Evaluated the constructed coal mining equipment O&M large language model (CMEOM-LLM) across various scenarios.
In system status assessment, CMEOM-LLM improved on the Qwen model by 4.9% in accuracy, 1.5% in recall, and 1.4% in F1-score.
In equipment fault diagnosis, CMEOM-LLM outperformed Qwen by 7.4% in accuracy, with BLEU-4 and ROUGE-L scores rising by 6.6% and 6.5%, respectively.
In maintenance plan formulation, CMEOM-LLM surpassed ChatGLM by 6.6% in ROUGE-L, 6.5% in BLEU-4, and 8.5% in human evaluation.
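
For readers unfamiliar with the text-overlap metrics quoted above, the following is a minimal sketch of how a ROUGE-L score can be computed from the longest common subsequence (LCS) of two token sequences. It is not the authors' evaluation code: the whitespace tokenization, the balanced F1 variant, and the function names `lcs_length` and `rouge_l_f1` are assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Balanced ROUGE-L F-score between two strings.

    Assumption: tokens are split on whitespace. Published scores may use a
    different tokenizer or a recall-weighted F-measure.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A generated maintenance step such as "replace the worn bearing" scored against the reference "replace the bearing" shares an LCS of three tokens, giving a precision of 0.75 and a recall of 1.0.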

Abstract

High-quality, specialized datasets for coal mining equipment operation and maintenance (O&M) are scarce, and large language models adapt poorly to domain-specific scenarios, so the reliability of these models in actual mining O&M cannot be guaranteed. To address this, this paper investigates the construction of vertical-domain large language models for coal mining equipment O&M scenarios under limited fine-tuning data. First, to tackle the lack of O&M scenario data, a safety-guided evolutionary self-instruction method (SafeEvol-Instruct) is developed by integrating Self-Instruction, Evol-Instruct, and Rule-Based Filtering. This approach unifies scalable generation, deep evolution, and safety filtering on limited O&M data, yielding scenario-specific datasets for system status assessment, equipment fault diagnosis, maintenance plan formulation, and preventive maintenance. Second, to account for the distinct characteristics of different O&M tasks, a hybrid fine-tuning strategy (SynergyLoRA) is proposed based on the Qwen2.5-7B-Instruct foundation model. This strategy incorporates middle-layer LoRA, top-layer LoRA, middle-layer IA3, Prompt Tuning, and Prefix Tuning to enable specialized training of vertical-domain models for each O&M scenario. Finally, the constructed Coal Mining Equipment O&M Large Language Model (CMEOM-LLM) is evaluated through ablation studies across the four scenarios, validating the effectiveness of the proposed methods. Experimental results show that, in the system status assessment scenario, CMEOM-LLM achieves improvements of 4.9%, 1.5%, and 1.4% over the Qwen model in accuracy, recall, and F1-score, respectively. In the equipment fault diagnosis scenario, CMEOM-LLM outperforms Qwen by 7.4% in accuracy, with BLEU-4 and ROUGE-L scores increasing by 6.6% and 6.5%, respectively. In the maintenance plan formulation scenario, CMEOM-LLM surpasses ChatGLM with improvements of 6.6%, 6.5%, and 8.5% in ROUGE-L, BLEU-4, and human evaluation, respectively. In the preventive maintenance scenario, CMEOM-LLM achieves improvements of 7.1% and 8.9% over Qwen in ROUGE-L and BLEU-4, along with a 0.69-point increase in human evaluation scores. This paper provides an effective approach for knowledge management in coal mining equipment O&M.
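
The three-stage data pipeline described in the abstract (generation from seed instructions, evolution into harder variants, then rule-based safety filtering) can be illustrated with a short sketch. This is not the SafeEvol-Instruct implementation, whose details the abstract does not give. The `generate` and `evolve` callables stand in for LLM calls, and the `UNSAFE_PATTERNS` rule set, the minimum-length check, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence

# Hypothetical rule set: phrases that a safe O&M instruction must not contain.
UNSAFE_PATTERNS = (
    "bypass the interlock",
    "disable the gas sensor",
    "without lockout",
)


def passes_safety_rules(text: str, min_words: int = 4) -> bool:
    """Rule-based filter: reject very short or unsafe instructions."""
    lowered = text.lower()
    if len(lowered.split()) < min_words:
        return False
    return not any(pattern in lowered for pattern in UNSAFE_PATTERNS)


def build_dataset(
    seeds: Sequence[str],
    generate: Callable[[str], List[str]],  # assumed LLM call: seed -> new instructions
    evolve: Callable[[str], str],          # assumed LLM call: instruction -> harder variant
    evolve_rounds: int = 1,
) -> List[str]:
    """Generate, evolve, then filter and de-duplicate instructions."""
    # Stage 1: scalable generation from the limited seed set.
    candidates = list(seeds)
    for seed in seeds:
        candidates.extend(generate(seed))

    # Stage 2: deep evolution; each round adds one evolved variant per candidate.
    for _ in range(evolve_rounds):
        candidates += [evolve(c) for c in candidates]

    # Stage 3: safety filtering plus case- and whitespace-insensitive de-duplication.
    kept, seen = [], set()
    for candidate in candidates:
        key = " ".join(candidate.lower().split())
        if key not in seen and passes_safety_rules(candidate):
            seen.add(key)
            kept.append(candidate)
    return kept
```

Filtering after evolution, rather than before, means an unsafe instruction introduced at either earlier stage is still caught. A production pipeline would likely filter at each stage to avoid spending model calls on candidates that will be discarded.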

Authors

Ruiyuan Zhang

Xiangang Cao

Hongwei Ma

Journals

Applied Sciences

Institutions

Chinese Academy of Sciences

Xi'an Institute of Optics and Precision Mechanics

Xi'an University of Science and Technology
