What question did this study set out to answer?

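The study asks whether load forecasting in power systems can be improved by overcoming the limitations of traditional statistical methods, which struggle to capture nonlinear, high-dimensional, and dynamically evolving load patterns.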

March 10, 2026 · Open Access

Research on Load Forecasting of Complex Power Systems Based on LLM of Dynamic Knowledge Distillation

Key Points

Aimed to improve power system load forecasting by addressing the limitations of traditional statistical methods.
Developed a forecasting framework incorporating a large language model (LLM) and dynamic knowledge distillation.
Implemented a cross-attention feature fusion module for integrating historical load data with contextual variables.
Fine-tuned a pretrained GPT-2 model to capture temporal dependencies and act as a teacher model for knowledge distillation.
Introduced a lightweight student transformer model, guided by the teacher during training through dynamic knowledge distillation, to reduce computational cost and improve deployability.
Achieved higher forecasting accuracy than representative state-of-the-art models.
The distilled student network showed significant reductions in computational load.
The framework is suitable for real-time applications, enhancing operational efficiency in power systems.
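
The cross-attention fusion step listed above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `cross_attention_fuse`, the residual connection, the random projection matrices, and all dimensions are assumptions chosen for illustration. Load features act as queries and contextual variables (for example weather or calendar features) act as keys and values.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(load_feats, context_feats, w_q, w_k, w_v):
    """Hypothetical fusion step (illustrative, not from the paper).

    load_feats:    (T, d)   embedded historical load, used as queries
    context_feats: (S, d_c) auxiliary contextual variables, used as keys/values
    """
    q = load_feats @ w_q                       # (T, d_k)
    k = context_feats @ w_k                    # (S, d_k)
    v = context_feats @ w_v                    # (S, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product, (T, S)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    attended = weights @ v                     # context summarized per load step
    # Residual connection (an assumption): keep the load signal, add context.
    return load_feats + attended, weights

# Toy dimensions, assumed for the example only.
rng = np.random.default_rng(0)
T, S, d, d_c, d_k = 24, 24, 8, 5, 8
load_feats = rng.normal(size=(T, d))
context_feats = rng.normal(size=(S, d_c))
w_q = rng.normal(size=(d, d_k)) * 0.1
w_k = rng.normal(size=(d_c, d_k)) * 0.1
w_v = rng.normal(size=(d_c, d)) * 0.1

fused, attn = cross_attention_fuse(load_feats, context_feats, w_q, w_k, w_v)
```

In this sketch the fused sequence keeps the shape of the load features, so it could be passed to a downstream sequence model without further reshaping.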

Abstract

Power system load forecasting is essential for modern power grids, as it directly influences operational efficiency, resource scheduling, and energy management. Traditional forecasting approaches, which rely on statistical analysis and handcrafted mathematical models, often struggle to capture the nonlinear, high-dimensional, and dynamically evolving patterns exhibited in real-world load data. To address these limitations, this study proposes a forecasting framework that incorporates a large language model (LLM) enhanced by a dynamic knowledge distillation mechanism. The framework first employs a cross-attention-based feature fusion module to integrate historical load data with auxiliary contextual variables. A pretrained GPT-2 model is then fine-tuned to extract temporal dependencies and serve as the teacher network. To reduce computational cost and improve deployability, a dynamic knowledge distillation strategy is introduced to guide a lightweight student transformer model during training. Experimental results demonstrate that the proposed method achieves superior forecasting accuracy compared with representative state-of-the-art models, while the distilled student network significantly reduces computational load, making the approach suitable for practical real-time applications.
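
The abstract does not specify how the distillation is made dynamic. As a rough illustration only, the sketch below assumes one common pattern: the student's loss blends a term that imitates the teacher's forecast with a term that fits the ground truth, and the blending weight changes over training. The linear schedule, the names `distillation_weight` and `student_loss`, and all numeric values are hypothetical and not taken from the paper.

```python
import numpy as np

def distillation_weight(step, total_steps, alpha_start=0.9, alpha_end=0.1):
    # Hypothetical linear schedule: rely on the teacher early in training,
    # then shift weight toward the ground-truth targets.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return alpha_start + (alpha_end - alpha_start) * frac

def student_loss(student_pred, teacher_pred, target, alpha):
    # Distillation term: match the teacher's forecast (regression, so MSE).
    distill = np.mean((student_pred - teacher_pred) ** 2)
    # Task term: match the observed load.
    task = np.mean((student_pred - target) ** 2)
    return alpha * distill + (1.0 - alpha) * task

# Toy values, assumed for the example only.
target = np.array([1.0, 2.0, 3.0])
teacher_pred = np.array([1.1, 2.1, 2.9])
student_pred = np.array([1.5, 2.5, 3.5])

early = student_loss(student_pred, teacher_pred, target,
                     distillation_weight(0, 100))
late = student_loss(student_pred, teacher_pred, target,
                    distillation_weight(100, 100))
```

Under this assumed schedule, the same student forecast is scored mostly against the teacher at the start of training and mostly against the observed load at the end.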
