• An improved Bi-LSTM model with a multi-head attention mechanism is proposed for STLF.
• To enhance training efficiency and accuracy, Bayesian optimization is applied to the hyperparameter space.
• With the proposed STLF method, prediction error decreases by almost 40% compared with three benchmark models.

Short-term load forecasting (STLF) underpins the operation and scheduling of the new “source-grid-load-storage” architecture of the power system, which places increasing demand on accurate predictions of fluctuating loads. To achieve precise STLF, this study proposes a Bayesian-optimized bi-directional long short-term memory (Bi-LSTM) network with a multi-head attention mechanism (BO-Bi-LSTM-MHA). First, multidimensional input data and meteorological factors are preprocessed and fed into the Bi-LSTM layer for bidirectional recurrent training. The MHA layer then computes attention across the forward and backward hidden states in multiple subspaces, allowing the model to identify the time steps and feature combinations most critical to the prediction and to weight them accordingly. Leveraging the ability of Bayesian optimization to explore and refine candidate solutions, the key hyperparameters of the Bi-LSTM-MHA hybrid architecture are tuned jointly within a single Bayesian optimization framework. Joint tuning keeps the two components matched to each other, which improves both training speed and accuracy. The forecasting performance of the proposed model is evaluated against three benchmark models over four quarters. Results show that it outperforms the benchmarks in training efficiency and prediction accuracy and better captures complex, weather-influenced load sequences. Prediction error is reduced by approximately 40%, while the R² value remains high at 0.96, demonstrating the advantage of the model over the benchmark approaches.
Jiang et al. studied this question.
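
The two headline figures in the abstract, an error reduction of roughly 40% and an R² of 0.96, can be made concrete with a small calculation. The sketch below is illustrative only: the abstract does not say which error metric is used, so mean absolute percentage error (MAPE) is assumed here, and the function names and load values are hypothetical rather than taken from the paper.

```python
from statistics import fmean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 * fmean(abs((a - p) / a) for a, p in zip(actual, predicted))


def relative_error_reduction(baseline_error, proposed_error):
    """Fraction by which the proposed model's error falls below the baseline's."""
    return (baseline_error - proposed_error) / baseline_error


# Made-up hourly loads in MW, for illustration only (not data from the paper).
actual = [100.0, 120.0, 150.0, 130.0]
baseline = [110.0, 108.0, 165.0, 117.0]   # hypothetical benchmark forecast
proposed = [106.0, 112.8, 159.0, 122.2]   # hypothetical improved forecast

reduction = relative_error_reduction(mape(actual, baseline), mape(actual, proposed))
print(f"MAPE baseline: {mape(actual, baseline):.2f}%")
print(f"MAPE proposed: {mape(actual, proposed):.2f}%")
print(f"Relative error reduction: {reduction:.0%}")
print(f"R² proposed: {r_squared(actual, proposed):.3f}")
```

The toy forecasts are constructed so that every baseline point is off by 10% and every proposed point by 6%, which gives a relative reduction of about 40%. A different error metric, such as RMSE or MAE, would plug into `relative_error_reduction` in the same way.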