To address the uncertainty of renewable generation and load demand in islanded microgrids, as well as the value-estimation bias inherent in existing Deep Reinforcement Learning (DRL) algorithms, this paper proposes an economic dispatch method based on Bayesian Optimization and Dynamic Weighted TD3 (BO-DW-TD3). First, a continuous-control model of a wind-solar-diesel-storage microgrid is constructed with the objective of minimizing operating cost. Second, a Softmax-based dynamic weighting mechanism is introduced into the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm to replace the conventional minimum operation over the twin critics' Q-values. This mechanism adaptively corrects Q-value bias and thereby overcomes policy conservatism. Third, a Bayesian Optimization framework is integrated to perform a global, adaptive search over key hyperparameters. Simulations on full-year data show that the proposed method achieves peak shaving and valley filling while displacing fuel consumption, reducing the total annual operating cost to 318,400 CNY. Compared with the classic DQN algorithm and the comparably improved PER-TD3 algorithm, the total operating cost is reduced by 52.7% and 33.0%, respectively, and the method shows superior robustness and policy stability.
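As an illustration of the weighting idea, the following minimal sketch replaces the `min(Q1, Q2)` of the standard TD3 target with a Softmax-weighted mix of the two critic estimates. The exact weighting formula is not given here, so everything specific in the sketch is an assumption: the function name `softmax_weighted_target`, the choice of weights proportional to `exp(-beta * Q_i)`, and the temperature parameter `beta` are hypothetical. The sketch keeps `beta` fixed, whereas a dynamic scheme would presumably adjust it during training.

```python
import math


def softmax_weighted_target(q1, q2, reward, gamma=0.99, beta=1.0, done=False):
    """Bootstrapped target using a softmax-weighted mix of two critic values.

    Assumed form (not taken from the paper): w_i is proportional to
    exp(-beta * q_i), so the smaller estimate receives the larger weight.
    beta = 0 gives the plain average; large beta approaches min(q1, q2),
    i.e. the standard TD3 target.
    """
    # Shift by the minimum so both exponents are <= 0 (numerically stable).
    q_min = min(q1, q2)
    e1 = math.exp(-beta * (q1 - q_min))
    e2 = math.exp(-beta * (q2 - q_min))
    w1 = e1 / (e1 + e2)
    w2 = 1.0 - w1

    # Weighted critic estimate replaces the hard minimum.
    q_mix = w1 * q1 + w2 * q2

    # No bootstrapping from terminal states.
    return reward + (0.0 if done else gamma * q_mix)


if __name__ == "__main__":
    # Hypothetical critic outputs for one transition.
    for beta in (0.0, 1.0, 50.0):
        print(beta, softmax_weighted_target(1.0, 3.0, reward=0.0, gamma=1.0, beta=beta))
```

For any `beta >= 0`, the mixed value in this sketch lies between the minimum and the mean of the two estimates, so a single parameter interpolates between the conservative TD3 target and a plain average.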
Cheng et al. (Thu,) studied this question.