High renewable penetration in microgrids makes low-carbon economic dispatch under uncertainty challenging, and single-agent deep reinforcement learning (DRL) often yields unstable cost–emission trade-offs. This study proposes a dual-agent DRL framework that explicitly balances operational economy and environmental sustainability. A Proximal Policy Optimization (PPO) agent focuses on minimizing operating cost, while a Soft Actor–Critic (SAC) agent targets carbon emission reduction; their actions are combined through an adaptive weighting strategy. The framework is supported by carbon emission flow (CEF) theory, which enables network-level tracing of carbon flows, and a stepped carbon pricing mechanism that internalizes dynamic carbon costs. Demand response (DR) is incorporated to enhance operational flexibility. The dispatch problem is formulated as a Markov Decision Process, allowing the dual-agent system to learn policies through interaction with the environment. Case studies on a modified PJM 5-bus test system show that, compared with a Deep Deterministic Policy Gradient (DDPG) baseline, the proposed method reduces total operating cost, carbon emissions, and wind curtailment by 16.8%, 11.3%, and 15.2%, respectively. These results demonstrate that the proposed framework is an effective solution for economical and low-carbon operation in renewable-rich power systems.
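As an illustration of two mechanisms named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that a stepped carbon price charges emissions above a free quota in fixed-width tiers with a linearly growing price, and that the PPO and SAC actions are blended by a convex weight `w`. The abstract does not specify the tier structure or how the weight is adapted, so every function name and parameter here (`stepped_carbon_cost`, `blend_actions`, `quota`, `step`, `base_price`, `growth`, `w`) is hypothetical.

```python
def stepped_carbon_cost(emissions, quota, step, base_price, growth):
    """Cost of emissions above a free quota under an assumed stepped tariff.

    Each successive block of `step` units of excess emissions is priced at
    base_price * (1 + growth * tier), where tier = 0, 1, 2, ...
    All parameters are illustrative assumptions, not values from the paper.
    """
    excess = emissions - quota
    cost = 0.0
    tier = 0
    while excess > 0:
        block = min(excess, step)          # portion of excess falling in this tier
        cost += block * base_price * (1 + growth * tier)
        excess -= block
        tier += 1
    return cost


def blend_actions(a_cost, a_carbon, w):
    """Convex combination of the cost-oriented and carbon-oriented actions.

    `w` is the weight on the cost-oriented (PPO) action; how `w` is adapted
    over time is not described in the abstract and is left to the caller.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return [w * x + (1.0 - w) * y for x, y in zip(a_cost, a_carbon)]


if __name__ == "__main__":
    # 150 units above quota: 100 in tier 0 at 10, 50 in tier 1 at 15.
    print(stepped_carbon_cost(250, quota=100, step=100, base_price=10, growth=0.5))
    print(blend_actions([1.0, 0.0], [0.0, 1.0], w=0.25))
```

Under these assumptions, a larger `w` pushes the dispatch toward the cost-minimizing action and a smaller `w` toward the emission-reducing one, while the stepped tariff makes each additional block of excess emissions more expensive than the last.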
Wenjun Qiu
Nanjing University of Information Science and Technology
Hebin Ruan
Xiaoxiao Yu
Australian Regenerative Medicine Institute
Energies
Nanjing University of Science and Technology
Nanjing University of Information Science and Technology
Global Energy Interconnection Research Institute North America
Qiu et al. studied this question.
synapsesocial.com/papers/6975b36bfeba4585c2d6ee70 — DOI: https://doi.org/10.3390/en19020551