What does this research mean for the field?

A two-stage game-theoretic framework that integrates distributionally robust optimization and deep reinforcement learning increases virtual power plant average daily net profit by 18.7% over single-stage optimization and by 12.3% over deterministic game models, and improves the 95% Value-at-Risk by 22.5%. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

This research aims to develop a decision framework that enhances the trading effectiveness of virtual power plants under uncertainty.

May 29, 2026 | Open Access

Research on the demand response trading strategy of virtual power plants under the two-stage game model

Key Points

This research aims to develop a decision framework that enhances the trading effectiveness of virtual power plants under uncertainty.
Developed a two-stage game-theoretic decision framework addressing internal coordination and external trading.
Utilized distributionally robust optimization for internal cost function construction.
Employed deep reinforcement learning to adaptively model external market interactions.
Achieved an 18.7% increase in average daily net profit compared to single-stage optimization.
Observed a 12.3% profit increase compared to deterministic game models.
Improved the 95% Value-at-Risk by 22.5%, enhancing risk management.
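
The 95% Value-at-Risk in the last key point can be made concrete with a small sketch. It assumes an empirical, order-statistic definition of VaR on daily losses (negated net profits) and a simple relative-reduction measure of improvement. The source does not state which VaR estimator or improvement formula the authors used, so the function names, the quantile convention, and the sample numbers below are all hypothetical.

```python
import math


def value_at_risk(profits, level=0.95):
    """Empirical VaR: the loss not exceeded with probability `level`.

    Assumption: losses are negated daily net profits, and the quantile is
    the ceil(level * n)-th smallest loss (one of several common conventions).
    """
    if not profits:
        raise ValueError("profits must be non-empty")
    losses = sorted(-p for p in profits)
    k = math.ceil(level * len(losses)) - 1  # zero-based index of the quantile
    return losses[k]


def relative_improvement(baseline_var, proposed_var):
    """Fractional reduction in VaR relative to a baseline (assumed metric)."""
    return (baseline_var - proposed_var) / abs(baseline_var)


# Hypothetical 28-day profit series, one value per test day (not paper data).
daily_profits = list(range(-3, 25))
var_95 = value_at_risk(daily_profits)

# Hypothetical VaR values for a baseline and a proposed strategy.
improvement = relative_improvement(baseline_var=2.0, proposed_var=1.5)
```

With only 28 daily observations the 95% quantile rests on the two worst days, so an estimate of this kind is sensitive to the quantile convention chosen.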

Abstract

Virtual power plants (VPPs) face two challenges: coordinating internal distributed resources and trading strategically in external markets under uncertainty. This study proposes a two-stage game-theoretic decision framework to address both. In the first stage, distributionally robust optimization aggregates photovoltaic systems, energy storage, and flexible loads to construct an internal cost function that incorporates operational risks. In the second stage, this cost function is embedded within a Stackelberg game that models the VPP’s strategic interaction with the external market, and a deep reinforcement learning algorithm adaptively approximates the equilibrium bidding strategy. Simulations based on real market data from a Chinese provincial spot market and an actual VPP demonstration project show the effectiveness of the proposed approach. Over a 28-day test period spanning diverse seasonal conditions, the method increases average daily net profit by 18.7% compared to single-stage optimization and by 12.3% compared to deterministic game models, while improving the 95% Value-at-Risk by 22.5%. The findings indicate that combining distributionally robust optimization for risk-aware internal aggregation with deep reinforcement learning for adaptive external gaming helps VPPs pursue risk-aware profit maximization in complex market environments.
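
The two-stage structure described in the abstract can be illustrated with a toy sketch. This is not the authors' model. It makes several assumptions: the ambiguity set is a short list of candidate probability vectors over three photovoltaic scenarios, the VPP acts as the Stackelberg leader, the market's best response is a linear price curve, and a plain grid search stands in for the deep reinforcement learning step. All function names, coefficients, and data are hypothetical.

```python
def worst_case_cost(q, scenarios, ambiguity_set):
    """Stage 1 stand-in: worst expected cost over candidate distributions.

    Assumed cost model: a linear operating cost plus a quadratic penalty on
    the part of the bid `q` that PV output in a scenario does not cover.
    """
    def scenario_cost(pv):
        shortfall = max(q - pv, 0.0)
        return 20.0 * q + 5.0 * shortfall ** 2

    costs = [scenario_cost(pv) for pv in scenarios]
    return max(
        sum(p * c for p, c in zip(probs, costs)) for probs in ambiguity_set
    )


def follower_price(q, base_price=100.0, slope=4.0):
    """Assumed follower best response: price falls linearly with the bid."""
    return max(base_price - slope * q, 0.0)


def leader_best_bid(scenarios, ambiguity_set, grid):
    """Stage 2 stand-in: grid search over leader bids (the paper uses DRL)."""
    def profit(q):
        revenue = follower_price(q) * q
        return revenue - worst_case_cost(q, scenarios, ambiguity_set)

    return max(grid, key=profit)


# Hypothetical inputs, not taken from the paper.
scenarios = [2.0, 4.0, 6.0]  # PV output per scenario (MW)
ambiguity_set = [
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
]
grid = [i * 0.5 for i in range(21)]  # candidate bids from 0 to 10 MW
best_q = leader_best_bid(scenarios, ambiguity_set, grid)
```

The sketch shows why the stages are coupled: the leader's bid is evaluated against the worst-case internal cost, so a more pessimistic ambiguity set lowers the bid that maximizes profit.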

Authors

Qi Yu

Lei Nie

Jinxuan Li

Institutions

China Mobile (China)

China Southern Power Grid (China)

Guangzhou Automobile Group (China)
