August 26, 2024 · Open Access

ReLExS: Reinforcement Learning Explanations for Stackelberg No-Regret Learners

Abstract

If the follower is constrained to be a no-regret learner, will the players in a two-player Stackelberg game still reach the Stackelberg equilibrium? We first show that when the follower's strategy is either reward-average or transform-reward-average, the two players can always reach the Stackelberg equilibrium. We then extend this result to show that the players can achieve the Stackelberg equilibrium in the two-player game under the no-regret constraint. We also give a strict upper bound on the difference in the follower's utility with and without the no-regret constraint. Moreover, in constant-sum two-player Stackelberg games with no-regret action sequences, we show that the total optimal utility of the game also remains bounded.
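To make the setting concrete, here is a minimal Python sketch of the two ingredients the abstract combines: a pure-strategy Stackelberg equilibrium and a no-regret follower. It is not the paper's method. The 2x2 payoff matrices are hypothetical, the follower runs Hedge (a standard no-regret algorithm used here as a stand-in), and the reward-average and transform-reward-average strategies named in the abstract are not reproduced. Ties in the follower's best response are broken by lowest index, which is a simplifying assumption.

```python
import math

# Hypothetical payoffs (illustrative only, not taken from the paper).
# Rows index the leader's action, columns the follower's action.
LEADER = [[2.0, 4.0], [1.0, 3.0]]
FOLLOWER = [[1.0, 0.0], [0.0, 2.0]]


def best_response(follower_payoffs, leader_action):
    """Follower's payoff-maximizing action given the leader's committed action."""
    row = follower_payoffs[leader_action]
    return max(range(len(row)), key=lambda b: row[b])


def pure_stackelberg(leader_payoffs, follower_payoffs):
    """Leader commits to the action that is best once the follower best-responds."""
    best = None
    for a in range(len(leader_payoffs)):
        b = best_response(follower_payoffs, a)
        if best is None or leader_payoffs[a][b] > leader_payoffs[best[0]][best[1]]:
            best = (a, b)
    return best


def hedge_follower(follower_payoffs, leader_action, rounds, eta):
    """Run Hedge for the follower against a fixed leader action.

    Returns the follower's average regret and its final mixed strategy.
    """
    row = follower_payoffs[leader_action]
    cumulative = [0.0] * len(row)  # cumulative payoff of each follower action
    expected_total = 0.0           # expected payoff actually collected by Hedge
    probs = [1.0 / len(row)] * len(row)
    for _ in range(rounds):
        top = max(cumulative)  # subtract the max for numerical stability
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        expected_total += sum(p * r for p, r in zip(probs, row))
        cumulative = [c + r for c, r in zip(cumulative, row)]
    average_regret = (max(cumulative) - expected_total) / rounds
    return average_regret, probs


leader_action, follower_action = pure_stackelberg(LEADER, FOLLOWER)
average_regret, final_strategy = hedge_follower(FOLLOWER, leader_action, 1000, 0.5)
print(leader_action, follower_action)
print(average_regret, final_strategy)
```

In this toy game the leader's committed action is one it would not choose without commitment power, and the Hedge follower's play concentrates on the best response to that commitment as its average regret shrinks. This only illustrates the question the abstract poses; it says nothing about the paper's bounds.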

Cite This Study

Huang et al. (2024) studied this question.

synapsesocial.com/papers/68e5b010b6db6435875491d1 https://doi.org/10.48550/arxiv.2408.14086