September 10, 2025

Decentralized Learning in Stochastic Games with Local Information

Key Points

Learning dynamics in games lead to equilibrium convergence under specific conditions.
The study builds on classical equilibrium results, including Kakutani’s fixed-point theorem.
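A newer policy revision approach, satisficing, induces a strictly richer graph structure on the set of policies and applies to a broader class of games.
The study addresses an open question on necessary and sufficient conditions for convergence from any starting policy profile, giving a generalized sufficient condition and a counterexample that leads to a necessity condition.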

Abstract

In the context of multi-agent systems with decentralized information structures, we study rigorously justified convergence results for learning algorithms and the conditions under which those algorithms converge to equilibria. With this objective in mind, we first review classical equilibrium results, focusing on finite-player games with pure or mixed strategy sets. Results such as Kakutani’s fixed-point theorem and Sion’s minimax theorem establish existence of equilibria under relatively broad conditions. Building on this background, we then study learning dynamics, including best-response and better-response processes, in which players periodically revise their strategies via a policy revision process to optimize payoffs relative to their previous actions. This process induces a graph on the set of policies, which facilitates our mathematical approach, combining graph theory, game theory, stochastic control, and Markov processes. While learning under best/better response dynamics converges under certain conditions reported in Arslan et al., a new approach to policy revision, termed satisficing (which may be viewed as a win-stay, lose-shift algorithm) and introduced by Yongacoglu et al., provides a strictly richer graph structure and is applicable to a much broader class of games, which in particular generalizes weakly acyclic games. The question we studied is how to characterize precisely the set of games for which such a satisficing process ensures convergence to equilibrium. In particular, we addressed an open question raised by Yongacoglu et al. on necessary and sufficient conditions for convergence to equilibria from any initial policy profile. On sufficiency, we presented a generalization, relaxing requirements to allow multiple pure Nash equilibria, provided at least one is strict and subgame-unique. On necessity, we presented a nontrivial example of a game that admits a strict pure Nash equilibrium in each induced subgame yet fails to converge via satisficing paths, showing that such conditions are insufficient and thus also leading to a necessity condition.
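To make the win-stay, lose-shift idea concrete, the following is a minimal sketch of a satisficing revision process in a small two-player game with pure actions. It is an illustration under simplifying assumptions, not the algorithm of Yongacoglu et al.: the game is a one-shot normal-form game rather than a stochastic game, a player who is not best responding shifts to a uniformly random action, and the names `best_responding`, `satisficing_step`, `satisficing_path`, and the `coordination` payoff table are all hypothetical.

```python
import random


def best_responding(payoff, n_actions, profile, i):
    """Return True if player i's action is a best response to the others.

    payoff maps a joint action tuple to a tuple of per-player payoffs.
    """
    def value(action):
        deviated = profile[:i] + (action,) + profile[i + 1:]
        return payoff[deviated][i]

    return value(profile[i]) >= max(value(a) for a in range(n_actions[i]))


def satisficing_step(payoff, n_actions, profile, rng):
    """One revision: satisfied players stay, unsatisfied players shift."""
    next_profile = []
    for i, action in enumerate(profile):
        if best_responding(payoff, n_actions, profile, i):
            next_profile.append(action)  # win-stay
        else:
            # lose-shift: uniform random choice is an assumption of this sketch
            next_profile.append(rng.randrange(n_actions[i]))
    return tuple(next_profile)


def satisficing_path(payoff, n_actions, start, max_steps=1000, seed=0):
    """Follow revisions until every player is best responding or steps run out."""
    rng = random.Random(seed)
    profile = tuple(start)
    path = [profile]
    for _ in range(max_steps):
        if all(best_responding(payoff, n_actions, profile, i)
               for i in range(len(profile))):
            break  # pure Nash equilibrium: no player revises any further
        profile = satisficing_step(payoff, n_actions, profile, rng)
        path.append(profile)
    return path


# Hypothetical 2x2 coordination game with two strict pure Nash equilibria.
coordination = {
    (0, 0): (2, 2),
    (0, 1): (0, 0),
    (1, 0): (0, 0),
    (1, 1): (1, 1),
}

path = satisficing_path(coordination, [2, 2], start=(0, 1))
print(path)
```

Each profile is a node of the graph on policies described above, and each possible revision is an edge. In this toy game every satisficing path ends at one of the two pure equilibria, `(0, 0)` or `(1, 1)`, and a path that starts at an equilibrium never leaves it.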
