In this work, we study stochastic non-cooperative games in which only noisy black-box function evaluations are available to estimate each player's cost function. Because each player's cost function depends on both its own decision variables and those of its rivals, most existing work on seeking the Nash equilibrium (NE) requires players to exchange local information through a central coordinator or a network. We propose a new stochastic distributed learning algorithm that does not require communication among players. The proposed algorithm uses a simultaneous perturbation method to estimate the gradient of each cost function and a mirror descent method to search for the NE. We provide an asymptotic analysis of the bias and variance of the gradient estimates, and show that the proposed algorithm converges to the NE in mean square at the optimal rate for the class of strictly monotone games. The effectiveness of the proposed method is demonstrated in a numerical experiment.
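The two ingredients named above, a simultaneous perturbation gradient estimate and a mirror descent update, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-player quadratic game, the two-point form of the estimator, the Euclidean mirror map (which reduces mirror descent to a projected gradient step), the step-size and perturbation schedules, and the names `cost`, `clip`, `run`, and `NASH`. Each player uses only noisy evaluations of its own cost, and no information is exchanged.

```python
import random

# Hypothetical strongly monotone two-player game (illustration only).
# Its unique Nash equilibrium solves 2*x1 + 0.5*x2 = 2 and -0.5*x1 + 2*x2 = -2.
NASH = (20.0 / 17.0, -12.0 / 17.0)


def cost(i, x, rng, noise_std=0.1):
    """Noisy black-box cost of player i (0 or 1) at the joint action x."""
    x1, x2 = x
    if i == 0:
        value = (x1 - 1.0) ** 2 + 0.5 * x1 * x2
    else:
        value = (x2 + 1.0) ** 2 - 0.5 * x1 * x2
    return value + rng.gauss(0.0, noise_std)


def clip(v, lo=-2.0, hi=2.0):
    """Euclidean projection onto the assumed action interval [lo, hi]."""
    return max(lo, min(hi, v))


def run(num_iters=5000, seed=0):
    rng = random.Random(seed)
    x = [0.0, 0.0]  # joint action, one scalar per player
    for t in range(1, num_iters + 1):
        step = 1.0 / (t + 5)       # assumed decaying step size
        delta = (t + 5) ** -0.25   # assumed decaying perturbation radius
        # Every player draws its own random sign independently.
        u = [rng.choice((-1.0, 1.0)) for _ in x]
        # Queries may fall slightly outside [-2, 2]; the costs are defined there.
        x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
        x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
        grad = []
        for i in range(2):
            # Player i sees only its own two noisy cost values.
            diff = cost(i, x_plus, rng) - cost(i, x_minus, rng)
            grad.append(diff / (2.0 * delta) * u[i])
        # Mirror descent with the Euclidean mirror map: projected gradient step.
        x = [clip(xi - step * gi) for xi, gi in zip(x, grad)]
    return x
```

Under these assumptions, calling `run()` should return a joint action close to `NASH`. A one-point estimator or a non-Euclidean mirror map would follow the same loop structure with a different estimate or update line.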
He et al. (Sun,) studied this question.