March 8, 2024 | Open Access

Optimal and instance-dependent guarantees for Markovian linear stochastic approximation

Abstract

We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed-point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order $t_{\mathrm{mix}} \tfrac{d}{n}$ on the squared error of the last iterate of a standard scheme, where $t_{\mathrm{mix}}$ is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $(d, t_{\mathrm{mix}})$ in the higher-order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise, covering the TD($\lambda$) family of algorithms for all $\lambda \in [0, 1)$, and for linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of $\lambda$ when running the TD($\lambda$) algorithm).
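To make the setting concrete, the following is a minimal sketch of the kind of procedure the abstract refers to: tabular TD(0) (the $\lambda = 0$ member of the TD($\lambda$) family) run along a single Markov trajectory, returning both the last iterate and an average of the iterates after a burn-in period. The 3-state chain, rewards, discount factor, constant step size, and burn-in length are all hypothetical choices made for illustration; this is not the paper's exact scheme or tuning.

```python
import numpy as np


def td0_last_and_averaged(P, r, gamma, n, step, burn_in, seed=0):
    """Tabular TD(0) along one trajectory of the chain with transition matrix P.

    Returns the last iterate and the running average of the iterates
    produced after the first `burn_in` steps.
    """
    rng = np.random.default_rng(seed)
    d = P.shape[0]
    cum = np.cumsum(P, axis=1)   # row-wise CDFs used to sample transitions
    u = rng.random(n)            # one uniform draw per transition
    theta = np.zeros(d)
    avg = np.zeros(d)
    count = 0
    s = 0
    for t in range(n):
        # Sample s_next ~ P[s, :] by inverting the row CDF.
        s_next = min(int(np.searchsorted(cum[s], u[t], side="right")), d - 1)
        # Stochastic approximation step for the fixed point theta = r + gamma * P @ theta.
        td_error = r[s] + gamma * theta[s_next] - theta[s]
        theta[s] += step * td_error
        if t >= burn_in:
            count += 1
            avg += (theta - avg) / count   # running mean of post-burn-in iterates
        s = s_next
    return theta, avg


# Hypothetical 3-state ergodic chain with deterministic rewards (illustrative only).
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])
r = np.array([1.0, 0.0, -1.0])
gamma = 0.9

# Exact solution of the linear fixed-point equation, for comparison.
theta_star = np.linalg.solve(np.eye(3) - gamma * P, r)

n = 200_000
last, avg = td0_last_and_averaged(P, r, gamma, n=n, step=0.02, burn_in=n // 2)

print("last-iterate error:", np.linalg.norm(last - theta_star))
print("averaged error:    ", np.linalg.norm(avg - theta_star))
```

In this sketch the only randomness is the Markov trajectory itself, so the noise in each update is correlated across time through the chain. That correlation is the source of the mixing-time dependence that the bounds in the abstract quantify.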
