What type of study is this?

This is a quantitative study.

October 13, 2025 · Open Access

Heavy-Tailed Linear Bandits: Huber Regression with One-Pass Update

Key Points

The proposed algorithm reduces per-round computational cost from O(t log T) to O(1), significantly improving efficiency.
Utilizing the online mirror descent framework, the method updates using only current data without storing historical data.
It achieves near-optimal and variance-aware regret, maintaining performance while decreasing resource use.
The adaptive Huber regression approach addresses limitations of previous methods by not relying on specific noise assumptions.
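
To make the idea of a one-pass update concrete, here is a minimal sketch of a Huber-style estimator that touches only the current observation each round. It is not the paper's online mirror descent update: it is a generic recursive least-squares step with a clipped residual, chosen only to illustrate how the per-round cost can be independent of the round index t (here O(d^2) in the dimension d). The class name `OnePassHuberEstimator` and the parameters `tau`, `reg`, and `sigma` are hypothetical.

```python
class OnePassHuberEstimator:
    """Illustrative one-pass robust estimator (not the paper's exact algorithm)."""

    def __init__(self, dim, tau=1.0, reg=1.0):
        self.dim = dim
        self.tau = tau                      # Huber threshold (assumed fixed here)
        self.theta = [0.0] * dim            # current parameter estimate
        # Inverse of the regularized matrix V = reg * I, kept explicitly.
        self.v_inv = [[(1.0 / reg if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]

    def update(self, x, reward, sigma=1.0):
        """Consume one (feature, reward) pair; no history is stored."""
        n = self.dim
        u = [xi / sigma for xi in x]        # feature scaled by the noise level
        # vu = V^{-1} u, using the inverse from the previous round.
        vu = [sum(self.v_inv[i][j] * u[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(u[i] * vu[i] for i in range(n))
        # Sherman-Morrison rank-one update of V^{-1} after V <- V + u u^T.
        for i in range(n):
            for j in range(n):
                self.v_inv[i][j] -= vu[i] * vu[j] / denom
        # Normalized residual at the previous estimate.
        z = (reward - sum(x[i] * self.theta[i] for i in range(n))) / sigma
        # Derivative of the Huber loss: identity inside [-tau, tau], clipped outside.
        psi = max(-self.tau, min(self.tau, z))
        # Newton-style step; V_new^{-1} u equals vu / denom by Sherman-Morrison.
        for i in range(n):
            self.theta[i] += vu[i] / denom * psi
        return self.theta


# Usage: noise-free rewards from a hypothetical true parameter [1, -2].
est = OnePassHuberEstimator(dim=2, tau=10.0)
for x in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]] * 100:
    est.update(x, reward=x[0] * 1.0 + x[1] * -2.0)
```

When no residual is clipped, this step coincides with recursive ridge regression, so the estimate approaches the true parameter on clean data. When a reward is an extreme outlier, the clipped residual caps how far a single round can move the estimate, which is the soft-truncation effect that Huber regression is meant to provide.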

Abstract

We study stochastic linear bandits with heavy-tailed noise. Two principled strategies for handling heavy-tailed noise, truncation and median-of-means, have been introduced to heavy-tailed bandits. Nonetheless, these methods rely on specific noise assumptions or bandit structures, limiting their applicability to general settings. The recent work of Huang et al. (2024) develops a soft truncation method via adaptive Huber regression to address these limitations. However, their method incurs undesirable computational costs: it requires storing all historical data and performing a full pass over these data at each round. In this paper, we propose a one-pass algorithm based on the online mirror descent framework. Our method updates using only current data at each round, reducing the per-round computational cost from $\mathcal{O}(t \log T)$ to $\mathcal{O}(1)$ with respect to the current round $t$ and the time horizon $T$, and achieves a near-optimal and variance-aware regret of order $\widetilde{\mathcal{O}}\big(d T^{\frac{1-\epsilon}{2(1+\epsilon)}} \sqrt{\sum_{t=1}^{T} \nu_t^2} + d T^{\frac{1-\epsilon}{2(1+\epsilon)}}\big)$, where $d$ is the dimension and $\nu_t^{1+\epsilon}$ is the $(1+\epsilon)$-th central moment of the reward at round $t$.
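
As a rough aid for reading the regret bound, the sketch below evaluates its order numerically. It assumes the bound has the form d * T^a * sqrt(sum of nu_t^2) + d * T^a with a = (1 - eps) / (2 * (1 + eps)), where eps in (0, 1] is the moment parameter and nu_t^(1 + eps) is the (1 + eps)-th central moment at round t. Constants and logarithmic factors are dropped, and the function names are hypothetical.

```python
import math


def regret_exponent(eps):
    """Exponent a = (1 - eps) / (2 * (1 + eps)) applied to the horizon T."""
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps is assumed to lie in (0, 1]")
    return (1.0 - eps) / (2.0 * (1.0 + eps))


def regret_order(d, nus, eps):
    """Order of the bound for dimension d and per-round moment scales nus.

    Constants and logarithmic factors are ignored; T is taken as len(nus).
    """
    horizon = len(nus)
    scale = d * horizon ** regret_exponent(eps)
    return scale * math.sqrt(sum(nu * nu for nu in nus)) + scale


# Finite-variance case (eps = 1): the exponent vanishes.
print(regret_exponent(1.0))
# d = 2, T = 100, nu_t = 1 for every round, eps = 1.
print(regret_order(2, [1.0] * 100, 1.0))
```

Under these assumptions, eps = 1 gives an exponent of zero, so the bound reduces to d * sqrt(sum of nu_t^2) + d. With a constant nu_t this grows like d * sqrt(T), and it shrinks when the per-round moments are small, which is the sense in which the bound is variance-aware.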
