First-price auctions have become common in online display advertising, but they create a practical problem: advertisers pay what they bid while often seeing only whether they won or lost. This opacity can lead to overpayment or missed impressions, especially when market prices change throughout the day. We develop two Bayesian multiarmed bandit methods that help advertisers learn from limited auction feedback and adjust bids dynamically. The methods use auction structure—if one bid wins, higher bids would also have won—to infer market prices more efficiently and adapt to nonstationary bidding environments. Evidence from simulations, offline market logs, online replay, and large-scale A/B tests on a major Chinese advertising platform shows that these methods reduce advertising costs while preserving winning rates. For practitioners, the approach offers an implementable way to automate bid shading, improve return on investment, and decide when paid market price signals are worth acquiring. For platforms and policymakers, the findings highlight how feedback design and price transparency affect advertiser efficiency in first-price auction markets.
Guo et al. studied this question.
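The structural property noted in the abstract (if one bid wins, every higher bid would also have won) can be illustrated with a small sketch. This is not the authors' algorithm: it is a generic Thompson sampling bandit over a hypothetical grid of bid levels, with Beta posteriors on each level's win probability, and it assumes the symmetric implication that a losing bid means every lower bid would also have lost. The names `choose_bid`, `propagate_feedback`, and `simulate`, the bid grid, the impression value, and the uniform market price are all illustrative assumptions, and the sketch does not handle nonstationary prices.

```python
import random


def choose_bid(bids, value, alpha, beta, rng):
    """Thompson sampling step: sample a win probability for each bid level
    and return the index with the highest sampled expected surplus."""
    best_idx, best_score = 0, float("-inf")
    for i, bid in enumerate(bids):
        p_win = rng.betavariate(alpha[i], beta[i])
        score = p_win * (value - bid)  # surplus if we win at this bid
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx


def propagate_feedback(alpha, beta, idx, won):
    """Update Beta posteriors in place, assuming bids are sorted ascending.

    A win at level idx implies a win at every higher level; a loss at
    level idx implies a loss at every lower level (assumed here).
    """
    if won:
        for j in range(idx, len(alpha)):
            alpha[j] += 1.0
    else:
        for j in range(idx + 1):
            beta[j] += 1.0


def simulate(rounds=2000, seed=0):
    """Toy environment: hypothetical bid grid and a hidden market price
    drawn uniformly each round (purely illustrative, stationary)."""
    rng = random.Random(seed)
    bids = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical bid levels
    value = 3.5                             # hypothetical impression value
    alpha = [1.0] * len(bids)               # Beta(1, 1) priors
    beta = [1.0] * len(bids)
    wins, cost = 0, 0.0
    for _ in range(rounds):
        market_price = rng.uniform(0.8, 1.6)  # never shown to the bidder
        i = choose_bid(bids, value, alpha, beta, rng)
        won = bids[i] >= market_price          # only win/loss is observed
        propagate_feedback(alpha, beta, i, won)
        if won:
            wins += 1
            cost += bids[i]                    # first price: pay your bid
    return alpha, beta, wins, cost


if __name__ == "__main__":
    alpha, beta, wins, cost = simulate()
    print("wins:", wins, "total cost:", round(cost, 2))
```

Because a single win or loss updates several bid levels at once, the posteriors sharpen faster than in a bandit that treats each level independently, which is the efficiency gain the abstract describes in general terms.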