Offline Reinforcement Learning for Optimizing Production Bidding Policies | Synapse