Boosting Soft Q-Learning by Bounding | Synapse