Offline Actor-Critic Reinforcement Learning Scales to Large Models | Synapse