RLP: Reinforcement as a Pretraining Objective | Synapse