Learning to Reason without External Rewards | Synapse