Reinforcement Learning from Diverse Human Preferences | Synapse