ORPO: Monolithic Preference Optimization without Reference Model | Synapse