LIRE: listwise reward enhancement for preference alignment