Offline RLHF Methods Need More Accurate Supervision Signals | Synapse