Token-level Direct Preference Optimization | Synapse