Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence | Synapse