TSO: Self-Training with Scaled Preference Optimization | Synapse