Provably Sample Efficient RLHF via Active Preference Optimization