Proximal Point Nash Learning from Human Feedback | Synapse