Policy Gradient Optimization for Bayesian-Risk MDPs with General Convex Losses | Synapse