Provably Efficient Partially Observable Risk-Sensitive Reinforcement Learning with Hindsight Observation