Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators | Synapse