On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization | Synapse