Stepwise Alignment for Constrained Language Model Policy Optimization | Synapse