Reinforcement learning pulses for transmon qubit entangling gates | Synapse