Optimizing sparse general matrix–matrix multiplication for DCUs | Synapse