Compiling Strassen-like Matrix Multiplication Algorithms to Fast CUDA Kernels | Synapse