Scalable and modular algorithms for floating-point matrix multiplication on FPGAs