Scaling up matrix computations on shared-memory manycore systems with 1000 CPU cores | Synapse