Abstract

Multi-Scalar Multiplication (MSM) is the primary computational bottleneck in zero-knowledge (ZK) proof generation for decentralized networks. This research accelerates MSM by addressing the memory bandwidth constraints inherent in high-dimensional elliptic curve cryptography. We introduce Modular Hypercube Chunking, a microarchitectural approach that partitions high-dimensional algebraic precomputations into smaller, orthogonal blocks. Specifically, we divide a 12-dimensional workload into three separate 4D hypercubes, restricting the total memory footprint to 31.1 KB so that the precomputed data stays resident in the L1 cache of modern processors. By sharing doublings across these blocks, the algorithm processes twelve scalars simultaneously with a single shared elliptic curve point doubling, avoiding slow RAM accesses. Empirical evaluations on an ARM-based Snapdragon 8 Gen 2 mobile processor show a peak 5.37× speedup over optimized sequential baselines, reducing the computational cost to 18.44 microseconds per scalar. These findings indicate that geometric data partitioning within strict L1 cache boundaries can significantly outperform traditional arithmetic-heavy optimizations. The approach offers a scalable architecture for running ZK-Rollup proof generation, normally a server-grade workload, on resource-constrained edge devices, and a blueprint for future multicore hardware accelerators. In addition, initial stress tests of a 12D monolithic architecture (68 MB footprint) yielded an anomalous 8.88× peak speedup. This result points to a possible sparse-access memory optimization path, which we pose as an open architectural challenge.
Andrés Sebastián Pirolo studied this question.
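The shared-doubling idea described in the abstract can be illustrated with a small, self-contained sketch. It is not the author's implementation: the abstract does not specify the table layout, so the sketch assumes one-bit subset-sum tables (the Straus/Shamir trick), with twelve points split into three chunks of four and a 16-entry table per chunk. The toy curve over F_97, the helper names (`ec_add`, `build_table`, `msm_chunked`, `msm_naive`, `curve_points`), and the chunk size parameter are all hypothetical, and the table sizes here do not correspond to the reported 31.1 KB footprint. The sketch requires Python 3.8 or later for modular inverses via `pow`.

```python
"""Toy sketch: 12-scalar MSM as three 4-point chunks sharing one doubling per bit."""

# Hypothetical toy curve y^2 = x^3 + A*x + B over F_P (illustration only, not secure).
P, A, B = 97, 2, 3


def ec_add(p1, p2):
    """Affine point addition; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 == -p1 (also covers doubling a point with y == 0)
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mul(k, pt):
    """Plain double-and-add, used only by the sequential reference."""
    acc = None
    for bit in range(k.bit_length() - 1, -1, -1):
        acc = ec_add(acc, acc)
        if (k >> bit) & 1:
            acc = ec_add(acc, pt)
    return acc


def msm_naive(scalars, points):
    """Sequential reference: sum of independent scalar multiplications."""
    acc = None
    for k, pt in zip(scalars, points):
        acc = ec_add(acc, scalar_mul(k, pt))
    return acc


def build_table(points):
    """table[mask] = sum of points[i] for every bit i set in mask."""
    table = [None] * (1 << len(points))
    for mask in range(1, len(table)):
        low = mask & -mask  # lowest set bit of mask
        table[mask] = ec_add(table[mask ^ low], points[low.bit_length() - 1])
    return table


def msm_chunked(scalars, points, chunk=4):
    """One doubling per bit position, then one table lookup per chunk."""
    tables = [build_table(points[i:i + chunk]) for i in range(0, len(points), chunk)]
    nbits = max((k.bit_length() for k in scalars), default=0)
    acc = None
    for bit in range(nbits - 1, -1, -1):
        acc = ec_add(acc, acc)  # the single doubling shared by all chunks
        for c, table in enumerate(tables):
            mask = 0
            for j, k in enumerate(scalars[c * chunk:(c + 1) * chunk]):
                mask |= ((k >> bit) & 1) << j
            acc = ec_add(acc, table[mask])
    return acc


def curve_points():
    """Brute-force enumeration of the affine points on the toy curve."""
    return [(x, y) for x in range(P) for y in range(P)
            if (y * y - (x ** 3 + A * x + B)) % P == 0]


if __name__ == "__main__":
    pts = curve_points()[:12]
    ks = [(i * 7919 + 13) % 1000 for i in range(12)]
    print(msm_chunked(ks, pts) == msm_naive(ks, pts))
```

Under these assumptions, three 4-point chunks need 3 × 2^4 = 48 table entries, whereas a single subset-sum table over all twelve points would need 2^12 = 4096. That gap is the kind of footprint difference the chunked design exploits, although the abstract's actual figures depend on a table layout it does not describe.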