This talk will describe two adjacent projects from our team. The first applies high-performance computing (HPC) techniques to enable scalable RTL simulation of designs at the 10-billion-transistor scale. Our tool, Metro-MPI, exploits the natural boundaries present in chip designs (such as latency-insensitive interfaces) to partition RTL simulations, and uses HPC techniques to extract parallelism. Our implementation of Metro-MPI in OpenPiton+Ariane delivers 2.7 MIPS of RTL simulation throughput for the first time on a design with more than 10 billion transistors and 1,024 Linux-capable cores, simulated on 1000+ physical cores. The second project introduces the problem of hardware decompilation: analysing a low-level artifact (a netlist) to recover higher-level programming abstractions, and using those abstractions to generate code written in an HDL. To start attacking this problem, we focus on hardware loop rerolling: identifying repeated logic in netlists (such as would be synthesized from loops in the original HDL code) and rerolling it into syntactic loops in the recovered HDL code. This not only enables faster simulation, but also opens opportunities for transpilation between HDLs, compaction of netlists, understanding and analysis of netlists, and more.
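To make the idea of loop rerolling concrete, here is a deliberately simplified sketch; it is not the method presented in the talk. It assumes a toy netlist format invented for illustration: each gate is a tuple `(op, output_net, input_nets)`, `op` is an infix operator such as `&` with two or more inputs, and bit-indexed nets are named like `y[3]`. The function name `reroll` and the SystemVerilog-style output text are likewise assumptions. Gates that differ only in a shared bit index are grouped and, when the indices are contiguous, emitted as a single loop.

```python
import re
from collections import defaultdict

# Hypothetical net naming convention: "base[index]", e.g. "y[3]".
NET = re.compile(r"^(\w+)\[(\d+)\]$")

def reroll(gates):
    """Reroll per-bit gates into loops.

    gates: list of (op, out_net, in_nets) in the toy format described above.
    Returns a list of SystemVerilog-style strings: loops first, then flat assigns.
    """
    groups = defaultdict(list)  # (op, net base names) -> list of bit indices
    flat = []                   # gates that cannot be rerolled

    for op, out, ins in gates:
        parsed = [NET.match(n) for n in (out, *ins)]
        indices = {m.group(2) for m in parsed if m}
        # A gate is a rerolling candidate only if every net is indexed
        # and all nets use the same bit index.
        if all(parsed) and len(indices) == 1:
            template = (op, tuple(m.group(1) for m in parsed))
            groups[template].append(int(indices.pop()))
        else:
            flat.append((op, out, list(ins)))

    lines = []
    for (op, bases), idxs in groups.items():
        idxs.sort()
        lo, hi = idxs[0], idxs[-1]
        out_base, *in_bases = bases
        if len(idxs) >= 2 and idxs == list(range(lo, hi + 1)):
            # Contiguous run of identical gates: emit one syntactic loop.
            expr = f" {op} ".join(f"{b}[i]" for b in in_bases)
            lines.append(
                f"for (genvar i = {lo}; i <= {hi}; i++) "
                f"assign {out_base}[i] = {expr};"
            )
        else:
            # Too few or non-contiguous instances: keep them flat.
            for i in idxs:
                flat.append((op, f"{out_base}[{i}]", [f"{b}[{i}]" for b in in_bases]))

    for op, out, ins in flat:
        lines.append(f"assign {out} = {f' {op} '.join(ins)};")
    return lines

# Usage: four identical AND gates plus one unrelated XOR gate.
netlist = [("&", f"y[{i}]", [f"a[{i}]", f"b[{i}]"]) for i in range(4)]
netlist.append(("^", "z", ["c", "d"]))
result = reroll(netlist)
print("\n".join(result))
```

A real netlist rarely preserves such convenient net names, so a practical rerolling pass would need to match repeated structure in the gate graph rather than rely on naming; the sketch only shows the grouping-and-emission step.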
Jonathan Balkind (Thu,) studied this question.