The threat that large-scale quantum computers pose to modern public-key cryptosystems has driven extensive research into post-quantum cryptography (PQC). Lattice-based schemes have emerged as the leading candidates among existing PQC schemes owing to their strong security guarantees, versatility, and relatively efficient operations. However, the computational cost of their core arithmetic operations, including the Number Theoretic Transform (NTT), polynomial multiplication, and sampling, poses considerable performance challenges in practice. This survey offers a comprehensive review of hardware acceleration for lattice-based cryptographic schemes, covering both the architectural and implementation details of the standardized algorithms in this category: CRYSTALS-Kyber, CRYSTALS-Dilithium, and FALCON (Fast-Fourier Lattice-based Compact Signatures over NTRU). It examines optimization techniques at several levels: algorithmic optimization, arithmetic unit design, memory hierarchy management, and system integration. It also compares the performance metrics (throughput, latency, area, and power) of Field-Programmable Gate Array (FPGA) and Application-Specific Integrated Circuit (ASIC) implementations. We further address major issues in implementation, side-channel resistance, resource constraints in Internet of Things (IoT) devices, and the trade-offs between performance and security. Finally, we identify new research opportunities and open challenges, with implications for hardware accelerator design in the post-quantum era.
Yan et al. studied this question.