February 15, 2024 | Open Access

ME-ViT: A Single-Load Memory-Efficient FPGA Accelerator for Vision Transformers


Abstract

Vision Transformers (ViTs) have emerged as a state-of-the-art solution for object classification tasks. However, their computational demands and high parameter count make them unsuitable for real-time inference, prompting the need for efficient hardware implementations. Existing hardware accelerators for ViTs suffer from frequent off-chip memory access, which caps achievable throughput at what the memory bandwidth can sustain. In devices with a high compute-to-communication ratio (e.g., edge FPGAs with limited bandwidth), off-chip memory access imposes a severe bottleneck on overall throughput. This work proposes ME-ViT, a novel Memory-Efficient FPGA accelerator for ViT inference that minimizes memory traffic. We propose a single-load policy in designing ME-ViT: model parameters are loaded only once, intermediate results are stored on-chip, and all operations are implemented in a single processing element. To achieve this goal, we design a memory-efficient processing element (ME-PE), which processes multiple key operations of ViT inference on the same architecture through the reuse of multi-purpose buffers. We also integrate the Softmax and LayerNorm functions into the ME-PE, minimizing stalls between matrix multiplications. We evaluate ME-ViT on systolic array sizes of 32 and 16, achieving up to a 9.22× and 17.89× overall improvement in memory bandwidth, respectively, and a 2.16× improvement in throughput per DSP for both designs over state-of-the-art ViT accelerators on FPGA. ME-ViT achieves a power efficiency improvement of up to 4.00× (1.03×) over a GPU (FPGA) baseline. ME-ViT enables up to 5 ME-PE instantiations on a Xilinx Alveo U200, achieving a 5.10× improvement in throughput over the state-of-the-art FPGA baseline, and a 5.85× (1.51×) improvement in power efficiency over the GPU (FPGA) baseline.
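To make the single-load idea concrete, the following is a rough back-of-the-envelope sketch of off-chip weight traffic per image under two policies: loading each encoder weight once, versus re-fetching the weights for every tile of token rows. It is not the paper's performance model and does not reproduce the reported figures. The ViT-Base-like dimensions (768 hidden, 3072 MLP, 12 layers, 197 tokens), the 1-byte parameters, the 32-row tile, and all function names are assumptions chosen for illustration; biases, LayerNorm parameters, the embedding, and activation traffic are ignored.

```python
import math


def encoder_weight_params(d_model: int, d_mlp: int) -> int:
    """Weight count of one encoder layer (biases and LayerNorm ignored).

    Four d_model x d_model matrices (Q, K, V, output projection)
    plus the two MLP matrices (d_model x d_mlp and d_mlp x d_model).
    """
    return 4 * d_model * d_model + 2 * d_model * d_mlp


def single_load_bytes(d_model: int, d_mlp: int, layers: int,
                      bytes_per_param: int = 1) -> int:
    """Off-chip weight traffic if every weight is fetched exactly once."""
    return layers * encoder_weight_params(d_model, d_mlp) * bytes_per_param


def reload_bytes(d_model: int, d_mlp: int, layers: int, tokens: int,
                 tile_rows: int, bytes_per_param: int = 1) -> int:
    """Off-chip weight traffic if weights are re-fetched per token tile.

    Hypothetical baseline: the token matrix is processed tile_rows rows
    at a time and the full weight set is streamed in again for each tile.
    """
    num_tiles = math.ceil(tokens / tile_rows)
    return num_tiles * single_load_bytes(d_model, d_mlp, layers,
                                         bytes_per_param)


if __name__ == "__main__":
    # Assumed ViT-Base-like configuration, not taken from the paper.
    once = single_load_bytes(768, 3072, 12)
    tiled = reload_bytes(768, 3072, 12, tokens=197, tile_rows=32)
    print(f"single load : {once / 1e6:.1f} MB")
    print(f"per-tile    : {tiled / 1e6:.1f} MB ({tiled // once}x)")
```

Under these assumptions the gap between the two policies scales with the number of token tiles, which is why keeping intermediate results on-chip and fetching parameters once matters most on bandwidth-limited devices.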


Authors

Kyle Marino

Pengmiao Zhang

University of Southern California

Viktor K. Prasanna

University of Southern California
