Deploying Vision Transformers (ViTs) on edge devices poses significant challenges due to their high computational demands and memory access overheads, which severely hinder real-time inference efficiency. This paper proposes a modular and adaptive ViT acceleration architecture targeting the AMD Versal ACAP platform. By leveraging heterogeneous resource collaboration and fine-grained dataflow optimizations, the proposed design addresses performance bottlenecks effectively. We introduce a resource-efficient attention computation module that localizes self-attention operations within AI Engine (AIE) core clusters, thereby reducing inter-module communication and minimizing MAC resource usage. In parallel, a resource-aware multi-stage pipeline scheduling strategy dynamically partitions and parallelizes the computation-intensive feed-forward network (FFN), improving computation reuse and module-level coordination. The architecture integrates parameter tiling and a PLIO-based broadcasting mechanism to construct a decoupled compute-communication dataflow engine, alleviating memory bottlenecks. Experimental results on the Xilinx VCK5000 ACAP platform demonstrate that the proposed design achieves 33.2 TOPS throughput at INT8 precision—outperforming the state-of-the-art EQ-ViT accelerator by 27%—while maintaining a competitive efficiency of 510.6 GOPS/W. Scalability evaluations on ViT-Base and DeiT-Tiny confirm the design’s adaptability in edge scenarios, offering a resource-efficient and reconfigurable hardware paradigm for high-density Transformer inference.
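The headline figures above can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the abstract does not confirm: that the 33.2 TOPS throughput and the 510.6 GOPS/W efficiency were measured at the same operating point, and that the 27% gain over EQ-ViT refers to throughput. The derived power and baseline values are therefore implied estimates, not reported results, and the function names are illustrative only.

```python
# Figures quoted in the abstract.
THROUGHPUT_TOPS = 33.2          # INT8 throughput on the VCK5000
EFFICIENCY_GOPS_PER_W = 510.6   # reported energy efficiency
SPEEDUP_OVER_EQ_VIT = 1.27      # "outperforming ... by 27%" (assumed: throughput)


def implied_power_watts(tops: float, gops_per_w: float) -> float:
    """Power implied by throughput / efficiency (assumes same operating point)."""
    return tops * 1000.0 / gops_per_w  # 1 TOPS = 1000 GOPS


def implied_baseline_tops(tops: float, speedup: float) -> float:
    """Baseline throughput implied by a relative speedup factor."""
    return tops / speedup


power_w = implied_power_watts(THROUGHPUT_TOPS, EFFICIENCY_GOPS_PER_W)
baseline_tops = implied_baseline_tops(THROUGHPUT_TOPS, SPEEDUP_OVER_EQ_VIT)

print(f"Implied power draw: {power_w:.1f} W")
print(f"Implied EQ-ViT throughput: {baseline_tops:.1f} TOPS")
```

Under these assumptions the numbers imply a power draw of roughly 65 W and an EQ-ViT throughput of roughly 26 TOPS.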
Wenbo Zhang
South China Agricultural University
Yan Zhang
University of Vermont
Yiqi Liu
Beijing University of Technology
ACM Transactions on Reconfigurable Technology and Systems
Zhang et al. studied this question.
synapsesocial.com/papers/69401b1e2d562116f28f7750 — DOI: https://doi.org/10.1145/3779444