Key points are not available for this paper at this time.
Transformer has emerged as a popular deep neural network (DNN) model for Neural Language Processing (NLP) applications and demonstrated excellent performance in neural machine translation, entity recognition, etc. However, its scaled dot-product attention mechanism in auto-regressive decoder brings a performance bottleneck during inference. Transformer is also computationally and memory intensive and demands for a hardware acceleration solution. Although researchers have successfully applied ReRAM-based Processing-in-Memory (PIM) to accelerate convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the unique computation process of the scaled dot-product attention in Transformer makes it difficult to directly apply these designs. Besides, how to handle intermediate results in Matrix-matrix Multiplication (MatMul) and how to design a pipeline at a finer granularity of Transformer remain unsolved. In this work, we propose ReTransformer - a ReRAM-based PIM architecture for Transformer acceleration. ReTransformer can not only accelerate the scaled dot-product attention of Transformer using ReRAM-based PIM but also eliminate some data dependency by avoiding writing the intermediate results using the proposed matrix decomposition technique. Moreover, we propose a new sub-matrix pipeline design for multi-head self-attention. Experimental results show that compared to GPU and Pipelayer, ReTransformer improves computing efficiency by 23.21× and 3.25×, respectively. The corresponding overall power is reduced by 1086× and 2.82×, respectively.
Building similarity graph...
Analyzing shared references across papers
Loading...
Xiaoxuan Yang
Oak Ridge National Laboratory
Bonan Yan
Huaqiao University
Hai Li
Intel (United States)
Duke University
Building similarity graph...
Analyzing shared references across papers
Loading...
Yang et al. (Mon,) studied this question.
synapsesocial.com/papers/6a125a7efb24b1a422a5a758 — DOI: https://doi.org/10.1145/3400302.3415640