The Transformer is a widely used machine learning model based on the attention mechanism. Since it was first proposed, many variants have been developed and applied across a range of fields, making it an important research topic in deep learning. However, the attention mechanism at the core of the Transformer has drawbacks, notably a computational complexity that grows quadratically with sequence length, which limits computational speed and data-processing efficiency. To meet the demands of data processing and related computation, researchers in different areas have made sustained efforts to improve it. This article gives a brief overview of recent research progress on the attention mechanism in Transformers. It introduces representative studies from several directions of attention improvement in order to explore the latest research trends and to lay a foundation for identifying potential directions for future research and for further improving Transformer performance.
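As a point of reference for the complexity issue noted above, the standard scaled dot-product attention can be written as follows. This is a sketch of the commonly used formulation, not a formula quoted from this article; the symbols n (sequence length), d_k, and d_v (key and value dimensions) are notation assumed here for illustration.

```latex
\[
\mathrm{Attention}(Q, K, V)
  = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V,
\qquad Q, K \in \mathbb{R}^{n \times d_k},\quad V \in \mathbb{R}^{n \times d_v}
\]
\[
QK^{\top} \in \mathbb{R}^{n \times n}
\;\Longrightarrow\;
\text{time } O\!\left(n^{2}(d_k + d_v)\right),
\qquad \text{memory } O\!\left(n^{2}\right)
\]
```

Under these assumptions, the score matrix has one entry per pair of positions, so doubling the sequence length from n = 1024 to n = 2048 raises the number of entries from 1,048,576 to 4,194,304, a fourfold increase.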
Chen et al. studied this question.