With the rapid development of artificial intelligence, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have played a crucial role in perception and trajectory prediction for intelligent driving systems. However, their limited capacity for modeling global context and capturing long-range dependencies restricts their applicability in complex, dynamic environments. In recent years, the Transformer, built on the attention mechanism, has achieved breakthroughs in natural language processing and has gradually been adopted in autonomous driving, where it is now widely applied to multimodal information fusion, bird's-eye view (BEV) feature generation, and trajectory prediction. This paper reviews the major applications of Transformers in autonomous driving, summarizes their advantages and open challenges, and discusses future research directions.
Han Haoxuan (Wed,) studied this question.