March 14, 2026 | Open Access

A Review on the Applications of Transformer Models in Autonomous Driving Perception Systems

Key Points

This review explores the applications of Transformer models in autonomous driving perception systems.
It surveys the literature on Transformer applications in driving systems.
It analyzes the advantages and challenges of using Transformer models in this field.
It discusses future research directions based on current limitations.
Transformers have improved performance in multimodal information fusion tasks.
They enhance bird's-eye view feature generation for better situational awareness.
Transformers offer improved long-term dependency handling over CNNs and RNNs.

Abstract

With the rapid development of artificial intelligence technologies, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have played a crucial role in perception and trajectory prediction tasks within intelligent driving systems. However, their limitations in global modeling and long-term dependency handling restrict their applicability in complex and dynamic environments. In recent years, the Transformer model, empowered by the attention mechanism, has achieved breakthroughs in natural language processing and has gradually been introduced into autonomous driving. It is now widely applied in multimodal information fusion, bird's-eye view (BEV) feature generation, and trajectory prediction. This paper reviews the major applications of Transformers in autonomous driving, summarizes their advantages and existing challenges, and discusses future research directions.
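
As a rough illustration of the attention mechanism the abstract refers to, the sketch below implements standard scaled dot-product self-attention in plain Python. It is not taken from the reviewed paper: the function names and the four toy 2-dimensional tokens are assumptions made for this example, and learned projection matrices are omitted for brevity. The point it shows is that every output position is a weighted mix of all input positions, which is the global modeling property contrasted with CNNs and RNNs above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of equal-length float vectors.

    weights[i][j] is how strongly query i attends to key j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all value vectors.
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Hypothetical toy sequence: four 2-dimensional tokens (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]

# Self-attention: queries, keys, and values all come from the same sequence.
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and has a nonzero entry for every token, so even the first and last positions interact directly in a single step, regardless of their distance in the sequence.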
