With the rapid development of autonomous driving technology, multi-agent trajectory prediction has become a core component of autonomous driving algorithms. Predicting the future trajectories of multiple agents efficiently and accurately is key to evaluating the reliability and safety of autonomous vehicles. Numerous recent studies have focused on capturing agent interactions in complex traffic scenarios. Most of these methods adopt agent-centric scene construction, which often relies on fixed scene sizes and incurs significant computational overhead. To address these limitations, we propose the multimodal transformer graph convolution neural network (MTGNet) framework. MTGNet constructs a panoramic, fully connected dynamic traffic map for the agents and dynamically adjusts the size of the traffic scene, enabling accurate and efficient multi-modal, multi-agent trajectory prediction. We further use a graph convolutional neural network (GCN) to process the graph-structured data. This approach captures global relationships while sharpening the focus on local features within the scene, thereby improving the model’s sensitivity to local information. We evaluated our framework on the Argoverse 2.0 dataset against nine state-of-the-art vehicle trajectory prediction methods, and it achieved the best performance on all three selected metrics.
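As a rough illustration of how a graph convolution can operate on a fully connected agent graph whose size varies from scene to scene, the following minimal sketch applies one symmetric-normalized GCN layer (in the style of Kipf and Welling) to per-agent feature vectors. It is not the MTGNet implementation: the function names, the choice of normalization, and the toy features and weights are all assumptions made for illustration only.

```python
import math


def fully_connected_adjacency(num_agents):
    """Adjacency matrix (with self-loops) of a fully connected agent graph.

    Hypothetical helper: every agent is linked to every other agent and to
    itself, so the matrix size simply follows the number of agents in the scene.
    """
    return [[1.0] * num_agents for _ in range(num_agents)]


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of an adjacency matrix."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(features, weights, adj=None):
    """One GCN layer: ReLU(A_hat @ H @ W).

    features: one feature row per agent (assumed layout).
    weights:  in_dim x out_dim matrix (assumed to be learned elsewhere).
    adj:      optional adjacency; defaults to the fully connected graph.
    """
    if adj is None:
        adj = fully_connected_adjacency(len(features))
    a_hat = normalize_adjacency(adj)
    aggregated = matmul(a_hat, features)      # mix information across agents
    projected = matmul(aggregated, weights)   # linear feature transform
    return [[max(0.0, v) for v in row] for row in projected]


# Toy scene: three agents with 2-D features and an identity weight matrix.
toy_features = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_output = gcn_layer(toy_features, identity)
```

Because the adjacency matrix is built from the number of agents, the same layer applies unchanged to scenes of any size. On a fully connected graph with uniform weights, each agent's output is the average of all agents' features; passing a sparser or weighted `adj` would instead emphasize local neighborhoods.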
Dai et al. studied this question.