June 14, 2026 · Open Access

Trajectory prediction in heterogeneous environments: integrating multi-scale geometric perception and unbiased interaction modeling

Key Points

This study aims to address the challenges of multi-agent trajectory prediction in complex driving environments.
Developed the PointNetPlus Transformer (PNPT) framework based on a Transformer encoder-decoder structure.
Integrated a Multi-scale Residual-enhanced Polyline Encoder (MRPE) to extract geometric features and boost scene understanding.
Implemented unbiased local coordinate encoding and an interaction-aware intent query module for efficient modeling.
The PNPT model achieves state-of-the-art performance on the Waymo Open Motion Dataset (WOMD), with a minADE of 0.5683 and a minFDE of 1.1824.
It also records a miss rate of 11.43% and an mAP of 47.21%, outperforming previous baseline models.
The effectiveness of each proposed framework module was confirmed through extensive ablation studies.
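
The displacement metrics quoted in the key points can be illustrated with a minimal sketch. It assumes each agent has K candidate trajectories given as lists of (x, y) points and a single ground-truth trajectory of the same length. The function names and the 2.0 endpoint threshold are hypothetical, chosen only for illustration; the benchmark's official miss-rate definition is more involved than the simplified endpoint test used here, and mAP is not covered.

```python
import math

def displacement_errors(pred, truth):
    """Per-step Euclidean distance between one prediction and the ground truth."""
    return [math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth)]

def min_ade(modes, truth):
    """Smallest average displacement error over the K candidate trajectories."""
    return min(sum(displacement_errors(m, truth)) / len(truth) for m in modes)

def min_fde(modes, truth):
    """Smallest final-step displacement error over the K candidate trajectories."""
    tx, ty = truth[-1]
    return min(math.hypot(m[-1][0] - tx, m[-1][1] - ty) for m in modes)

def is_miss(modes, truth, threshold=2.0):
    """Simplified miss test (assumption): no candidate endpoint lies within
    `threshold` of the ground-truth endpoint."""
    return min_fde(modes, truth) > threshold

# Toy example: two candidate modes against a straight ground-truth path.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
modes = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # drifts away from the truth
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)],  # deviates only at the last step
]
print(min_ade(modes, truth), min_fde(modes, truth), is_miss(modes, truth))
```

Because both metrics take the minimum over candidates, a model is rewarded when at least one of its predicted modes is close to what actually happened, which is why they are commonly reported for multi-modal predictors.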

Abstract

Multi-agent trajectory prediction is critical for safe and efficient autonomous driving, yet it remains highly challenging. The difficulties arise from three sources: the complex dynamic behaviors of heterogeneous agents, the influence of static road semantics, and the intricate dynamic couplings between agents and their environments. Traditional single-agent trajectory prediction methods have two critical limitations, insufficient interaction modeling among agents and limited multi-modal generation capabilities, which hinder their performance in multi-agent collaborative scenarios. To tackle these challenges, we propose the PointNetPlus Transformer (PNPT) framework, built upon a Transformer encoder-decoder structure. First, a Multi-scale Residual-enhanced Polyline Encoder (MRPE) is integrated to extract multi-scale local geometric features of the scene context and boost semantic scene understanding. Second, unbiased local coordinate encoding and query-guided attention are adopted to improve modeling efficiency and capture local spatial correlations. Third, an interaction-aware intent query module is designed to enhance multi-modal generation and multi-agent interaction modeling. Together, these designs significantly improve the accuracy and reliability of multi-agent multi-modal trajectory prediction while preserving efficiency. On the Waymo Open Motion Dataset (WOMD), the PNPT model achieves state-of-the-art performance with a minADE of 0.5683, a minFDE of 1.1824, a miss rate of 11.43%, and an mAP of 47.21%, outperforming strong baselines. Extensive ablation studies verify the effectiveness of each module.
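
The abstract does not spell out how the local coordinate encoding is computed. One common way to obtain agent-relative coordinates, shown here purely as an assumption and not as the PNPT method, is to translate scene points so the agent sits at the origin and rotate them so the agent's heading points along the +x axis. The function name `to_local_frame` and its parameters are hypothetical.

```python
import math

def to_local_frame(points, origin, heading):
    """Express global (x, y) points in a frame centred on `origin` and
    rotated so that `heading` (radians, global frame) maps to the +x axis."""
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    ox, oy = origin
    local = []
    for x, y in points:
        dx, dy = x - ox, y - oy  # translate: agent moves to (0, 0)
        # rotate by -heading: forward becomes +x, left becomes +y
        local.append((dx * cos_h + dy * sin_h, -dx * sin_h + dy * cos_h))
    return local

# Agent at (10, 5) facing the global +y direction.
agent_xy, agent_heading = (10.0, 5.0), math.pi / 2
scene = [(10.0, 7.0), (9.0, 5.0)]  # one point ahead, one to the left
for lx, ly in to_local_frame(scene, agent_xy, agent_heading):
    print(round(lx, 6), round(ly, 6))
```

In this sketch the point two units ahead of the agent maps to roughly (2, 0) and the point one unit to its left maps to roughly (0, 1), regardless of where the agent is in the global map. Representations of this kind make the input to an encoder independent of the agent's absolute position and orientation.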
