September 5, 2025 · Open Access

MTC-BEV: Semantic-Guided Temporal and Cross-Modal BEV Feature Fusion for 3D Object Detection

Key Points

MTC-BEV achieves a nuScenes Detection Score (NDS) of 72.4% at 14.91 FPS, balancing accuracy and speed.
The framework fuses image and LiDAR features in bird's-eye view (BEV) space using bidirectional cross-modal attention.
Its temporal fusion module compensates for ego-motion and incorporates past BEV features to keep detections consistent over time.
Semantic guidance enhances the BEV representation by projecting 2D instance masks from the images into BEV space.

Abstract

We propose MTC-BEV, a novel multi-modal 3D object detection framework for autonomous driving that achieves robust and efficient perception by combining spatial, temporal, and semantic cues. MTC-BEV integrates image and LiDAR features in the Bird’s-Eye View (BEV) space, where heterogeneous modalities are aligned and fused through the Bidirectional Cross-Modal Attention Fusion (BCAP) module with positional encodings. To model temporal consistency, the Temporal Fusion (TTFusion) module explicitly compensates for ego-motion and incorporates past BEV features. In addition, a segmentation-guided BEV enhancement projects 2D instance masks into BEV space, highlighting semantically informative regions. Experiments on the nuScenes dataset demonstrate that MTC-BEV achieves a nuScenes Detection Score (NDS) of 72.4% at 14.91 FPS, striking a favorable balance between accuracy and efficiency. These results confirm the effectiveness of the proposed design, highlighting the potential of semantic-guided cross-modal and temporal fusion for robust 3D object detection in autonomous driving.
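The abstract says the temporal module compensates for ego-motion before incorporating past BEV features. The sketch below illustrates that general idea only, not the paper's TTFusion module: it resamples a previous-frame BEV map into the current ego frame by backward mapping with nearest-neighbour lookup. The function name `warp_prev_bev`, the planar SE(2) motion model, the grid layout (ego at the grid centre, x along columns, y along rows), and the zero fill for unseen cells are all assumptions made for illustration.

```python
import numpy as np

def warp_prev_bev(prev_bev, cur_from_prev, cell_size):
    """Resample a previous-frame BEV map (C, H, W) into the current ego frame.

    cur_from_prev is a 3x3 homogeneous SE(2) matrix (hypothetical input)
    that maps previous-ego coordinates in metres to current-ego coordinates.
    """
    c, h, w = prev_bev.shape

    # Metric coordinates of every current-frame cell centre (ego at grid centre).
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    x = (cols - (w - 1) / 2.0) * cell_size
    y = (rows - (h - 1) / 2.0) * cell_size
    pts_cur = np.stack([x.ravel(), y.ravel(), np.ones(h * w)])  # (3, H*W)

    # Backward mapping: where did each current cell lie in the previous frame?
    pts_prev = np.linalg.inv(cur_from_prev) @ pts_cur
    src_cols = np.rint(pts_prev[0] / cell_size + (w - 1) / 2.0).astype(int)
    src_rows = np.rint(pts_prev[1] / cell_size + (h - 1) / 2.0).astype(int)

    # Cells that fall outside the previous grid were not observed; leave them zero.
    valid = (src_rows >= 0) & (src_rows < h) & (src_cols >= 0) & (src_cols < w)
    warped = np.zeros((c, h * w), dtype=prev_bev.dtype)
    warped[:, valid] = prev_bev[:, src_rows[valid], src_cols[valid]]
    return warped.reshape(c, h, w)
```

Once the past map is aligned this way, it can be combined with the current BEV features, for example by concatenation along the channel axis. A learned fusion step and bilinear rather than nearest-neighbour sampling would be natural refinements, but which of these the authors use is not stated in the abstract.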

Cite This Study

Xi et al. studied this question.

synapsesocial.com/papers/68bb5f266d6d5674bcd02fdc https://doi.org/10.3390/wevj16090493
