What question did this study set out to answer?

The aim is to enhance monocular 3D lane detection accuracy and efficiency for intelligent driving systems.

March 28, 2026 | Open Access

Efficient monocular 3D lane detection via Mamba-enhanced CM-3DLane framework

Key points

Developed CM-3DLane framework for efficient 3D lane detection.
Introduced the Lane-Aware Mamba block to model long-range spatial dependencies.
Utilized cross-scale attention for effective feature fusion.
Implemented Refined Anchor Dynamic Ranking for optimal 3D anchor representation.
Achieved an F1 score of 58.3 on the OpenLane dataset.
Recorded an F1 score of 96.5 on the ApolloSim dataset.
Outperformed previous methods while maintaining high computational efficiency.
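
For readers unfamiliar with the metric, the F1 figures above are the harmonic mean of lane-level precision and recall, reported as a percentage. The sketch below shows only that final arithmetic; the function name and the counts are hypothetical, and the rules that decide whether a predicted lane matches a ground-truth lane are set by each benchmark's own evaluation protocol, which is not reproduced here.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1 as a percentage from lane-level match counts (illustrative only)."""
    predicted = true_positives + false_positives   # lanes the detector output
    actual = true_positives + false_negatives      # lanes in the ground truth
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall, scaled to a percentage.
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(f1_score(true_positives=50, false_positives=50, false_negatives=50))
```

Because the harmonic mean is dominated by the smaller of the two terms, a detector cannot reach a high F1 by trading away recall for precision or the reverse.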

Abstract

Monocular 3D lane detection provides richer spatial information than the planar positions produced by 2D lane detection, and it is crucial for vehicle perception in complex intelligent driving scenarios. Recent approaches model lanes in 3D space via anchor lines, project them onto front-view (FV) features for sampling, and directly regress 3D coordinates from 2D image features. However, the slender structure of lanes makes accurate localization in 3D space difficult. Existing frameworks struggle to integrate multi-level features in a way that captures the global spatial structure essential for detection accuracy, and they have difficulty balancing detection performance with computational efficiency. To alleviate these problems, we present CM-3DLane, an efficient 3D lane detection framework. Instead of directly superimposing deep and shallow features, we propose a multi-scale information integration strategy built on a convolutional neural network (CNN) backbone that extracts local image features. We propose the Lane-Aware Mamba (LAMamba) block, which employs a tailored 2D selective scan (SS2D) strategy to model long-range spatial dependencies and global lane context with linear complexity, significantly enhancing feature extraction. This is complemented by a Cross-Scale Attention Fusion (CSAF) module that uses channel and spatial attention to fuse multi-scale features. In addition, we design a Refined Anchor Dynamic Ranking (RADR) module to preserve the most representative and informative 3D anchors. CM-3DLane achieves an F1 score of 58.3 on OpenLane and 96.5 on ApolloSim, outperforming prior methods while maintaining efficiency suitable for real-time deployment.
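
The abstract does not say how the tailored SS2D strategy traverses the feature map. As a rough illustration of the general idea behind 2D selective scanning, the sketch below builds the four directional traversal orders (row-wise, column-wise, and their reversals) commonly used to flatten a 2D grid into 1D sequences for a state-space model. It is a generic sketch, not the LAMamba design, and the names `cross_scan_orders` and `apply_order` are hypothetical.

```python
def cross_scan_orders(height: int, width: int) -> dict:
    """Return four visit orders over an H x W grid.

    Each order is a list of row-major cell indices (r * width + c).
    This is the generic four-direction scan, assumed for illustration.
    """
    row_major = [r * width + c for r in range(height) for c in range(width)]
    col_major = [r * width + c for c in range(width) for r in range(height)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


def apply_order(feature_map: list, order: list) -> list:
    """Flatten a 2D list of values into a 1D sequence following `order`."""
    flat = [value for row in feature_map for value in row]
    return [flat[i] for i in order]


if __name__ == "__main__":
    grid = [["a", "b", "c"],
            ["d", "e", "f"]]
    for name, order in cross_scan_orders(2, 3).items():
        print(name, apply_order(grid, order))
```

Each sequence has length H x W, so processing a fixed number of such sequences with a linear-time sequence model keeps the overall cost linear in the number of feature-map cells, which is the property the abstract attributes to the LAMamba block.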
