September 5, 2025 | Open Access

DAR-MDE: Depth-Attention Refinement for Multi-Scale Monocular Depth Estimation

Key Points

DAR-MDE achieves robust depth estimation and outperforms several recent state-of-the-art models on NYU Depth v2, with real-time inference (19 ms per image).
On NYU Depth v2, the model reaches a δ < 1.25 accuracy of 87.25% and a relative error of 0.113.
Key features include a multi-scale loss based on curvilinear saliency and a Refining Attention Network (RAN) that attends to structurally important regions.
Evaluation on NYU Depth v2, SUN RGB-D, and Make3D underscores DAR-MDE's potential for real-world applications in robotics.

Abstract

Monocular Depth Estimation (MDE) remains a challenging problem due to texture ambiguity, occlusion, and scale variation in real-world scenes. While recent deep learning methods have made significant progress, maintaining structural consistency and robustness across diverse environments remains difficult. In this paper, we propose DAR-MDE, a novel framework that combines an autoencoder backbone with a Multi-Scale Feature Aggregation (MSFA) module and a Refining Attention Network (RAN). The MSFA module enables the model to capture geometric details across multiple resolutions, while the RAN enhances depth predictions by attending to structurally important regions guided by depth-feature similarity. We also introduce a multi-scale loss based on curvilinear saliency to improve edge-aware supervision and depth continuity. The proposed model achieves robust and accurate depth estimation across varying object scales, cluttered scenes, and weak-texture regions. We evaluated DAR-MDE on the NYU Depth v2, SUN RGB-D, and Make3D datasets, demonstrating competitive accuracy and real-time inference speeds (19 ms per image) without relying on auxiliary sensors. Our method achieves a δ < 1.25 accuracy of 87.25% and a relative error of 0.113 on NYU Depth v2, outperforming several recent state-of-the-art models. Our approach highlights the potential of lightweight RGB-only depth estimation models for real-world deployment in robotics and scene understanding.
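
The two figures quoted for NYU Depth v2 are standard depth-estimation metrics. As a minimal sketch, assuming "relative error" refers to the commonly used mean absolute relative error and that pixels with non-positive depth are masked out, they could be computed as follows. The function name `depth_metrics` and the toy arrays are hypothetical and not taken from the paper.

```python
import numpy as np

def depth_metrics(pred, gt, threshold=1.25):
    """Return (threshold accuracy, mean absolute relative error).

    Assumption: pixels where either depth is non-positive are treated
    as invalid and skipped; the paper's exact masking is not stated.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    # Keep only pixels with valid (positive) depth in both maps.
    valid = (gt > 0) & (pred > 0)
    pred, gt = pred[valid], gt[valid]

    # Threshold accuracy: fraction of pixels with max(pred/gt, gt/pred) < threshold.
    ratio = np.maximum(pred / gt, gt / pred)
    delta = float(np.mean(ratio < threshold))

    # Mean absolute relative error: mean(|pred - gt| / gt).
    abs_rel = float(np.mean(np.abs(pred - gt) / gt))
    return delta, abs_rel

# Toy example (hypothetical values); the last pixel has no ground truth.
gt = np.array([1.0, 2.0, 4.0, 0.0])
pred = np.array([1.1, 2.0, 6.0, 3.0])
delta1, abs_rel = depth_metrics(pred, gt)
print(f"delta < 1.25: {delta1:.4f}, abs rel: {abs_rel:.4f}")
```

Under this reading, a δ < 1.25 accuracy of 87.25% means that 87.25% of valid pixels have a predicted depth within a factor of 1.25 of the ground truth, and a relative error of 0.113 means predictions deviate from the ground truth by 11.3% on average.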
