What question did this study set out to answer?

The aim is to evaluate the DepthAnything model's performance on resource-constrained embedded platforms for depth estimation.

April 1, 2026 | Open Access

Evaluating Monocular Depth Estimation on Embedded Platforms for Autonomous Navigation

Key Points

The aim is to evaluate the DepthAnything model's performance on resource-constrained embedded platforms for depth estimation.
Developed the DepthAnything model on NVIDIA Jetson Orin
Analyzed accuracy and speed trade-offs across various backbones (ViT-S, ViT-B, ViT-L)
Reported AbsRel, δ1, RMSE, and FPS on the KITTI dataset
The ViT-S backbone offered the best balance of accuracy and real-time speed (44 FPS)
ViT-B showed performance degradation
ViT-L faced significant instability due to optimization artifacts

Abstract

Deep learning-based monocular depth estimation has achieved significant advancements on urban benchmarks, but its embedded application remains limited by efficiency constraints. Vision Transformers (ViTs) and Foundation Models (FMs) show promising zero-shot generalization capabilities, yet their adaptation to resource-constrained hardware requires careful study. In this work, we investigate the development of the DepthAnything model on an NVIDIA Jetson Orin, analyzing the trade-off between accuracy and inference speed for different backbones (ViT-S, ViT-B, and ViT-L). We report quantitative metrics including AbsRel, δ1, RMSE, and FPS on the KITTI dataset, along with qualitative results. Our experiments show that the ViT-S backbone offers the best balance of accuracy and real-time performance (44 FPS), whereas ViT-B suffers from degradation and ViT-L exhibits significant instability due to optimization artifacts. These findings highlight the viability of compact backbones for embedded visual perception and suggest future optimizations, such as quantization-aware training and pruning, in larger architectures.
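For readers unfamiliar with the reported metrics, the following is a minimal sketch of how AbsRel, RMSE, δ1, and FPS are commonly computed. It assumes the standard definitions from the depth-estimation literature; the study's exact evaluation protocol (depth cap, crop, alignment of predictions) is not stated here. The names `depth_metrics`, `measure_fps`, and `min_depth` are illustrative and not taken from the paper.

```python
import math
import time


def depth_metrics(pred, gt, min_depth=1e-3):
    """Compute AbsRel, RMSE and delta1 over pixels with valid ground truth.

    pred, gt: flat sequences of depth values in the same unit (e.g. metres).
    Pixels whose ground truth is <= min_depth are skipped, as is usual for
    sparse LiDAR ground truth. Predictions are clamped to min_depth
    (an assumption of this sketch) so the ratio test never divides by zero.
    """
    pairs = [(max(p, min_depth), g) for p, g in zip(pred, gt) if g > min_depth]
    if not pairs:
        raise ValueError("no pixels with valid ground truth")
    n = len(pairs)
    # AbsRel: mean absolute error relative to the true depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # RMSE: root of the mean squared error.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # delta1: share of pixels where max(pred/gt, gt/pred) < 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


def measure_fps(infer, frames, warmup=1):
    """Estimate frames per second for a callable `infer` over `frames`."""
    for frame in frames[:warmup]:
        infer(frame)  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Toy values only; the third pixel has no ground truth and is ignored.
    print(depth_metrics([10.0, 24.0, 30.0, 5.0], [10.0, 16.0, 0.0, 5.0]))
```

Lower AbsRel and RMSE are better, while δ1 and FPS are better when higher. On a GPU device, a real FPS measurement would also need to synchronize the device before reading the timer, which this sketch omits.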
