What question did this study set out to answer?

This research aims to enhance monocular depth estimation using a novel unsupervised learning architecture.

January 24, 2026 · Open Access

Monocular Unsupervised Depth Estimation of Residual Stratification Based on Ordinal Relation Networks

Key points

Introduced a model distillation technique with a teacher-student network structure.
Integrated an ordinal module for weight normalization in the decoder of the teacher network.
Developed a residual stratification module for adapting 2D features to 3D depth representation.
Evaluated the method on the KITTI dataset.
Reduced relative squared error by 2.3% compared with the benchmark algorithm.
Reduced root-mean-square error by 3.3%.
Accelerated network training relative to the benchmark algorithm.
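
The teacher-student idea in the key points can be illustrated with a generic distillation term. This is only a minimal sketch of distillation in general, not the paper's loss: the function names `distillation_loss` and `total_loss`, the L1 form of the penalty, and the `weight` value are all assumptions made for illustration, since the summary does not specify them.

```python
from typing import Sequence


def distillation_loss(student_depth: Sequence[float],
                      teacher_depth: Sequence[float]) -> float:
    """Mean absolute difference between student and teacher depth predictions.

    Hypothetical helper: the L1 form is an assumption, not taken from the paper.
    """
    if len(student_depth) != len(teacher_depth):
        raise ValueError("student and teacher predictions must have equal length")
    if not student_depth:
        raise ValueError("predictions must not be empty")
    total = sum(abs(s - t) for s, t in zip(student_depth, teacher_depth))
    return total / len(student_depth)


def total_loss(unsupervised_loss: float,
               student_depth: Sequence[float],
               teacher_depth: Sequence[float],
               weight: float = 0.5) -> float:
    """Combine an unsupervised loss with the distillation term.

    `weight` is an illustrative value, not one reported in the source.
    """
    return unsupervised_loss + weight * distillation_loss(student_depth, teacher_depth)
```

In a real training loop the depth values would be per-pixel tensors and the unsupervised term would typically be an image reconstruction loss; flat lists are used here only to keep the sketch self-contained.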

Abstract

Depth estimation is widely applied in computer vision, primarily through unsupervised deep neural networks, which often rely on ever deeper architectures. However, adding layers can slow convergence and lead to suboptimal performance. To overcome these issues, we introduce a novel architecture employing model distillation, in which a teacher network enhances the learning process of a preceding student network. To improve network speed, we integrate an ordinal module into the decoder of the teacher network for weight normalization. This module classifies the weights and filters out those with the lowest information content. After classification, the need for useful information decreases as the category value increases. Furthermore, we incorporate a residual stratification module that adapts 2D image feature extraction methods to 3D depth. It enables finer, multi-scale feature representation and expands the receptive field at each layer of the network, thereby improving the accuracy and robustness of depth estimation. Experiments on the publicly available KITTI dataset show that the proposed method trains faster than the benchmark algorithm and reduces the relative squared error by 2.3% and the root-mean-square error by 3.3%, validating the effectiveness of our approach.
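
For readers unfamiliar with the two reported error measures, the sketch below computes relative squared error and root-mean-square error using the definitions commonly used for depth evaluation on KITTI. It assumes those standard definitions apply here; the paper's exact evaluation protocol (depth caps, cropping, scaling) is not given in this summary, and the function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Keep only pixels with positive ground-truth depth (assumed validity rule)."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have equal length")
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return pairs


def sq_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Relative squared error: mean of (pred - gt)^2 / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum((p - g) ** 2 / g for p, g in pairs) / len(pairs)


def rmse(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Root-mean-square error: sqrt of the mean of (pred - gt)^2 over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))


if __name__ == "__main__":
    predicted = [2.0, 4.0]      # toy depth predictions in metres
    ground_truth = [1.0, 5.0]   # toy ground-truth depths in metres
    print(sq_rel(predicted, ground_truth))  # 0.6
    print(rmse(predicted, ground_truth))    # 1.0
```

Lower values are better for both measures. Dividing by the ground-truth depth in the relative squared error gives errors on nearby objects more weight than the same absolute error on distant ones.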
