What question did this study set out to answer?

This research aims to improve depth estimation in shape from focus by integrating deep learning and traditional methods.

May 13, 2026 | Open Access

Hybrid Dual Volume Learning for Iterative Fusion and Adaptive Depth Refinement for Shape from Focus

Key Points

Implemented a dual-branch shape from focus framework combining deep and traditional focus cues.
Utilized a multi-scale encoder-decoder network for generating a deep focus volume.
Introduced a lightweight depth refinement module to enhance depth map quality based on RGB image guidance.
Achieved significant improvements in depth-map accuracy compared with traditional methods.
The hybrid approach reduced noise sensitivity and improved structural detail retention.
Experimental validation on synthetic and real-world datasets confirmed the accuracy and reliability of the depth maps.

Abstract

Shape from Focus (SFF) estimates scene depth by analyzing focus variations across a sequence of images captured at different focal settings. Traditional SFF methods rely on handcrafted focus operators that preserve local structural details, but they are often sensitive to noise and perform poorly in textureless regions. In contrast, deep learning-based methods are more robust and can exploit semantic and contextual cues, yet they may lose fine structural information due to feature abstraction and spatial downsampling. To address these complementary limitations, we propose a dual-branch SFF framework that integrates deep and traditional focus cues within a unified architecture. The first branch generates a deep focus volume using a multi-scale encoder-decoder network, while the second branch computes a traditional focus volume using a directional dilated Laplacian (DDL) operator to capture structural focus responses. These two volumes are progressively combined through an iterative gated fusion module, producing a more discriminative fused focus representation. From this fused volume, an initial depth map is estimated through a softmax-based slice aggregation strategy. To further improve spatial consistency and reduce residual artifacts, we introduce a lightweight depth refinement module guided by the mean RGB image of the focal stack. This refinement stage enhances boundary quality and improves the overall depth structure. Extensive experiments on synthetic and real-world datasets demonstrate that the proposed framework produces accurate and reliable depth maps.
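Two of the steps named in the abstract, computing a traditional focus volume and estimating depth by softmax-based slice aggregation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not define the DDL operator, so a plain 4-neighbor dilated Laplacian stands in for it here; the learned deep branch, the gated fusion, and the refinement module are omitted; and the function names, the `dilation` and `temperature` parameters, and the focus distances are all hypothetical.

```python
import numpy as np


def dilated_laplacian_focus(stack, dilation=2):
    """Focus measure per slice: absolute response of a 4-neighbor Laplacian
    whose taps sit `dilation` pixels from the center.

    Assumption: a simple stand-in for the paper's directional dilated
    Laplacian (DDL), whose exact form the abstract does not give.
    stack: array of shape (S, H, W), one grayscale image per focal slice.
    """
    _, h, w = stack.shape
    d = dilation
    # Edge-pad the spatial axes so the output keeps the input size.
    p = np.pad(stack, ((0, 0), (d, d), (d, d)), mode="edge")
    center = p[:, d:d + h, d:d + w]
    up = p[:, 0:h, d:d + w]
    down = p[:, 2 * d:2 * d + h, d:d + w]
    left = p[:, d:d + h, 0:w]
    right = p[:, d:d + h, 2 * d:2 * d + w]
    return np.abs(4.0 * center - up - down - left - right)


def softmax_depth(focus_volume, focus_distances, temperature=1.0):
    """Softmax over the slice axis, then the expected focus distance per pixel.

    focus_volume: (S, H, W) focus scores; focus_distances: (S,) values.
    `temperature` is a hypothetical knob controlling how peaked the weights are.
    """
    logits = focus_volume / temperature
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted sum of focus distances along the slice axis -> (H, W) depth map.
    return np.tensordot(focus_distances, weights, axes=(0, 0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stack = rng.random((5, 32, 32))            # synthetic 5-slice focal stack
    distances = np.linspace(0.1, 0.5, 5)       # made-up focus distances
    volume = dilated_laplacian_focus(stack)
    depth = softmax_depth(volume, distances, temperature=0.1)
    print(depth.shape)
```

Because the softmax weights are non-negative and sum to one at each pixel, the estimated depth always lies between the smallest and largest focus distance. A lower temperature moves the result toward a hard per-pixel argmax over slices.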
