Shape from Focus (SFF) estimates scene depth by analyzing focus variations across a sequence of images captured at different focal settings. Traditional SFF methods rely on handcrafted focus operators that preserve local structural details, but they are often sensitive to noise and perform poorly in textureless regions. In contrast, deep learning-based methods are more robust and can exploit semantic and contextual cues, yet they may lose fine structural information due to feature abstraction and spatial downsampling. To address these complementary limitations, we propose a dual-branch SFF framework that integrates deep and traditional focus cues within a unified architecture. The first branch generates a deep focus volume using a multi-scale encoder-decoder network, while the second branch computes a traditional focus volume using a directional dilated Laplacian (DDL) operator to capture structural focus responses. These two volumes are progressively combined through an iterative gated fusion module, producing a more discriminative fused focus representation. From this fused volume, an initial depth map is estimated through a softmax-based slice aggregation strategy. To further improve spatial consistency and reduce residual artifacts, we introduce a lightweight depth refinement module guided by the mean RGB image of the focal stack. This refinement stage enhances boundary quality and improves the overall depth structure. Extensive experiments on synthetic and real-world datasets demonstrate that the proposed framework produces accurate and reliable depth maps.
Ashfaq et al. (Sun,) studied this question.
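Two steps named in the abstract, the traditional focus volume and the softmax-based slice aggregation, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact form of the DDL operator or of the aggregation, so everything below is an assumption. The hypothetical `ddl_focus` sums absolute second differences along four directions at a chosen dilation, and the hypothetical `softmax_depth` takes a per-pixel softmax over the slice axis and returns the expected focus position. The parameters `dilation` and `temperature`, and the `(S, H, W)` volume layout, are illustrative choices and are not taken from the paper.

```python
import numpy as np


def ddl_focus(gray, dilation=2):
    """Assumed directional dilated Laplacian focus measure for one slice.

    gray: 2-D array (H, W). Returns an (H, W) focus response.
    """
    img = np.asarray(gray, dtype=np.float64)
    d = int(dilation)
    h, w = img.shape
    # Replicate borders so every pixel has neighbours at distance d.
    p = np.pad(img, d, mode="edge")
    center = p[d:d + h, d:d + w]
    response = np.zeros_like(img)
    # Horizontal, vertical, and the two diagonal directions.
    for dy, dx in ((0, 1), (1, 0), (1, 1), (1, -1)):
        fwd = p[d + dy * d:d + dy * d + h, d + dx * d:d + dx * d + w]
        bwd = p[d - dy * d:d - dy * d + h, d - dx * d:d - dx * d + w]
        # Absolute second difference along this direction.
        response += np.abs(2.0 * center - fwd - bwd)
    return response


def softmax_depth(focus_volume, focus_positions, temperature=1.0):
    """Assumed softmax aggregation over the slice axis.

    focus_volume: (S, H, W) focus scores; focus_positions: (S,) values.
    Returns an (H, W) map of expected focus positions.
    """
    fv = np.asarray(focus_volume, dtype=np.float64) / temperature
    # Subtract the per-pixel maximum for numerical stability.
    fv = fv - fv.max(axis=0, keepdims=True)
    weights = np.exp(fv)
    weights /= weights.sum(axis=0, keepdims=True)
    positions = np.asarray(focus_positions, dtype=np.float64)
    # Weighted sum of focus positions at every pixel.
    return np.tensordot(positions, weights, axes=(0, 0))
```

In this sketch, a focal stack of `S` grayscale slices would be passed through `ddl_focus` one slice at a time, and the stacked responses handed to `softmax_depth` together with the focus position of each slice. A lower `temperature` pushes the result toward a hard argmax over slices, while a higher one blends neighbouring slices more smoothly.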