We introduce TriORU²-Net++, a novel three-stage architecture for occlusion removal in light-field (LF) images that combines adaptive attention-guided feature integration with progressive hierarchical reconstruction. Existing methods struggle to fully exploit spatial hierarchies and to adaptively restore occluded regions across scales. To address this, our model incorporates a ResASPP-AttFPN feature extractor, which integrates Residual Atrous Spatial Pyramid Pooling (ResASPP) with a spatial attention-enhanced Feature Pyramid Network (AttFPN) to selectively fuse multiscale features while emphasizing the salient spatial cues essential for occlusion localization. The core of our framework is a tri-stage U²-Net++ reconstruction module, which performs progressive restoration through three hierarchically connected encoder-decoder stages of decreasing depth (4-level, 3-level, and 2-level), each built on VGG-based blocks and dense skip connections to recover increasingly refined background content. To further enhance detail preservation and structural consistency, we introduce a residual feature refiner (RFR) that consolidates residual cues and sharpens object boundaries. Extensive experimental evaluations demonstrate that the proposed method surpasses recent state-of-the-art (SOTA) LF occlusion removal approaches in both quantitative metrics and visual reconstruction quality. Specifically, our model achieves average improvements of 0.86 dB in PSNR and 0.016 in SSIM across real-world (CD scene) and synthetic LF datasets, including sparse (4-Syn, 9-Syn) and dense (Single-Occ, Double-Occ) settings. This capability is particularly relevant to the Big Data paradigm, where large-scale visual datasets demand robust preprocessing to remove occlusions and ensure reliable downstream analytics.
By improving LF data fidelity while remaining efficient, our model supports scalable pipelines for high-volume visual data processing.
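To put the reported PSNR gain in context, the following is a minimal sketch of the standard PSNR definition, PSNR = 10 log10(peak² / MSE), together with the MSE ratio implied by a given dB gain. The function names, the nested-list image representation, and the peak value of 1.0 are assumptions made for illustration; this is not the evaluation code used for the reported results.

```python
import math


def psnr(reference, restored, peak=1.0):
    """PSNR in dB between two equally sized images.

    Images are given as nested lists of floats (rows of pixel values);
    `peak` is the maximum possible pixel value (assumed 1.0 here).
    """
    ref = [v for row in reference for v in row]
    out = [v for row in restored for v in row]
    if len(ref) != len(out):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Ratio new_MSE / old_MSE implied by a PSNR gain of `gain_db` at a fixed peak."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # Toy two-pixel example: each pixel is off by 0.1, so the MSE is about 0.01.
    print(round(psnr([[0.0, 1.0]], [[0.1, 0.9]]), 2))
    # MSE ratio corresponding to a 0.86 dB gain on a single image.
    print(round(mse_ratio_for_gain(0.86), 4))
```

Under this definition, a 0.86 dB gain on a single image corresponds to roughly 18% lower mean squared error. Because the reported figure is an average of per-dataset PSNR differences, that percentage is only indicative.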
Senussi et al. studied this question.