Large-scale dynamic scene reconstruction based on spatiotemporal field models shows significant potential for autonomous driving. However, existing neural radiance field (NeRF) and 3D Gaussian splatting (3DGS) techniques are limited in both their modeling of dynamic elements and their computational efficiency, so they struggle with driving scenes in which static and dynamic regions are intertwined. To address these challenges, this study proposes a framework that integrates spatiotemporal attention with sparse encoding. A spatiotemporal attention module captures dynamic motion patterns through self-supervised inter-frame prediction, addressing the spatiotemporal inconsistencies caused by non-rigid deformations. A KL divergence-guided hierarchical sparse encoding strategy provides an efficient multi-scale representation of scene features while preserving reconstruction accuracy. In addition, a mean-variance decoupled stochastic sampling mechanism improves modeling robustness in dynamic regions. Experiments show substantial improvements in reconstruction quality over state-of-the-art large-scale dynamic scene reconstruction methods, yielding more photorealistic 3D reconstructions.
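The abstract gives no formulas for the mean-variance decoupled sampling or the KL divergence guidance, so the following is only a generic sketch of how such pieces are commonly built, not the authors' method. It assumes a diagonal Gaussian latent parameterized by a separate mean and log-variance, and a standard normal reference distribution. The function names `sample_gaussian` and `kl_to_standard_normal` are hypothetical.

```python
import math
import random


def sample_gaussian(mean, log_var, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1), per dimension.

    Keeping the mean and the (log-)variance as separate inputs is what
    lets the randomness be isolated in eps (assumed parameterization).
    """
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, log_var)
    ]


def kl_to_standard_normal(mean, log_var):
    """Closed-form KL( N(mean, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv
        for m, lv in zip(mean, log_var)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    mean, log_var = [0.5, -1.0], [0.0, -2.0]
    z = sample_gaussian(mean, log_var, rng)
    print("sample:", z)
    print("KL to N(0, I):", kl_to_standard_normal(mean, log_var))
```

In a sketch like this, the KL term could serve as a regularizer that pushes uninformative feature dimensions toward the reference distribution. Whether the paper uses it that way is not stated in the abstract.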
Yinan Qi studied this question.