Complex imaging conditions in real-world scenes pose significant challenges for stereo matching. Models tend to underperform on non-Lambertian surfaces, weakly textured regions, and occluded regions, where accurate pixel-wise correspondences are difficult to establish. To alleviate these problems, we propose a multi-scale geometrically enhanced stereo matching model that exploits the geometric structural relationships of objects in the scene. First, a geometric structure perception module is designed to extract geometric information from the reference view. Second, a geometric structure-adaptive embedding module is proposed to integrate geometric information with matching similarity information; it dynamically fuses these multi-source features to predict disparity residuals in different regions. Third, a geometry-based normalized disparity correction module is proposed to improve matching robustness in pathological regions of realistic complex scenes. Extensive evaluations on popular benchmarks demonstrate that our method achieves competitive performance against leading approaches. Notably, our model provides robust and accurate predictions in challenging regions containing edges, occlusions, and reflective or non-Lambertian surfaces. Our source code will be made publicly available.
Dai et al. (Fri,) studied this question.