This paper presents Mixum, a novel 3D reconstruction framework for Structure-from-Motion (SfM) that combines traditional feature extraction and matching techniques with deep-learning-based optimization. Mixum improves the accuracy of feature matching and eliminates redundant feature points. Integration with PixSfM, a deep-learning-based accuracy optimization algorithm, further reduces reprojection error and improves multi-view consistency. Experiments on multiple public datasets show that Mixum significantly increases 3D reconstruction density and reduces reprojection error by up to 23%, demonstrating its applicability to complex scenes in applications such as cultural heritage preservation, virtual reality, and autonomous navigation.
Li et al. (Sun,) studied this question.