Lightweight local feature extractors are essential for real-time SLAM. However, they frequently struggle with perceptual aliasing and low localization accuracy in texture-sparse or repetitive environments. This paper introduces the Geometric-Contextual Alignment Network (GCA-Net), a framework designed to address these instabilities through a Geometric-Contextual Alignment (GCA) module. The proposed GCA module integrates global contextual priors into the feature stream. By employing Context-based Feature-wise Linear Modulation (C-FiLM), the network mitigates perceptual aliasing by prioritizing structurally reliable regions. To enhance spatial precision, we incorporate a Depthwise Separable Atrous Spatial Pyramid Pooling (DS-ASPP) stage to expand the Effective Receptive Field (ERF). This design provides robust multi-scale anchoring, which significantly reduces localization jitter under large viewpoint shifts. Extensive evaluations on MegaDepth, ScanNet and HPatches demonstrate that GCA-Net achieves high sub-pixel precision and robust cross-domain generalization. On the MegaDepth benchmark, GCA-Net outperforms the vanilla XFeat by 8.0% in Area Under the Curve (AUC) at a 5° threshold (AUC@5°). Furthermore, it yields a 23.6% relative improvement over the SuperPoint baseline while using compact 64-floating-point (64-f) descriptors. These results indicate that the GCA mechanism helps capture complex spatial structures that typically require much heavier architectures. By balancing matching accuracy with computational efficiency, GCA-Net provides an effective framework for autonomous navigation on edge computing platforms.
Deng et al. (Mon,) studied this question.