What question did this study set out to answer?

The aim is to enhance local feature extraction for real-time SLAM by addressing perceptual aliasing and localization accuracy issues.

April 1, 2026 | Open Access

GCA-Net: Geometric-Contextual Alignment Network for Lightweight and Robust Local Feature Extraction in Visual Localization

Key Points

The aim is to enhance local feature extraction for real-time SLAM by addressing perceptual aliasing and localization accuracy issues.
Introduces the Geometric-Contextual Alignment (GCA) module to integrate contextual priors.
Employs Context-based Feature-wise Linear Modulation (C-FiLM) to prioritize reliable feature regions.
Incorporates Depthwise Separable Atrous Spatial Pyramid Pooling (DS-ASPP) to improve spatial precision and expand the Effective Receptive Field (ERF).
Conducts evaluations on datasets including MegaDepth, ScanNet, and HPatches.
On the MegaDepth benchmark, GCA-Net outperforms vanilla XFeat by 8.0% in Area Under the Curve (AUC) at a 5° threshold (AUC@5°).
Yields a 23.6% relative improvement over the SuperPoint baseline while using compact 64-floating-point (64-f) descriptors.
Provides high sub-pixel precision and robust cross-domain generalization.
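
The AUC@5° figure quoted above is a pose-error metric. The source does not show how it was computed, so the following is only a minimal sketch of the commonly used definition: the area under the recall-versus-error curve up to the threshold, normalized by the threshold. The function name `pose_auc` and the sample error values are illustrative assumptions, not taken from the paper.

```python
def pose_auc(errors, threshold):
    """Normalized area under the recall-vs-error curve on [0, threshold].

    errors: per-image-pair pose errors in degrees (hypothetical inputs).
    Returns a value in [0, 1]; higher is better.
    """
    if not errors:
        raise ValueError("errors must not be empty")
    errs = sorted(errors)
    n = len(errs)

    # Curve starts at (0 error, 0 recall).
    xs, ys = [0.0], [0.0]
    for i, e in enumerate(errs):
        if e >= threshold:
            break
        xs.append(e)
        ys.append((i + 1) / n)  # fraction of pairs with error <= e

    # Extend the curve flat out to the threshold.
    xs.append(threshold)
    ys.append(ys[-1])

    # Trapezoidal integration, then normalize by the threshold.
    area = sum(
        (xs[k + 1] - xs[k]) * (ys[k + 1] + ys[k]) / 2.0
        for k in range(len(xs) - 1)
    )
    return area / threshold


# Illustrative errors only: two pairs under 5 degrees, one far above.
example = pose_auc([1.0, 2.0, 10.0], 5.0)
```

Under this definition, a pair whose error exceeds the threshold contributes nothing, and pairs with smaller errors contribute more than pairs that barely pass, which is why the metric rewards precise localization and not just a pass rate.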

Abstract

Lightweight local feature extractors are essential for real-time SLAM. However, they frequently struggle with perceptual aliasing and low localization accuracy in texture-sparse or repetitive environments. This paper introduces the Geometric-Contextual Alignment Network (GCA-Net), a framework designed to address these instabilities through a Geometric-Contextual Alignment (GCA) module. The proposed GCA module integrates global contextual priors into the feature stream. By employing Context-based Feature-wise Linear Modulation (C-FiLM), the network mitigates perceptual aliasing by prioritizing structurally reliable regions. To enhance spatial precision, we incorporate a Depthwise Separable Atrous Spatial Pyramid Pooling (DS-ASPP) stage to expand the Effective Receptive Field (ERF). This design provides robust multi-scale anchoring, which significantly reduces localization jitter under large viewpoint shifts. Extensive evaluations on MegaDepth, ScanNet and HPatches demonstrate that GCA-Net achieves high sub-pixel precision and robust cross-domain generalization. On the MegaDepth benchmark, GCA-Net outperforms the vanilla XFeat by 8.0% in Area Under the Curve (AUC) at a 5° threshold (AUC@5°). Furthermore, it yields a 23.6% relative improvement over the SuperPoint baseline while using compact 64-floating-point (64-f) descriptors. These results indicate that the GCA mechanism helps capture complex spatial structures that typically require much heavier architectures. By balancing matching accuracy with computational efficiency, GCA-Net provides an effective framework for autonomous navigation on edge computing platforms.
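
To make two of the building blocks named in the abstract concrete, here is a small dependency-free sketch. It shows generic feature-wise linear modulation (a per-channel scale and shift predicted from a context vector) and the standard arithmetic behind dilated and depthwise separable convolutions. It is not the paper's C-FiLM or DS-ASPP implementation: the use of global average pooling as the context, the linear projections, and every function name and number below are assumptions made for illustration.

```python
def global_context(feat):
    """Average each channel over all spatial positions (one scalar per channel)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def linear(vec, weight, bias):
    """Plain matrix-vector product plus bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def film(feat, w_gamma, b_gamma, w_beta, b_beta):
    """Generic FiLM: out[c] = gamma[c] * feat[c] + beta[c].

    feat is a list of channels, each a 2D list (C x H x W).
    gamma and beta are predicted from a pooled context vector (assumed design).
    """
    ctx = global_context(feat)
    gamma = linear(ctx, w_gamma, b_gamma)
    beta = linear(ctx, w_beta, b_beta)
    return [
        [[g * v + b for v in row] for row in ch]
        for ch, g, b in zip(feat, gamma, beta)
    ]


def effective_kernel(kernel_size, dilation):
    """Spatial extent covered by one dilated (atrous) kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_params(kernel_size, c_in, c_out):
    """Weight count of a standard 2D convolution (bias ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def separable_conv_params(kernel_size, c_in, c_out):
    """Weight count of a depthwise conv followed by a 1x1 pointwise conv."""
    return kernel_size * kernel_size * c_in + c_in * c_out


# Toy 2-channel, 2x2 feature map with hand-picked projection weights.
feat = [[[1, 2], [3, 4]], [[0, 0], [0, 0]]]
identity = [[1, 0], [0, 1]]
zeros = [[0, 0], [0, 0]]
modulated = film(feat, identity, [0, 0], zeros, [1, 1])
```

In this toy run the first channel (pooled mean 2.5) is scaled up, while the all-zero second channel receives a scale of 0, which illustrates how a context-driven scale can emphasize some channels and suppress others. The helper functions show why atrous kernels widen coverage without extra weights (a 3x3 kernel with dilation 6 spans 13 pixels per axis) and why the separable form is cheaper (4,672 versus 36,864 weights for 64 input and 64 output channels).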

Cite This Study

Deng et al. studied this question.

synapsesocial.com/papers/69ccb75916edfba7beb893e6 https://doi.org/10.3390/app16073330
