This article investigates retrieval-based localization in the context of heritage documentation, leveraging an extensive image dataset collected during the restoration of Notre-Dame de Paris. To address the challenges of image retrieval for localization, we first review state-of-the-art approaches, from all-in-one trained models to purely visual methods via 3D-based pipelines. Next, we evaluate various retrieval strategies, comparing purely visual approaches with spatial methods that prioritize spatial relationships between image locations. Finally, we present CIR4Loc, a retrieval framework based on the composed image retrieval (CIR) paradigm, introducing textual modifiers to refine retrieval towards configurations that enhance localization. By bridging the gap between visual and spatial retrieval, this approach ensures the selection of images that are both visually relevant and spatially distributed to improve pose estimation. We demonstrate the effectiveness of this proposal in a real-world heritage context, specifically the scientific site related to the restoration of Notre-Dame de Paris, emphasizing the necessity of retrieval strategies explicitly tailored for spatially aware localization.
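To make the idea of selecting images that are "both visually relevant and spatially distributed" concrete, here is a minimal sketch of one possible way to combine the two criteria. It is not CIR4Loc, which the abstract describes as relying on textual modifiers in a composed image retrieval setting. It is a simple greedy re-ranking, assumed here purely for illustration. The function name `select_spread_neighbors`, the `spatial_weight` parameter, and the availability of global descriptors and known camera positions for the database images are all hypothetical.

```python
import numpy as np


def select_spread_neighbors(query_desc, db_descs, db_positions, k=3, spatial_weight=0.5):
    """Greedily pick k database images that resemble the query and lie far apart.

    Illustrative only: descriptors and positions are assumed inputs,
    and the scoring rule is a hypothetical stand-in, not the paper's method.
    """
    # Cosine similarity between the query and every database descriptor.
    q = query_desc / np.linalg.norm(query_desc)
    d = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    visual = d @ q

    # Pairwise distances between database camera positions, scaled to [0, 1]
    # so that the visual and spatial terms are on comparable ranges.
    diff = db_positions[:, None, :] - db_positions[None, :, :]
    pairwise = np.linalg.norm(diff, axis=-1)
    pairwise = pairwise / max(float(pairwise.max()), 1e-12)

    # Start from the most visually similar image.
    selected = [int(np.argmax(visual))]
    while len(selected) < min(k, len(db_descs)):
        # Distance from each candidate to its nearest already-selected image.
        min_dist = pairwise[:, selected].min(axis=1)
        score = visual + spatial_weight * min_dist
        score[selected] = -np.inf  # never pick the same image twice
        selected.append(int(np.argmax(score)))
    return selected
```

With `spatial_weight=0` the function reduces to plain visual nearest-neighbour retrieval. Larger values trade some visual similarity for a wider spatial spread among the retrieved images.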
Blettery et al. studied this question.