June 16, 2024

RoMa: Robust Dense Feature Matching

Abstract

Feature matching is an important computer vision task that involves estimating correspondences between two images of a 3D scene, and dense methods estimate all such correspondences. The aim is to learn a robust model, i.e., a model able to match under challenging real-world changes. In this work, we propose such a model, leveraging frozen pretrained features from the foundation model DINOv2. Although these features are significantly more robust than local features trained from scratch, they are inherently coarse. We therefore combine them with specialized ConvNet fine features, creating a precisely localizable feature pyramid. To further improve robustness, we propose a tailored transformer match decoder that predicts anchor probabilities, which enables it to express multimodality. Finally, we propose an improved loss formulation through regression-by-classification with subsequent robust regression. We conduct a comprehensive set of experiments that show that our method, RoMa, achieves significant gains, setting a new state of the art. In particular, we achieve a 36% improvement on the extremely challenging WxBS benchmark. Code is provided at github.com/Parskatt/RoMa.
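The regression-by-classification idea named in the abstract can be illustrated with a minimal sketch. It is not taken from the RoMa code base: the function names, the 4x4 anchor grid, and the Charbonnier-style penalty standing in for the "robust regression" stage are all assumptions made for illustration. The sketch shows why predicting probabilities over anchors can express a multimodal match distribution, whereas a single regressed mean would land between the modes.

```python
import numpy as np

def make_anchors(n):
    """Hypothetical helper: (n*n, 2) grid of anchor centres in [-1, 1]^2."""
    ticks = (np.arange(n) + 0.5) / n * 2.0 - 1.0
    xs, ys = np.meshgrid(ticks, ticks, indexing="xy")
    return np.stack([xs.ravel(), ys.ravel()], axis=-1)

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def decode_mode(logits, anchors):
    """Coarse match = most probable anchor for each pixel."""
    return anchors[np.argmax(logits, axis=-1)]

def decode_mean(logits, anchors):
    """Expected coordinate under the anchor distribution (for comparison)."""
    return np.exp(log_softmax(logits)) @ anchors

def anchor_classification_loss(logits, target_xy, anchors):
    """Cross-entropy against the anchor nearest to each ground-truth match."""
    d2 = ((target_xy[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=-1)
    log_p = log_softmax(logits)
    return -log_p[np.arange(len(nearest)), nearest].mean()

def charbonnier(residual, eps=1e-3):
    """Assumed robust penalty for a refinement stage: sqrt(|r|^2 + eps^2)."""
    return np.sqrt((residual ** 2).sum(axis=-1) + eps ** 2).mean()

# One pixel whose match is ambiguous between two opposite corners.
anchors = make_anchors(4)
logits = np.zeros((1, 16))
logits[0, 0] = 10.0    # anchor at (-0.75, -0.75)
logits[0, 15] = 10.0   # anchor at ( 0.75,  0.75)

mode_xy = decode_mode(logits, anchors)   # stays on one of the two modes
mean_xy = decode_mean(logits, anchors)   # collapses to the image centre
loss = anchor_classification_loss(logits, np.array([[-0.7, -0.7]]), anchors)
```

Here `mode_xy` remains on a plausible match, while `mean_xy` falls at the centre of the image, a location supported by neither mode. A subsequent refinement step could then regress a small offset from the selected anchor under a robust penalty such as `charbonnier`.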

Authors

Johan Edstedt

World Vision

Qiyu Sun

East China University of Science and Technology

Georg Bökman

KTH Royal Institute of Technology

Institutions

Chalmers University of Technology

East China University of Science and Technology
