Image identification is particularly challenging in datasets with high intra-class similarity and minimal structural variation, such as wood textures, security paper, or metal alloys. In these settings, effective discrimination depends on capturing fine-grained textural cues. We address this with a novel keypoint detector built on an Equivariant Convolutional Neural Network and trained with a triplet loss guided by the Structural Similarity Index. This design encourages the extraction of features that are both equivariant to common transformations and highly discriminative across visually similar instances. A central contribution of our method is the generation of keypoints with high repeatability, an attribute we show to be closely tied to improved identification accuracy. Through comprehensive experiments, we demonstrate that our approach consistently outperforms state-of-the-art methods in both matching and identification tasks across multiple datasets.
Santos et al. studied this question.
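To make the notion of keypoint repeatability concrete, the following is a minimal sketch of one common way to measure it, not the evaluation protocol used in this work. It assumes that the geometric transform between two images is known and counts a keypoint as repeated when its mapped location lies within a pixel tolerance of some keypoint detected in the second image. The function name `repeatability`, the tolerance `eps`, and the toy coordinates are all hypothetical and chosen only for illustration.

```python
import math


def repeatability(kps_a, kps_b, transform, eps=3.0):
    """Fraction of keypoints from image A that reappear in image B.

    kps_a, kps_b: lists of (x, y) keypoint coordinates.
    transform:    function mapping a point from A's frame into B's frame
                  (assumed known, e.g. from a synthetic warp).
    eps:          pixel tolerance for counting a keypoint as repeated
                  (an illustrative value, not taken from the paper).
    """
    if not kps_a or not kps_b:
        return 0.0
    repeated = 0
    for x, y in kps_a:
        px, py = transform(x, y)
        # Distance from the mapped point to its nearest keypoint in B.
        nearest = min(math.hypot(px - bx, py - by) for bx, by in kps_b)
        if nearest <= eps:
            repeated += 1
    # One-directional simplification: normalise by the number of keypoints in A.
    return repeated / len(kps_a)


def shift(x, y):
    """Toy ground-truth transform: a pure translation."""
    return x + 5.0, y - 2.0


kps_a = [(10.0, 10.0), (40.0, 25.0), (70.0, 60.0), (15.0, 80.0)]
kps_b = [(15.5, 8.0), (45.0, 23.5), (90.0, 90.0)]
score = repeatability(kps_a, kps_b, shift)
print(score)  # 0.5: two of the four keypoints in A are recovered in B
```

Published benchmarks often restrict the count to the region visible in both images and normalise by the smaller of the two keypoint counts; the sketch omits both refinements for brevity.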