What question did this study set out to answer?

The aim is to improve cross-modal retrieval by preserving neighborhood structural relationships and addressing sample imbalance.

May 27, 2026 · Open Access

Deep Neighborhood-Similarity Preservation Hashing for Cross-Modal Retrieval

Key Points

The aim is to improve cross-modal retrieval by preserving neighborhood structural relationships and addressing sample imbalance.
Developed the Deep Neighborhood-similarity Preservation Hashing (DNsPH) method.
Designed the Context-aware Cross-layer Bilinear Fusion Network (C2BF-Net), which uses LSTM to model context-dependent features across convolutional layers.
Introduced an adaptive-margin multi-similarity loss that mines and weights informative sample pairs to reduce the effect of sample imbalance during training.
DNsPH significantly outperformed state-of-the-art cross-modal retrieval methods on the MIRFLICKR-25K dataset (p < 0.05).
Achieved higher retrieval precision on high-dimensional multi-modal data.
Generated more discriminative hash codes than traditional methods.
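
To make the multi-similarity loss in the key points more concrete, the sketch below implements the standard (fixed-margin) multi-similarity loss over a batch similarity matrix in NumPy. It is not the adaptive-margin variant used by DNsPH, whose details are not given here. The function name, the single-label assumption, the omission of pair mining, and the values of `alpha`, `beta`, and `margin` are all illustrative assumptions.

```python
import numpy as np

def multi_similarity_loss(sim, labels, alpha=2.0, beta=50.0, margin=0.5):
    """Fixed-margin multi-similarity loss (illustrative sketch only).

    sim    : (n, n) array of pairwise similarities within a batch.
    labels : (n,) integer class labels (single-label assumption; the
             datasets named on this page are multi-label).
    alpha, beta, margin : hypothetical hyperparameter values.
    """
    n = sim.shape[0]
    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(n, dtype=bool)   # same class, excluding self-pairs
    neg_mask = ~same                           # different class

    # Positive pairs are penalized when their similarity falls below the margin.
    pos_sum = np.sum(np.exp(-alpha * (sim - margin)) * pos_mask, axis=1)
    # Negative pairs are penalized when their similarity rises above the margin.
    neg_sum = np.sum(np.exp(beta * (sim - margin)) * neg_mask, axis=1)

    per_anchor = np.log1p(pos_sum) / alpha + np.log1p(neg_sum) / beta
    return float(np.mean(per_anchor))
```

Because each pair is weighted by an exponential of its own similarity, hard pairs (dissimilar positives, similar negatives) dominate the sum, which is the general mechanism by which this family of losses emphasizes informative pairs.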

Abstract

Owing to its low storage cost and fast query speed, cross-modal hashing has attracted considerable interest in multi-modal data retrieval. However, existing hashing methods face several challenges. One major challenge is the neglect of both local and non-local neighborhood structural relationships within multi-modal information, which makes it difficult to establish fine-grained semantic consistency between heterogeneous modalities. In addition, the imbalance in the number of training samples limits improvements in retrieval performance. To address these challenges, a Deep Neighborhood-similarity Preservation Hashing (DNsPH) method is proposed for cross-modal retrieval. To obtain high-order statistical features of images, we first design a Context-aware Cross-layer Bilinear Fusion Network (C2BF-Net), which uses Long Short-Term Memory (LSTM) to model the context-dependent features of different convolutional layers. Furthermore, image, text, and semantic label information is fused through an adaptive weighting strategy to reconstruct the joint semantic similarity matrix, which is used to explore the fine-grained neighborhood structure between modalities. Finally, we introduce a multi-similarity loss based on an adaptive margin to mine and weight informative sample pairs, which alleviates the impact of sample imbalance on model training and thereby yields more discriminative hash codes. Extensive experiments on the MIRFLICKR-25K and NUS-WIDE datasets demonstrate that DNsPH outperforms state-of-the-art cross-modal retrieval methods.
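
As a rough illustration of two ideas in the abstract, the sketch below fuses image, text, and label similarities into one joint similarity matrix and ranks database items by Hamming distance between hash codes. It is a minimal sketch under stated assumptions, not the DNsPH implementation: the fixed `weights` stand in for the paper's adaptive weighting strategy, cosine similarity is an assumed choice of similarity measure, and all function names are hypothetical.

```python
import numpy as np

def cosine_sim(x, eps=1e-12):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)
    x = x / norms
    return x @ x.T

def fuse_similarity(img_feat, txt_feat, labels, weights=(0.3, 0.3, 0.4)):
    """Weighted sum of image, text, and label similarity matrices.

    The fixed weights are placeholders (an assumption); the paper
    describes an adaptive weighting strategy instead.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize so the weights sum to 1
    s_img = cosine_sim(img_feat)
    s_txt = cosine_sim(txt_feat)
    s_lab = cosine_sim(labels.astype(float))
    return w[0] * s_img + w[1] * s_txt + w[2] * s_lab

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to a query.

    Codes are assumed to take values in {-1, +1}, so for codes of
    length `bits` the Hamming distance is (bits - dot product) / 2.
    """
    bits = db_codes.shape[1]
    dist = (bits - db_codes @ query_code) // 2
    return np.argsort(dist, kind="stable")
```

Ranking needs only a dot product over short binary codes, which is the source of the low storage cost and fast queries mentioned at the start of the abstract.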

Authors

W W Wang

Changchun University of Science and Technology

Lintao Xian

Weifang Medical University

Ziyuan Cui

China University of Petroleum, Beijing

Journals

Computers

Institutions

Ocean University of China

Changchun University of Science and Technology

Weifang Medical University
