The key challenge in intelligently recommending multimodal data within think tank systems is meeting users’ cross-modal and diverse data requirements. To address this challenge, this paper proposes a hierarchical intelligent recommendation method for multimodal data in think tank systems, grounded in deep semantic learning. The method builds on a deep semantic space model based on VGGNet and LSTM, which learns semantic features from different modalities and fuses them through correlation mapping. A target dataset for multimodal data recommendation is first constructed, and an improved DN-CBR model is developed by integrating VGGNet, LSTM, attention mechanisms, graph neural networks, and a multilayer perceptron (MLP). Specifically, VGGNet and LSTM extract deep semantic features from images and text, respectively, which are then aligned and fused into an isomorphic semantic space. Attention mechanisms and graph neural networks capture collaborative features between users and multimodal data, and these features are integrated with the fused semantic features to form high-level representations. The resulting representations are fed into the MLP, which outputs a score reflecting the degree of match between user needs and multimodal data; a hierarchical recommendation list is then produced from these scores. Experimental results show that the proposed method achieves its best recommendation performance when the list length is set to 15 and that it can recommend hierarchical multimodal data that meet the actual needs of different users. The recommendation performance is stable and reliable, and the method effectively satisfies users’ cross-modal and diverse data requirements.
Xie et al. studied this question.
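The data flow described in the abstract (modality features, fusion into a shared space, combination with collaborative features, MLP scoring, and a ranked list of length 15) can be illustrated with a minimal NumPy sketch. This is not the DN-CBR model itself: the random vectors stand in for VGGNet and LSTM outputs and for the attention/GNN collaborative features, the averaging fusion is a simple substitute for the paper's correlation mapping, the weights are untrained, and all names, dimensions, and the three-tier split of the list are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so modalities are comparable."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def project(features, weight):
    """Linear map into a shared semantic space, then L2 normalization."""
    return l2_normalize(features @ weight)

def mlp_score(x, w1, b1, w2, b2):
    """One-hidden-layer MLP with ReLU; sigmoid output in (0, 1)."""
    hidden = np.maximum(0.0, x @ w1 + b1)
    logits = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))

def hierarchical_list(scores, list_length=15, n_levels=3):
    """Take the top `list_length` items and split them into ranked tiers.

    Tiering by score rank is an assumption; the abstract does not
    specify how the hierarchy is defined.
    """
    order = np.argsort(-scores)[:list_length]
    return [tier.tolist() for tier in np.array_split(order, n_levels)]

# Hypothetical sizes: 40 candidate items, stand-in feature dimensions.
n_items, d_img, d_txt, d_shared, d_collab, d_hidden = 40, 512, 128, 64, 32, 16

img_feats = rng.normal(size=(n_items, d_img))        # stand-in for VGGNet output
txt_feats = rng.normal(size=(n_items, d_txt))        # stand-in for LSTM output
collab_feats = rng.normal(size=(n_items, d_collab))  # stand-in for attention/GNN output

# Untrained projection weights into the shared space.
w_img = rng.normal(scale=0.05, size=(d_img, d_shared))
w_txt = rng.normal(scale=0.05, size=(d_txt, d_shared))

# Fuse the two modalities by averaging their shared-space embeddings
# (a simple substitute for correlation mapping).
fused = 0.5 * (project(img_feats, w_img) + project(txt_feats, w_txt))

# Concatenate fused semantic features with collaborative features.
joint = np.concatenate([fused, collab_feats], axis=1)

# Untrained MLP weights.
w1 = rng.normal(scale=0.1, size=(d_shared + d_collab, d_hidden))
b1 = np.zeros(d_hidden)
w2 = rng.normal(scale=0.1, size=d_hidden)
b2 = 0.0

scores = mlp_score(joint, w1, b1, w2, b2)
tiers = hierarchical_list(scores, list_length=15, n_levels=3)

for level, items in enumerate(tiers, start=1):
    print(f"tier {level}: items {items}")
```

In a trained system the projection and MLP weights would be learned from user interaction data, and the list length would be a tunable parameter; the value 15 is used here only because the abstract reports it as the best-performing setting.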