Most existing methods for knowledge graph representation learning focus on structured information and overlook the potential benefits of incorporating multimodal information [14]. Moreover, learning from structural triples alone poses challenges such as insufficient feature semantics, significant gaps between related entities, and low similarity [13]. To address these issues, this paper proposes a multimodal knowledge graph representation learning method based on hyperplane embedding. First, a graph neural network is used as the structural encoder to learn entity embeddings, while a pre-trained visual model is employed as the image encoder to learn image embeddings. Next, the relation in each triple is mapped onto a hyperplane, and the entity and visual representations are projected onto that relation-specific hyperplane to address the multi-relational data problem. Finally, a cross-translation distance function is used to evaluate the probability that each triple is true and to perform link prediction. Experimental results demonstrate the superiority of this approach, with a 0.87% improvement in Hits@10 on the WN18-IMG dataset and a 14.5% improvement on the FB15K-IMG dataset compared to similar models.
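The abstract does not give the exact projection or scoring formulas, so the following is only a minimal sketch of the idea under stated assumptions. It assumes a TransH-style projection onto a relation hyperplane with unit normal `w` (each embedding `e` becomes `e - (w·e)w`) and assumes that the cross-translation distance sums the translation error over the four structure/visual pairings of head and tail. All function names and the four-term sum are hypothetical illustrations, not the authors' implementation; in the paper the inputs would come from the graph neural network and the pre-trained visual model.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(w):
    """Scale w to unit length so it can serve as a hyperplane normal."""
    norm = math.sqrt(dot(w, w))
    return [a / norm for a in w]

def project(e, w):
    """Project e onto the hyperplane with unit normal w: e - (w.e) w."""
    s = dot(w, e)
    return [a - s * b for a, b in zip(e, w)]

def translation_distance(h, r, t):
    """L2 norm of h + r - t; small values mean the triple fits well."""
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(h, r, t)))

def cross_translation_score(h_s, h_v, t_s, t_v, r, w):
    """Assumed scoring: project structural (s) and visual (v) embeddings
    of head and tail onto the relation hyperplane, then sum the
    translation distance over all four head/tail modality pairings."""
    w = normalize(w)
    hs, hv, ts, tv = (project(e, w) for e in (h_s, h_v, t_s, t_v))
    return (translation_distance(hs, r, ts)    # structure -> structure
            + translation_distance(hv, r, tv)  # visual -> visual
            + translation_distance(hs, r, tv)  # structure -> visual
            + translation_distance(hv, r, ts)) # visual -> structure
```

Under these assumptions a lower score indicates a more plausible triple, so link prediction would rank candidate tails (or heads) by ascending score. Because each relation has its own hyperplane, the same entity can receive different projected representations under different relations, which is how hyperplane models typically handle one-to-many and many-to-many relations.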
Zhang et al. studied this question.