Abstract

Objective: Drug repurposing is particularly challenging yet essential for rare diseases, where limited patient populations and scarce biomedical evidence hinder traditional discovery pipelines. This work presents a holistic machine learning approach for drug–disease link prediction that leverages multiple heterogeneous sources, including biomedical literature, structured databases, and textual descriptions of diseases.

Materials and Methods: Focusing on seven rare neuro-muscular disorders, we construct a biomedical knowledge graph from literature and open databases and use it to evaluate a suite of rule-based, graph neural network, and path-encoding models. An ensemble of the best-performing methods, further enriched with disease similarity features derived from text-based embeddings, is used to generate candidate treatments for each disorder.

Results: Experimental results show that an established graph neural network approach (CompGCN) and a path-encoding method (the Prime Adjacency Matrix framework) outperform the other approaches on metrics such as Mean Reciprocal Rank (MRR). The ensemble of the best-performing methods further improves those metrics, reaching MRR = 0.3145. Manual validation of the top-ranked drugs by rare disease experts indicates a high precision (50%) for drugs that potentially treat a rare disorder or its symptoms.

Discussion: The scarcity of publications and known drug indications for rare neuro-muscular disorders poses serious challenges for identifying potential therapies and symptom relievers. The ensemble predictor combines rule-based, graph neural network, and path-encoding techniques to improve drug repurposing prediction performance on a biomedical knowledge graph created from open data.

Conclusion: Expert evaluation indicates that an ensemble of various knowledge graph link prediction methods can produce promising repurposing hypotheses for disorders lacking any approved therapies.
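For readers unfamiliar with the headline metric, the following is a minimal sketch of how Mean Reciprocal Rank can be computed for drug–disease link prediction. It is an illustration only, not the evaluation code used in this work: the function names, the toy scores, and the identifiers `drug_a` to `drug_d` are hypothetical, and ties are resolved optimistically, which is one of several possible conventions.

```python
from typing import Dict, List, Tuple


def rank_of(scores: Dict[str, float], target: str) -> int:
    """1-based rank of `target` when candidates are sorted by descending score.

    Ties are resolved optimistically: only strictly higher scores push the
    target down. This is an assumption of the sketch, not a fixed standard.
    """
    target_score = scores[target]
    return 1 + sum(1 for s in scores.values() if s > target_score)


def mean_reciprocal_rank(queries: List[Tuple[Dict[str, float], str]]) -> float:
    """Average of 1/rank of the known-correct drug over all test queries."""
    if not queries:
        raise ValueError("at least one query is required")
    return sum(1.0 / rank_of(scores, target) for scores, target in queries) / len(queries)


# Hypothetical toy data: each query holds model scores for candidate drugs
# and the one drug known to be indicated for the query disease.
toy_queries = [
    ({"drug_a": 0.9, "drug_b": 0.4, "drug_c": 0.1, "drug_d": 0.05}, "drug_a"),  # rank 1
    ({"drug_a": 0.2, "drug_b": 0.7, "drug_c": 0.5, "drug_d": 0.1}, "drug_c"),   # rank 2
    ({"drug_a": 0.8, "drug_b": 0.6, "drug_c": 0.3, "drug_d": 0.1}, "drug_d"),   # rank 4
]

mrr = mean_reciprocal_rank(toy_queries)
print(f"MRR = {mrr:.4f}")  # (1 + 1/2 + 1/4) / 3
```

Under this definition, an MRR of about 0.31 corresponds roughly to the correct drug appearing around the third position on average in reciprocal terms, although individual ranks can vary widely.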
Papadimas et al. studied this question.