Background: Large language models (LLMs) are increasingly used in the biomedical field for information retrieval, information extraction, and knowledge discovery. However, their potential for retrieving and discovering drug combinations for diseases remains underexplored.

Objective: This study aims to evaluate the effectiveness of LLMs in retrieving known drug combinations and to identify novel drug combinations for treating Alzheimer's disease (AD).

Methods: We developed a series of prompts to guide LLMs in retrieving drug combinations. Their performance was evaluated using both FDA-approved combinations and combinations identified through PubMed literature mining. We then assessed the feasibility of identifying novel drug combination candidates for AD. In collaboration with domain experts, we performed pathway enrichment analyses to evaluate the candidates' potential mechanisms of action within the context of AD.

Results: In a comparative evaluation of multiple LLMs, GPT-5 demonstrated the strongest overall performance, achieving an accuracy of 0.95 and a balanced F1 score of 0.95 in identifying FDA-approved drug combinations. Among the top 10 drug-combination candidates for AD treatment suggested by GPT-5, the combination of donepezil and memantine is already FDA-approved. Three other combinations have been tested in AD clinical trials, and three have supporting evidence in the literature. We also identified 10 off-label drug combinations, with pathway enrichment analyses indicating that several target key AD-related biological pathways.

Conclusions: LLMs are effective at retrieving drug combinations for a given disease, although performance varies across models, with GPT-5 performing best. However, suggestions from LLMs require further validation before they can be considered reliable.
Wang et al. studied this question.