This study presents early experiments with a modular AI framework for multimodal drug discovery that integrates natural-product-based therapies with modern pharmaceuticals. The system combines structured biomedical data, knowledge graphs, and large language models (LLMs) to generate explicit natural language hypotheses. The architecture has four phases (data aggregation, hypothesis generation, dynamic simulation, and in silico evaluation) and supports two inference routes: compound → gene → disease/phenotype and disease/phenotype → gene → compound. As a case study, Phases 1 and 2 were applied to the Kampo formula Shakuyaku-kanzo-to, a typical example of a multicomponent, multi-target natural therapy. The framework originally arose from a limitation of conventional filtering, in which important but poorly annotated compounds were often overlooked. The focus has since shifted beyond filtering toward uncovering hidden relationships across fragmented biomedical knowledge. This early implementation demonstrates the potential of natural language hypothesis generation to restructure fragmented knowledge into interpretable insights, and it provides a blueprint for future multimodal drug discovery.
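The two inference routes can be illustrated with a minimal sketch, assuming the aggregated data are reduced to compound-gene and gene-phenotype edges. Every name here (`compound_A`, `gene_X`, `phenotype_P`, `forward_route`, `reverse_route`, `to_hypothesis`) is a hypothetical placeholder rather than part of the described system, the edges are not real biomedical data, and the fixed sentence template merely stands in for the LLM-based hypothesis generation.

```python
from collections import defaultdict

# Hypothetical placeholder edges for illustration only; not real biomedical data.
COMPOUND_GENE = [
    ("compound_A", "gene_X"),
    ("compound_B", "gene_X"),
    ("compound_B", "gene_Y"),
]
GENE_PHENOTYPE = [
    ("gene_X", "phenotype_P"),
    ("gene_Y", "phenotype_Q"),
]


def index(edges):
    """Build forward and reverse adjacency maps from (source, target) pairs."""
    fwd, rev = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        fwd[src].add(dst)
        rev[dst].add(src)
    return fwd, rev


C2G, G2C = index(COMPOUND_GENE)
G2P, P2G = index(GENE_PHENOTYPE)


def forward_route(compound):
    """Enumerate compound -> gene -> phenotype paths, sorted for stable output."""
    return sorted(
        (compound, gene, phenotype)
        for gene in C2G.get(compound, ())
        for phenotype in G2P.get(gene, ())
    )


def reverse_route(phenotype):
    """Enumerate phenotype -> gene -> compound paths, sorted for stable output."""
    return sorted(
        (phenotype, gene, compound)
        for gene in P2G.get(phenotype, ())
        for compound in G2C.get(gene, ())
    )


def to_hypothesis(compound, gene, phenotype):
    """Render one path as a sentence; a template stand-in for an LLM call."""
    return f"{compound} may influence {phenotype} via {gene}."


if __name__ == "__main__":
    for compound, gene, phenotype in forward_route("compound_B"):
        print(to_hypothesis(compound, gene, phenotype))
    for phenotype, gene, compound in reverse_route("phenotype_P"):
        print(to_hypothesis(compound, gene, phenotype))
```

Both routes traverse the same edge sets in opposite directions, so a compound with few annotations can still be reached from the phenotype side as long as it shares at least one gene link.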
Wu et al. (Sun,) studied this question.