What does this research mean for the field?

The MEPAM framework, a knowledge graph-enhanced large language model system, outperforms general-purpose LLMs such as GPT-4o in answering questions about microbial enzyme production and catalysis, reaching higher answer accuracy (0.86 vs. 0.52) and nearly eliminating hallucinations. Novelty: methodological. Consensus alignment: neutral.

May 1, 2026 | Open Access

Decoding enzymatic landscapes: a knowledge graph–enhanced large language model framework for microbial enzyme production and catalysis systems

Abstract

Microbial enzyme production and catalysis systems are a crucial aspect of biotechnological research. However, building them from trustworthy published experimental data presents a major obstacle for both manual and automated techniques. Here, we introduce MEPAM (Microbial Enzyme Production and Catalytic Activity based on LLM), a question-answering system designed to accurately address inquiries related to enzyme production and catalytic reactions. Specifically, by training three machine learning models with >0.98 accuracy, we identified 11,068 high-quality, relevant articles from the Web of Science. Leveraging DeepSeek-V3 with zero-shot learning, we developed an ontology-driven knowledge representation that extracted 12,434 entities and 35,918 relations with 0.78 extraction accuracy and constructed a structured knowledge graph. Compared to few-shot learning and other machine learning methods, our framework achieved significantly higher extraction accuracy. Using this framework, we developed MEPAM based on retrieval-augmented generation and prompt engineering. Finally, using MEPAM, we extracted a comprehensive network involving the expression profiles, precise culture conditions, and substrate preferences for cellulase, demonstrating the strong utility of this tool. Compared with traditional LLMs, particularly GPT-4o, MEPAM exhibited superior performance, achieving significantly higher answer accuracy (0.86 vs. 0.52) and nearly eliminating hallucinations. MEPAM is available at http://180.76.108.212. This framework provides context-rich, verifiable insights, thus bridging predictive modeling with experimental validation to facilitate the exploration of microbial enzymatic systems.
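
The abstract describes answering questions by retrieving facts from a knowledge graph and passing them to an LLM through a prompt. The following is a minimal Python sketch of that general idea, not the authors' implementation. The triples, the relation names (`produces`, `acts_on`), the substring-based entity matching, and the helper names `build_index`, `retrieve`, and `build_prompt` are all assumptions made for illustration; the real MEPAM ontology, retrieval method, and prompts are not given in the abstract.

```python
from collections import defaultdict

# Toy triples for illustration only; they are NOT taken from the MEPAM graph.
TRIPLES = [
    ("Trichoderma reesei", "produces", "cellulase"),
    ("cellulase", "acts_on", "cellulose"),
    ("Aspergillus niger", "produces", "glucoamylase"),
    ("glucoamylase", "acts_on", "starch"),
]


def build_index(triples):
    """Map each lowercased entity name to the triples that mention it."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head.lower()].append((head, relation, tail))
        index[tail.lower()].append((head, relation, tail))
    return index


def retrieve(question, index):
    """Return triples whose head or tail appears verbatim in the question.

    Substring matching is a stand-in (assumption) for a real entity linker.
    """
    q = question.lower()
    hits = []
    for entity, facts in index.items():
        if entity in q:
            for fact in facts:
                if fact not in hits:  # keep each triple once, in first-seen order
                    hits.append(fact)
    return hits


def build_prompt(question, facts):
    """Assemble a grounded prompt that restricts the model to retrieved facts."""
    if facts:
        context = "\n".join(f"- {h} {r} {t}" for h, r, t in facts)
    else:
        context = "- (no matching facts found)"
    return (
        "Answer using only the facts below. "
        "If they are insufficient, say so.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    index = build_index(TRIPLES)
    question = "Which organism produces cellulase?"
    print(build_prompt(question, retrieve(question, index)))
```

Restricting the prompt to retrieved triples is one common way such systems try to make answers traceable to the graph; the string returned by `build_prompt` would then be sent to whichever LLM the system uses.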

Cite This Study

Tong et al. studied this question.

synapsesocial.com/papers/6a1dc84cd10dad54e1ef5484 https://doi.org/10.1016/j.abiote.2026.100059
