Graphene has attracted broad multidisciplinary interest for its exceptional properties and its applications in materials science, engineering, physics, energy storage, and electronics. However, integrating this vast and heterogeneous body of knowledge into cohesive interdisciplinary research remains challenging: it requires highly specialized expertise, rigorous experimental design, and efficient literature retrieval. To address these issues, GrapheneChat was developed as the first fine-tuned large language model (LLM) designed specifically for graphene research. Trained on comprehensive datasets of monographs and scholarly articles, GrapheneChat uses a two-stage strategy of supervised fine-tuning (SFT) followed by direct preference optimization (DPO) to improve domain-specific reasoning and experimental design. A retrieval-augmented generation (RAG) framework allows the model to deliver literature-grounded, reference-supported responses for knowledge retrieval. Quantitative evaluation on the newly developed GrapheneBench shows that GrapheneChat achieves 91% accuracy, comparable to state-of-the-art models such as GPT-4, while requiring fewer computational resources. As an intelligent research assistant, GrapheneChat facilitates interdisciplinary innovation and establishes a paradigm for building domain-specific LLMs that enhance expert productivity in literature mining.
Yang et al. studied this question.
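To make the second training stage concrete, the following is a minimal sketch of the standard DPO objective for a single preference pair, written in plain Python. It is illustrative only and is not taken from GrapheneChat's training code: the function name `dpo_loss`, the default `beta=0.1`, and the example log-probabilities are all assumptions, since the abstract reports no hyperparameters. Each argument is the summed log-probability of a full response under either the policy being trained or the frozen reference model (typically the SFT checkpoint).

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    beta is a hypothetical value; the source does not state one.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # Loss is -log(sigmoid(logits)), computed in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities: the policy has shifted toward the chosen
# response and away from the rejected one, relative to the reference.
loss = dpo_loss(policy_chosen_logp=-1.0, policy_rejected_logp=-3.0,
                ref_chosen_logp=-2.0, ref_rejected_logp=-2.0)
print(f"DPO loss: {loss:.4f}")
```

When the policy and the reference assign identical log-probabilities, the loss equals log 2; it falls below that value as the policy favors the preferred response more strongly than the reference does.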