Large Language Models (LLMs) are transforming information access and decision support across domains, yet their application in safety-critical settings remains limited by hallucination, lack of domain grounding, and limited interpretability. To address these issues, Graph Retrieval-Augmented Generation (GraphRAG) has emerged as a paradigm that integrates LLMs with knowledge graphs, enabling more coherent, faithful, and context-aware outputs through structured semantic retrieval. This exploratory study applies GraphRAG to the domain of industrial safety, focusing on incidents involving Lockout/Tagout (LOTO) procedure failures. By integrating a Neo4j-based knowledge graph, constructed from a set of accident narratives extracted from the OSHA database, with the generative capabilities of GPT-4o, we assess the system’s ability to produce coherent, complete, and decision-relevant answers grounded in structured safety data. A total of 150 questions, categorized into six task types, were used to evaluate model performance across six metrics: Coherence, Completeness, Empowerment, Faithfulness, F1 Score, and Relevance. The results highlight GraphRAG’s strengths in tasks aligned with graph structure, particularly Summarization, Classification, and Recommendation, while revealing performance limitations in more cognitively demanding tasks such as Reasoning and Comparison. The evaluation underscores the value of structured semantics in enhancing generation quality but also points to scalability and interpretability challenges.
Salvi et al. studied this question.