The increasing complexity of production systems and the shortage of skilled labor highlight the growing importance of digital services that support machine operators in guided operations and maintenance tasks, such as fault diagnosis and repair. Recent advances in foundation models with sophisticated language and image processing capabilities offer promising new avenues for natural human-machine interaction, improved information retrieval, and effective knowledge management in industrial contexts. However, challenges remain in integrating domain-specific knowledge into these models, particularly in minimizing hallucinations and ensuring accurate, reliable system behavior. Additionally, general evaluation metrics often fail to capture the nuanced performance of retrieval-augmented generation (RAG) systems in specific industrial domains, which calls for rigorous, domain-aware validation approaches. This paper presents the design and evaluation of a RAG system for guided operations and maintenance, developed using documentation from a machine manufacturer. The system is evaluated against three key criteria: conformance with the provided context, completeness of answers in relation to the user query, and response latency. An orchestrated approach combining manual and automated evaluation methods is proposed to assess the individual components of the RAG pipeline, including database design, retrieval quality, contextual prompting, and foundation model selection. Results from the initial prototype show over 90% conformance and more than 80% answer completeness, supporting both the technical feasibility and practical relevance of foundation model-based support systems for this application. The study contributes a novel evaluation approach and provides empirical evidence for the integration of RAG architectures in industrial guided operations and maintenance scenarios.
Wulf et al. studied this question.