This study presents the design and implementation of an intelligent service system that integrates generative AI, contextual retrieval, and adaptive interaction mechanisms to support frontline consultation in hospital environments. Motivated by challenges in HIV and infectious-disease counseling, such as limited manpower, varying patient needs, and the demand for timely, standardized information, this research proposes a modular AI service architecture capable of delivering context-aware and empathetic responses. The system combines multi-stage speech interaction, Retrieval-Augmented Generation (RAG), and domain-adapted prompting strategies to keep responses contextually consistent and grounded in retrieved material. A distance-sensing trigger module initiates the interaction, after which Voice Activity Detection (VAD) and speech-to-text (STT) modules capture the user's speech input stably. The RAG module searches a curated medical knowledge base for domain-relevant context, and the large language model (GPT-4o-mini) uses that context to generate grounded, comprehensible responses. The system’s multimodal output (synchronized text, speech, and animated visual feedback) supports user engagement and interaction fluency. An experimental evaluation conducted in collaboration with a regional hospital, under controlled or semi-controlled conditions, yielded positive results for language-generation quality and an exploratory usability assessment (BLEU-4 = 0.71; ROUGE-L = 0.83; fluency score = 4.6/5). Scenario-based evaluations further suggest that the prototype can support retrieval-grounded consultation workflows and provide traceable interaction records under the evaluated conditions.
Overall, the proposed system demonstrates the feasibility of integrating retrieval-grounded language generation, multimodal interaction, and embedded sensing into a human-centered consultation-support framework for healthcare service environments.
Pan et al. (Thu,) studied this question.
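As a rough illustration of the interaction flow summarized above (distance trigger, transcript, retrieval, prompt assembly), the following minimal Python sketch may help. It is not the authors' implementation: the trigger threshold, the `Passage` type, the placeholder knowledge-base entries, the keyword-overlap retriever, and every function name are assumptions made for this example. The VAD, STT, and language-model stages are left out, so the transcript is passed in as plain text and the function stops at the assembled prompt.

```python
import re
from dataclasses import dataclass

# All names and values below are hypothetical; none are taken from the paper.
TRIGGER_DISTANCE_CM = 80.0  # assumed threshold for the distance-sensing trigger


@dataclass
class Passage:
    source: str  # identifier kept so each answer can be traced to its source
    text: str


# Placeholder entries standing in for a curated medical knowledge base.
KNOWLEDGE_BASE = [
    Passage("kb/testing", "Placeholder passage about testing procedures and appointment booking."),
    Passage("kb/medication", "Placeholder passage about medication schedules and refill requests."),
    Passage("kb/visiting", "Placeholder passage about clinic opening hours and visiting rules."),
]


def should_start(distance_cm: float) -> bool:
    """Start an interaction only when a visitor is within the trigger distance."""
    return distance_cm <= TRIGGER_DISTANCE_CM


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, k: int = 2) -> list:
    """Rank passages by token overlap with the query (a stand-in for real retrieval)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(p.text)), p) for p in KNOWLEDGE_BASE]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query: str, passages: list) -> str:
    """Assemble a prompt that asks the model to answer from retrieved context only."""
    context = "\n".join(f"[{p.source}] {p.text}" for p in passages)
    return (
        "Answer using only the context below and cite the source tags.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


def handle_visit(distance_cm: float, transcript: str):
    """Run trigger, retrieval, and prompt assembly; return None if not triggered."""
    if not should_start(distance_cm):
        return None
    return build_prompt(transcript, retrieve(transcript))
```

Calling `handle_visit(50.0, "How do I book a testing appointment?")` returns a prompt whose context contains the `kb/testing` placeholder, while a visitor at 200 cm produces no interaction. Keeping the source tag alongside each passage is one simple way to make responses traceable; a deployed system would presumably use embedding-based retrieval rather than token overlap.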