This paper presents the design, implementation and evaluation of an agentic virtual assistant (VA) for a medical clinic. The assistant combines large language models (LLMs) with retrieval-augmented generation (RAG) and multi-agent artificial intelligence (AI) frameworks to enhance reliability, clinical accuracy and explainability. It is built around an orchestrator architecture in which a central agent dynamically routes user queries to specialized tools for retrieval-augmented question answering (Q&A), document interpretation and appointment scheduling. The implementation combines LangChain and LangGraph with interactive visualizations that track reasoning steps. Prompts written for Gemini 2.5 Flash define tool usage and strict formatting rules, which helps maintain reliability and mitigate hallucinations. Prompt engineering therefore plays an important role in the implementation and is designed to assist the patient in the human–computer interaction. Evaluation with qualitative and quantitative metrics, including ROUGE, BLEU, LLM-as-a-judge and sentiment analysis, confirms that the multi-agent architecture improves reliability, interpretability, accuracy, context-aware performance and alignment with medical requirements while supporting diverse clinical tasks. The evaluation also shows that Gemini 2.5 Flash combined with clinic-specific RAG significantly improves response quality, grounding and coherence compared with earlier models. SBERT analyses confirm strong semantic alignment across configurations, while LLM-as-a-judge scores highlight the superior relevance and completeness of the Gemini 2.5 Flash RAG configuration. Although some limitations remain, the updated system provides a more reliable and context-aware solution for clinical question answering.
Tanasă et al. studied this question.
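The orchestrator pattern described in the abstract, a central agent that routes each query to one of three specialized tools, can be illustrated with a minimal sketch. This is not the authors' implementation: the actual system relies on Gemini 2.5 Flash with LangChain and LangGraph to make the routing decision, whereas the sketch below substitutes a simple keyword heuristic so that it runs without any external service. All function names, the `ROUTES` table and the trigger keywords are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToolResult:
    tool: str    # name of the tool that handled the query
    answer: str  # text returned to the user


# Placeholder tools (hypothetical). A real tool would call a retriever,
# an LLM or a calendar backend instead of returning a tagged string.
def rag_qa(query: str) -> str:
    return f"[rag_qa] grounded answer for: {query}"


def interpret_document(query: str) -> str:
    return f"[interpret_document] summary for: {query}"


def schedule_appointment(query: str) -> str:
    return f"[schedule_appointment] booking request for: {query}"


# Hypothetical routing table: (tool name, trigger keywords, handler).
# Assumption: keyword matching stands in for the LLM's routing decision.
ROUTES = [
    ("schedule_appointment", ("appointment", "book", "reschedule"), schedule_appointment),
    ("interpret_document", ("report", "lab result", "document"), interpret_document),
]


def orchestrate(query: str) -> ToolResult:
    """Route a query to the first matching tool, falling back to RAG Q&A."""
    text = query.lower()
    for name, keywords, handler in ROUTES:
        if any(keyword in text for keyword in keywords):
            return ToolResult(name, handler(query))
    # No specialized tool matched: answer from the clinic knowledge base.
    return ToolResult("rag_qa", rag_qa(query))


if __name__ == "__main__":
    for q in (
        "Can I book an appointment for Monday?",
        "What does my lab result mean?",
        "What are the clinic opening hours?",
    ):
        result = orchestrate(q)
        print(f"{result.tool}: {result.answer}")
```

Falling back to retrieval-augmented Q&A when no specialized tool matches is one reasonable design choice for such a router, since it keeps general questions grounded in clinic documents. The paper's own fallback behaviour is not specified in the abstract.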