September 10, 2025 (Open Access)

Vision-Augmented RAG System for Interactive Local Heritage Exploration

Key Points

The system is a multimodal conversational assistant, deployed as a web application, that supports exploration of cultural heritage sites through text and image queries.
The object detection model achieves a mean Average Precision (mAP@50) of 0.995 when recognizing monuments in user-uploaded images.
The RAG pipeline scores 0.96 in context recall, 0.98 in faithfulness, and 0.88 in factual correctness.
Combining vision and language models shows potential for delivering reliable, engaging support for culturally informed tourism.
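
The mean Average Precision reported for the detector is the mAP@50 variant (as stated in the abstract), which conventionally counts a predicted box as correct when it has the right class and overlaps a ground-truth box with an Intersection over Union (IoU) of at least 0.5. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code; the function names, the `(x1, y1, x2, y2)` box format, and the placeholder label `monument_a` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, threshold=0.5):
    """A prediction matches at the '@50' level: same class and IoU >= 0.5."""
    return pred_label == gt_label and iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the prediction is shifted 2 px right of the ground truth.
matched = is_true_positive((2, 0, 12, 10), "monument_a", (0, 0, 10, 10), "monument_a")
```

Average Precision is then computed per class from the precision-recall curve built over such matches, and mAP@50 is the mean over classes; that aggregation step is omitted here.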

Abstract

This paper presents a multimodal conversational assistant deployed as a web application to enhance the exploration of cultural heritage sites. The system enables users to query contextually relevant tourism information through both text and images, offering a seamless and interactive experience. At its core, a custom Retrieval-Augmented Generation (RAG) architecture retrieves relevant knowledge from a vector database and conditions a Large Language Model (LLM) to generate accurate and coherent responses. For visual queries, the assistant leverages a You Only Look Once (YOLO) object detection model to recognize monuments from user-uploaded images, providing concise descriptions and supporting follow-up conversations grounded in visual context. The object detection model achieves a high mean Average Precision (mAP@50) of 0.995, while the RAG pipeline demonstrates strong performance with 0.96 in context recall, 0.98 in faithfulness, and 0.88 in factual correctness. This work highlights the potential of combining vision and language models to deliver reliable and engaging support for culturally informed tourism.
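
The retrieve-then-generate flow described in the abstract can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the abstract does not name the embedding model, vector database, or prompt template, so the bag-of-words `embed` function, the in-memory `DOCS` list, the prompt wording, and the placeholder monument texts are all hypothetical stand-ins. The optional `detected_label` argument shows how a class name from the object detector might be used to ground a follow-up question.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query (in-memory 'vector store')."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, detected_label=None):
    """Assemble the text that would condition the LLM on the retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    visual = f"Detected monument: {detected_label}\n" if detected_label else ""
    return (
        "Answer using only the context below.\n"
        f"{visual}Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Placeholder knowledge base; these sentences are invented for the example.
DOCS = [
    "Monument A is a stone temple located in the old square.",
    "Monument B is a brick tower near the river.",
    "Local buses run every hour from the main station.",
]

question = "Where is the stone temple?"
top_passages = retrieve(question, DOCS, k=1)
prompt = build_prompt(question, top_passages, detected_label="monument_a")
```

In a real deployment the resulting `prompt` would be sent to the LLM, and `embed` and `retrieve` would be replaced by a learned embedding model and a vector database query.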

Authors

Ritu Ram Ojha

Institutions

Tribhuvan University

Institute of Engineering
