In the age of information abundance, accessing and extracting knowledge from vast repositories of research documents poses significant challenges for researchers, students, and professionals. Existing chatbot systems, which rely on general-purpose Large Language Models (LLMs), often fail to provide accurate responses to domain-specific inquiries. Additionally, the high cost of fine-tuning LLMs for specific domains hinders widespread adoption. To address these limitations, this paper proposes a novel methodology that combines LLMs with vector databases. The system interprets natural language user queries and retrieves relevant information from research documents provided by the user, circumventing the need for extensive and resource-intensive fine-tuning. Furthermore, the paper compares two open-source models hosted on Hugging Face, Falcon and Flan-T5, on metrics including output accuracy, factuality, output token length, and response time. The proposed approach achieved an overall accuracy of more than 90% when the expected output was compared with the LLM-generated output using various text comparison metrics, including ROUGE and Semantic Answer Similarity. Our evaluation of the responses from Falcon and Flan-T5 reveals distinct strengths. Flan-T5 stands out with an accuracy exceeding 90%, an efficient response time of 2.2 s, and truthful output in 87% of cases. These insights also contribute to understanding each model's unique area of excellence.
Ghali et al. studied this question.
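The retrieve-then-compare idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy term-count vector in place of a learned embedding model, an in-memory list in place of a vector database, and a hand-written unigram-overlap score as a stand-in for ROUGE-1. The function names (`embed`, `retrieve`, `rouge1_f1`) and the sample chunks are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy 'embedding': a sparse term-count vector.

    Assumption: a real system would call an embedding model here.
    """
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=1):
    """Return the k document chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def rouge1_f1(expected, generated):
    """Unigram-overlap F1 between an expected and a generated answer."""
    ref = Counter(tokenize(expected))
    hyp = Counter(tokenize(generated))
    overlap = sum((ref & hyp).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical document chunks standing in for user-provided research papers.
chunks = [
    "Vector databases store embeddings and support similarity search.",
    "Fine-tuning a large language model is expensive.",
    "ROUGE compares generated text with a reference answer.",
]

question = "How do vector databases search embeddings?"
context = retrieve(question, chunks, k=1)[0]

# The retrieved chunk is placed in the prompt, so the LLM itself needs no fine-tuning.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a full pipeline, `prompt` would be sent to a model such as Falcon or Flan-T5, and the generated answer would be scored against the expected answer with `rouge1_f1` or a semantic similarity metric.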