October 1, 2024

Empowering Research: Open-Source LLMs, Semantic Search, and Domain-Specific Knowledge in a Multi-Document Q&A Assistant

Key Points

The proposed system achieves over 90% accuracy when answering domain-specific questions against research documents supplied by the user.
Of the two models compared, Flan T5 stands out with accuracy above 90%, a 2.2 s response time, and truthful output in 87% of cases.
Combining LLMs with vector databases lets the system interpret natural-language queries and retrieve relevant passages without costly fine-tuning.
Comparing two open-source models, Falcon and Flan T5, reveals the distinct strengths of each, which can guide future improvements in chatbot applications.

Abstract

In an age of information abundance, accessing and extracting knowledge from vast repositories of research documents poses significant challenges for researchers, students, and professionals. Existing chatbot systems that rely on general-purpose Large Language Models (LLMs) often fail to answer domain-specific questions accurately. In addition, the high cost of fine-tuning LLMs for specific domains hinders widespread adoption. To address these limitations, this paper proposes a novel methodology that combines LLMs with vector databases. The system interprets natural-language user queries and retrieves relevant information from research documents supplied by the user, avoiding the need for extensive, resource-intensive fine-tuning. The paper also compares two open-source models hosted on Hugging Face using metrics such as output accuracy, factuality, output token length, and response time. The proposed approach achieved an overall accuracy of more than 90% when expected outputs were compared with LLM-generated outputs using several text comparison metrics, including ROUGE and Semantic Answer Similarity. A comprehensive evaluation of the responses from Falcon and Flan T5 reveals distinct strengths: Flan T5 stands out with accuracy exceeding 90%, an efficient response time of 2.2 s, and truthful output in 87% of cases. These insights also help clarify where each model performs best.
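The retrieve-then-answer flow described in the abstract, along with the ROUGE-style comparison of expected and generated answers, can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a learned embedding model, `ToyVectorStore` stands in for a real vector database, and the names `build_prompt` and `rouge1_f1` are hypothetical. The ROUGE-1 F1 shown here is a simplified unigram-overlap version without stemming. In a real system the assembled prompt would be sent to a model such as Flan T5 or Falcon.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for an embedding model: a sparse bag-of-words vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory substitute for a vector database."""

    def __init__(self):
        self.entries = []  # list of (chunk_text, vector) pairs

    def add(self, chunk):
        self.entries.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the retrieved chunks and the question into one LLM prompt."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def rouge1_f1(expected, generated):
    """Simplified ROUGE-1 F1: unigram overlap between two texts."""
    ref = Counter(tokenize(expected))
    hyp = Counter(tokenize(generated))
    overlap = sum((ref & hyp).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Index a few illustrative chunks (made-up text, not from the paper).
store = ToyVectorStore()
for chunk in [
    "Vector databases index document embeddings for similarity search.",
    "Fine-tuning a large language model for one domain is expensive.",
    "ROUGE measures n-gram overlap between two texts.",
]:
    store.add(chunk)

question = "How do vector databases support similarity search?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
```

Because the model only sees the retrieved chunks at query time, new documents can be added to the store without retraining, which is the reason this design avoids fine-tuning.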
