What type of study is this?

This is a Human Clinical Trial study.

September 10, 2025

Towards Trustworthy and Effective AI for Academic Policy Navigation: Human Evaluation of a Source-Aware, Domain-Optimized RAG-Based Chatbot

Key Points

AI-assisted tools improve navigation of academic policy, supporting students and staff effectively.
Trust-building interventions yielded a faithfulness score of 0.9597 under the RAGAS framework, outperforming baseline LLM responses.
In a pilot user study, doctoral students reported a mean clarity satisfaction score of 3.60 out of 4.0 and 92% source attribution accuracy.
Targeted design interventions that combine transparency and domain optimization can enhance both trust and effectiveness, though they are not a complete solution for trustworthy AI in academia.

Abstract

Navigating institutional policies remains a challenge for students and staff due to complex legalistic language, hierarchical structures, and dispersed documentation. While Large Language Models (LLMs) such as GPT-4o offer fluent natural language capabilities, their susceptibility to hallucination limits their perceived trustworthiness in academic contexts where factual accuracy and traceability are critical. This study investigates how a combination of transparency-enhancing tactics (specifically, source citation and human-centered evaluation) and domain-specific performance strategies can support the development of more trustworthy and effective AI systems. We present a source-aware, Retrieval-Augmented Generation (RAG)-based chatbot designed to assist users in interpreting Bournemouth University’s Code of Practice for Research Degrees. The system integrates trust-building interventions with performance-enhancing techniques tailored to policy documents, including layout-aware chunking, hybrid self-reranking, and semantic vector search using Pinecone. Quantitative evaluation using the RAGAS framework and BERTScore shows a high faithfulness score (0.9597), outperforming baseline LLM responses. In a pilot user study with doctoral students, participants reported strong satisfaction with clarity (mean score: 3.60/4.0) and source attribution (92% accuracy). While not a complete solution for trustworthy AI, this work demonstrates how targeted design interventions that combine transparency and domain optimization can enhance both trust and effectiveness in AI-assisted academic policy navigation.
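To make the faithfulness figure easier to read, here is a minimal sketch of how a faithfulness-style score is usually defined: the share of claims in a generated answer that the retrieved context supports. This is not the study's evaluation code. The `ClaimVerdict` type, the `faithfulness` function, and the placeholder verdicts are assumptions made for illustration. In the RAGAS framework the claims are extracted and judged by an LLM, whereas this sketch takes the verdicts as given.

```python
from dataclasses import dataclass


@dataclass
class ClaimVerdict:
    """One atomic claim taken from a chatbot answer (hypothetical structure)."""
    claim: str
    supported: bool  # True if the retrieved policy text backs the claim


def faithfulness(verdicts: list[ClaimVerdict]) -> float:
    """Return the fraction of answer claims supported by the retrieved context."""
    if not verdicts:
        raise ValueError("at least one claim is required")
    supported_count = sum(1 for v in verdicts if v.supported)
    return supported_count / len(verdicts)


# Placeholder verdicts for illustration only; these are not data from the study.
example_verdicts = [
    ClaimVerdict("placeholder claim 1", True),
    ClaimVerdict("placeholder claim 2", True),
    ClaimVerdict("placeholder claim 3", False),
    ClaimVerdict("placeholder claim 4", True),
]

example_score = faithfulness(example_verdicts)
```

Under this definition, a score near 1.0 means that almost every statement in the answers could be traced back to the retrieved policy text, which is the property the source citation feature is meant to expose to users.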
