What type of study is this?

This is a quantitative study.

October 13, 2025 · Open Access

D²HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs

Key Points

D²HScore improves hallucination detection by measuring the semantic breadth and depth of representations in large language models.
The framework combines intra-layer dispersion and inter-layer drift to analyze token representations.
Across five open-source large language models and five widely used benchmarks, D²HScore consistently outperforms existing training-free baselines.
This architecture-based approach provides lightweight, interpretable detection without any training or labels.

Abstract

Although Large Language Models (LLMs) have achieved remarkable success, their practical application is often hindered by the generation of non-factual content, known as "hallucination". Ensuring the reliability of LLMs' outputs is a critical challenge, particularly in high-stakes domains such as finance, security, and healthcare. In this work, we revisit hallucination detection from the perspective of model architecture and generation dynamics. Leveraging the multi-layer structure and autoregressive decoding process of LLMs, we decompose hallucination signals into two complementary dimensions: the semantic breadth of token representations within each layer, and the semantic depth of core concepts as they evolve across layers. Based on this insight, we propose D²HScore (Dispersion and Drift-based Hallucination Score), a training-free and label-free framework that jointly measures: (1) Intra-Layer Dispersion, which quantifies the semantic diversity of token representations within each layer; and (2) Inter-Layer Drift, which tracks the progressive transformation of key token representations across layers. To ensure drift reflects the evolution of meaningful semantics rather than noisy or redundant tokens, we guide token selection using attention signals. By capturing both the horizontal and vertical dynamics of representation during inference, D²HScore provides an interpretable and lightweight proxy for hallucination detection. Extensive experiments across five open-source LLMs and five widely used benchmarks demonstrate that D²HScore consistently outperforms existing training-free baselines.
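The two signals named in the abstract can be illustrated with a rough sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: dispersion is taken as the mean cosine distance of token vectors to their layer centroid, drift as the mean cosine distance of selected tokens between consecutive layers, and the "attention signal" as a single per-token score used to pick the top-k tokens. The function names, the array shapes, and the random inputs are hypothetical and are not the authors' implementation.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero for all-zero vectors


def _unit(x):
    """Normalize vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, EPS)


def intra_layer_dispersion(hidden):
    """Assumed proxy for semantic breadth.

    hidden: array of shape (layers, tokens, dim).
    Returns the cosine distance of each token to its layer centroid,
    averaged over tokens and layers.
    """
    unit = _unit(hidden)
    centroid = _unit(unit.mean(axis=1, keepdims=True))
    cos = (unit * centroid).sum(axis=-1)  # shape (layers, tokens)
    return float((1.0 - cos).mean())


def inter_layer_drift(hidden, attention, top_k=2):
    """Assumed proxy for semantic depth.

    attention: array of shape (tokens,), one score per token
    (a hypothetical stand-in for the attention-guided selection).
    Returns the cosine distance of the top_k tokens between
    consecutive layers, averaged over tokens and layer pairs.
    """
    key_tokens = np.argsort(attention)[-top_k:]
    unit = _unit(hidden[:, key_tokens, :])
    cos = (unit[1:] * unit[:-1]).sum(axis=-1)  # shape (layers - 1, top_k)
    return float((1.0 - cos).mean())


# Synthetic stand-ins for hidden states and attention scores.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(4, 6, 8))  # 4 layers, 6 tokens, 8 dims
attention_scores = rng.random(6)

print("dispersion:", intra_layer_dispersion(hidden_states))
print("drift:", inter_layer_drift(hidden_states, attention_scores))
```

The sketch reports the two quantities separately because the abstract does not say how they are combined into one score. Both need only a forward pass that exposes per-layer hidden states and attention weights, which is consistent with the training-free and label-free framing.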

Cite this study

Ding et al. studied this question.

synapsesocial.com/papers/68ecfebf950606aabec0951f — DOI: https://doi.org/10.48550/arxiv.2509.11569

Also consider

Synapse has enriched 5 closely related papers on similar research questions. Consider them for comparative context:

Cost-Effective Hallucination Detection for LLMs · 2024
Detecting Hallucination in Large Language Models Through Deep Internal Representation Analysis · 2025 · 1 citation
Beyond ROUGE: N-Gram Subspace Features for LLM Hallucination Detection· 2025
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models· 2024 · 4 citations
HARP: Hallucination Detection via Reasoning Subspace Projection· 2025

Authors

Yue Ding

Hunan University

Xiaofan Zhu

Chinese Academy of Medical Sciences & Peking Union Medical College

Tianze Xia
