Large language models (LLMs) are increasingly used to assist with digital forensics and incident response (DFIR): summarizing artifacts, suggesting hypotheses, drafting timelines, and proposing remediation steps. One failure mode that practitioners have not adequately named or categorized is forensic hallucination: the production of confident, coherent, forensically plausible narratives that the supplied evidence does not support. Unlike hallucination in general-purpose LLM use, forensic hallucination carries specific professional and legal consequences: fabricated evidence cited in incident reports, unfounded attribution claims, and remediation recommendations that contaminate the evidence environment. This technical note provides four artifacts for DFIR teams integrating LLMs into their workflows: a reconstruction of the unsafe prompt pattern that produces forensic hallucination; a five-category DFIR hallucination taxonomy with labeled failure patterns for use in eval design and guardrail specification; a cross-model failure sketch describing the hallucination profiles of different frontier model types on identical unsafe DFIR inputs; and a forensic-safe prompt template that constrains LLM output to evidence-bound analysis with explicit uncertainty handling.
Narnaiezzsshaa Truong (Fri,) studied this question.
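To make the idea of an evidence-bound prompt concrete, the following is a minimal Python sketch. It is not the template from the technical note: the `Artifact` structure, the rule wording, the OBSERVED/INFERRED/UNKNOWN labels, and both function names are assumptions introduced here for illustration only. The sketch builds a prompt that restricts the model to supplied artifacts and requires artifact citations, then checks a response for citations that point at artifacts never provided.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """One piece of evidence handed to the model (hypothetical structure)."""
    artifact_id: str
    source: str
    content: str


def build_evidence_bound_prompt(question: str, artifacts: list[Artifact]) -> str:
    """Assemble a prompt that limits the model to the supplied artifacts."""
    if not artifacts:
        # Without evidence, any answer would be unsupported narrative.
        raise ValueError("refusing to build a forensic prompt with no evidence")
    evidence = "\n\n".join(
        f"[{a.artifact_id}] source={a.source}\n{a.content}" for a in artifacts
    )
    # Illustrative rule wording; not taken from the source document.
    rules = "\n".join([
        "1. Use only the artifacts listed under EVIDENCE.",
        "2. Cite the artifact ID in brackets for every factual claim.",
        "3. Label each statement OBSERVED, INFERRED, or UNKNOWN.",
        "4. If the evidence does not answer the question, say so; do not guess.",
        "5. Make no attribution claims, and flag any remediation step "
        "that could alter evidence.",
    ])
    return f"RULES:\n{rules}\n\nEVIDENCE:\n{evidence}\n\nQUESTION:\n{question}\n"


def unknown_citations(response: str, artifacts: list[Artifact]) -> list[str]:
    """Return bracketed IDs cited in the response that were never supplied."""
    known = {a.artifact_id for a in artifacts}
    cited = set(re.findall(r"\[([A-Za-z0-9_-]+)\]", response))
    return sorted(cited - known)
```

A caller would pass the built prompt to whichever model is in use and run `unknown_citations` on the reply. A non-empty result signals a citation to evidence that was not provided, which is one cheap guardrail check. It does not detect unsupported claims that cite a real artifact, so it complements rather than replaces human review.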