Introduction Cinematic VR (CVR) removes the director's frame, creating the challenge of guiding audience attention without breaking immersion. This systematic review synthesizes empirical evidence on two audio modalities with strong potential to function as narrative agents: diegetic audio (sounds from within the story world) and object-based spatial audio (discrete sound "objects" rendered with positional metadata). It aims to clarify how these modalities guide attention and shape affect and presence, and how these effects are validated.
Methods Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines, searches of IEEE Xplore, ACM Digital Library, Scopus, and Web of Science (last searched June 2025) identified studies that used diegetic and/or object-based spatial audio as narrative devices in CVR and reported empirical user data. Studies of non-diegetic audio only, and purely technical papers without user measures, were excluded (except where used as baselines). We conducted a qualitative synthesis across behavioral (head/eye tracking), subjective (presence/engagement), and physiological (HR/EMG/EDA/PPG) measures; no protocol registration was performed.
Results Eighteen studies met the inclusion criteria. Across studies, world-locked, off-screen diegetic cues were repeatedly reported to redirect gaze and shorten time-to-region-of-interest after cuts. Object-based rendering enabled precise, dynamic cue placement and was commonly associated with higher presence/immersion and affective arousal relative to non-spatial or head-locked baselines.
Discussion Evidence remains constrained by methodological heterogeneity, small-to-moderate samples, inconsistent reporting, and limited direct measures of narrative comprehension.
We contribute (i) a functional taxonomy of CVR narrative audio techniques aligned to diegetic/object-based practice, (ii) a Validation Triangulation Framework integrating behavioral, subjective, and physiological evidence, and (iii) a Minimum Reporting and Sharing Standard for CVR Narrative Audio specifying what to report and how to share data/metadata, aligned with PRISMA guidance and the Findable, Accessible, Interoperable, Reusable (FAIR) principles.
Perumal et al. (Mon,) studied this question.