Suspense in Shorts

This poster presents Suspense in Shorts, a computational literary studies project conducted within the Undergraduate Research Apprentice Program (URAP) at the University of California, Berkeley (January–May 2026). The project operationalizes and measures suspense in English-language short fiction (UK/US) by applying four approaches to literary suspense as LLM annotation prompts: Carroll (1997) and Iwata (2009) uncertainty-based suspense (T1), Smuts (2008) desire-frustration theory (T2), Gerrig (1996) partial uncertainty theory (T3), and a theory-neutral casual reader approach (T4). Prior to LLM annotation, a team of four annotators manually annotated 15 short stories; the resulting inter-annotator disagreement was expected, since it reflects the incompatible conceptualizations of suspense underlying the four approaches. Using a local LLM (Qwen3-4B-Instruct, run via LM Studio) and the custom-built annotation and evaluation tool SuspenseLens, we annotate reader suspense and character anxiety at the sentence level across a short fiction corpus, with Miriam Allen De Ford's Oh Rats! (1961) as a pilot text. The LLM annotations are validated against manual expert gold-standard annotations using theory-specific gold sheets.

Evaluation results reveal systematic differences across theories. For reader suspense, T2 achieves the best overall weighted metrics (weighted F1 = 0.717, Spearman r = 0.569), while T4 best captures the distribution of non-zero suspense annotations (NZ precision = 0.253, NZ recall = 0.234). Score inflation (the model assigning non-zero suspense where human annotators rated 0) is a dominant pattern for T1 (inflation rate 0.870) and T3 (0.692). Both also show near-zero NZ precision (0.011 and 0.029, respectively), indicating that the model rarely assigns the correct non-zero level for the uncertainty-based theories.
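The non-zero (NZ) metrics and the inflation rate reported above can be illustrated with a minimal Python sketch. The abstract does not spell out the exact formulas, so the definitions below are assumptions: inflation rate is taken as the share of gold-zero sentences that the model scored above zero, NZ precision as the share of the model's non-zero scores that match the gold level exactly, and NZ recall as the share of gold non-zero sentences whose level the model matched exactly. The function names and the toy ratings are hypothetical and are not taken from SuspenseLens.

```python
from typing import Sequence


def inflation_rate(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Assumed definition: share of gold-zero sentences scored above zero by the model."""
    zero_idx = [i for i, g in enumerate(gold) if g == 0]
    if not zero_idx:
        return 0.0
    return sum(pred[i] > 0 for i in zero_idx) / len(zero_idx)


def nz_precision(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Assumed definition: share of non-zero model scores that equal the gold level."""
    nz_pred = [i for i, p in enumerate(pred) if p > 0]
    if not nz_pred:
        return 0.0
    return sum(pred[i] == gold[i] for i in nz_pred) / len(nz_pred)


def nz_recall(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Assumed definition: share of gold non-zero sentences whose level the model matched."""
    nz_gold = [i for i, g in enumerate(gold) if g > 0]
    if not nz_gold:
        return 0.0
    return sum(pred[i] == gold[i] for i in nz_gold) / len(nz_gold)


# Hypothetical sentence-level ratings (0 = no suspense), invented for illustration only.
gold = [0, 0, 0, 0, 1, 2, 0, 3]
pred = [1, 0, 2, 1, 1, 1, 0, 3]

print(f"inflation rate: {inflation_rate(gold, pred):.3f}")  # 3 of 5 gold-zero sentences inflated
print(f"NZ precision:   {nz_precision(gold, pred):.3f}")    # 2 of 6 non-zero predictions exact
print(f"NZ recall:      {nz_recall(gold, pred):.3f}")       # 2 of 3 gold non-zero levels matched
```

Under these assumed definitions, a model can pair a high inflation rate with low NZ precision, as reported for T1 and T3: most of its non-zero scores fall on sentences that annotators rated 0.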
For character anxiety, annotations are better calibrated than for reader suspense across all theories: T3 achieves the best non-zero performance (NZ precision = 0.179, NZ recall = 0.325), and T1 the best rank-order agreement (Spearman r = 0.521). These findings confirm the need for theory-specific prompt design when computationally operationalizing literary suspense, and they reveal a systematic score inflation bias rooted in the LLM's intrinsic concept of suspense overriding the theoretical instructions of the prompt. Following Halterman and Keith (2025), who evaluate LLMs as measurement tools for theoretically defined concepts, the next steps of the project compare models of varying size and instruction-following capacity to identify the most suitable model for theory-guided literary annotation: one that follows the operationalized theory prompt rather than defaulting to its inherent conceptualization of suspense.

References

Carroll, Noël. 1997. “The Paradox of Suspense.” In Suspense: Conceptualizations, Theoretical Analyses, and Empirical Explorations.
De Ford, Miriam Allen. 1961. “Oh Rats!” Science fiction novella. Galaxy Magazine, December 1961. https://www.gutenberg.org/ebooks/51751.
Gerrig, Richard. 1996. “Suspense in the Absence of Uncertainty.” Journal of Memory and Language 28 (6): 633–48. 10.1016/0749-596X(89)90001-6.
Guhr, Svenja. 2026. Suspense in Shorts. GitHub Repository. https://github.com/SvenjaGuhr/SuspenseᵢnShorts.
Guhr, Svenja. 2026. SuspenseLens. V. 1.0. GitHub Repository. https://github.com/SvenjaGuhr/SuspenseLens.
Halterman, Andrew, and Katherine A. Keith. 2025. “Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts.” Political Analysis 34 (2): 188–204. 10.1017/pan.2025.10017.
Iwata, Yumiko. 2009. “Creating Suspense and Surprise in Short Literary Fiction: A Stylistic and Narratological Approach.” Doctoral Thesis, University of Birmingham.
Smuts, Aaron. 2008. “The Desire-Frustration Theory of Suspense.” Journal of Aesthetics and Art Criticism 66 (3): 281–90. 10.1111/j.1540-6245.2008.00309.x.
Svenja Guhr
Irem Kurtdemir
Hayden Nurnberg
synapsesocial.com/papers/6a17dd723fad632b0f9da289 — DOI: https://doi.org/10.5281/zenodo.20389963