This paper presents preliminary descriptive findings from a multi-day study of human detection accuracy for AI-generated phishing emails across six technique categories. Over 25 calendar days (March 6 to 30, 2026), 153 participants completed 2,511 binary classification tasks on the Threat Terminal platform, a gamified web-based research instrument hosted at research.scottaltiparmak.com. A mid-study protocol revision on March 22 removed email authentication headers, splitting the data into two analytical phases. In Phase 2 (the primary analysis), overall classification accuracy was 85.9%. Technique miss rates ranged from 12.5% (credential harvest) to 20.5% (authority impersonation). A critical overconfidence pattern was observed: 60.5% of phishing misclassifications occurred at the highest confidence level. Professional background was associated with a modest accuracy difference (86.7% vs. 82.2%, Cohen's h = 0.126). All findings are descriptive; the formal mixed-effects model will be reported in a subsequent analytical paper. This paper is a companion to the published protocol paper (DOI: 10.5281/zenodo.19059296).
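For readers unfamiliar with the effect size quoted above, the following is a minimal sketch of how Cohen's h is conventionally computed from two proportions, using the arcsine transformation. The function name `cohens_h` is illustrative and is not taken from the study's analysis code. The inputs are the rounded accuracies from the abstract, so the result (about 0.12) lands close to, but not exactly on, the reported 0.126; the gap is presumably due to rounding of the published percentages.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two arcsine-transformed proportions.

    Illustrative helper (assumed name), using the standard definition
    h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2)).
    """
    phi1 = 2.0 * math.asin(math.sqrt(p1))
    phi2 = 2.0 * math.asin(math.sqrt(p2))
    return phi1 - phi2


# Rounded group accuracies as quoted in the abstract (86.7% vs. 82.2%).
h = cohens_h(0.867, 0.822)
print(f"Cohen's h from rounded accuracies: {h:.3f}")
```

By Cohen's conventional benchmarks, values below 0.2 fall under the "small" threshold, which is consistent with the abstract's description of the difference as modest.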
Scott Altiparmak studied this question.