What question did this study set out to answer?

To develop an interpretable deep learning framework for detecting AI-generated phishing emails based on psychological principles.

December 8, 2025 | Open Access

A Two-Stage Deep Learning Framework for AI-Driven Phishing Email Detection Based on Persuasion Principles

Key Points

Set out to develop an interpretable deep learning framework, grounded in psychological persuasion principles, for detecting AI-generated phishing emails.
Created a dataset of 2995 AI-generated phishing emails, each labeled with Cialdini’s six persuasion principles.
Fine-tuned a DistilBERT model to predict which persuasion principles are present in each email.
Fed the resulting confidence scores to a dense neural network for the final binary classification.
Achieved 94% accuracy and 98% AUC in phishing detection.
Identified authority, scarcity, and social proof as strong indicators of phishing emails.

Abstract

AI-generated phishing emails present a growing cybersecurity threat, exploiting human psychology with high-quality, context-aware language. This paper introduces a novel two-stage detection framework that combines deep learning with psychological analysis to address this challenge. A new dataset of 2995 GPT-o1-generated phishing emails spanning five organisational sectors, each labelled with Cialdini’s six persuasion principles, forms one of the largest and most extensively behaviourally annotated corpora in the field. The first stage employs a fine-tuned DistilBERT model to predict the presence of persuasion principles in each email. The resulting confidence scores then feed into a lightweight dense neural network at the second stage for final binary classification. This interpretable design balances performance with insight into attacker strategies. The full system achieves 94% accuracy and 98% AUC, outperforming comparable methods while offering a clearer explanation of model decisions. Analysis shows that principles like authority, scarcity, and social proof are highly indicative of phishing, while reciprocation and likeability occur more often in legitimate emails. This research contributes an interpretable, psychology-informed framework for phishing detection, alongside a unique dataset for future study. Results demonstrate the value of behavioural cues in identifying sophisticated phishing attacks and suggest broader applications in detecting malicious AI-generated content.
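The data flow of the two-stage design can be illustrated with a small, self-contained sketch. This is not the study's implementation. Stage one is replaced here by a toy keyword matcher standing in for the fine-tuned DistilBERT model, and stage two is a tiny dense network whose layer sizes and weights are hand-set for illustration. All function names, cue phrases, weights, and the 0.5 threshold are assumptions. The sixth principle, which the abstract does not name, is taken to be Cialdini's commitment (consistency).

```python
import math

# Cialdini's six principles, using the abstract's wording where it names them.
PRINCIPLES = ["authority", "scarcity", "social_proof",
              "reciprocation", "likeability", "commitment"]

# Hypothetical cue phrases: a toy stand-in for the fine-tuned DistilBERT model.
CUES = {
    "authority": ["it department", "ceo", "compliance"],
    "scarcity": ["within 24 hours", "expires", "immediately"],
    "social_proof": ["your colleagues", "everyone has"],
    "reciprocation": ["as a thank you", "in return"],
    "likeability": ["hope you are well", "great to see you"],
    "commitment": ["as you agreed", "as promised"],
}


def stage_one_scores(text):
    """Stage one (stand-in): one score in [0, 1] per persuasion principle."""
    lowered = text.lower()
    return [1.0 if any(cue in lowered for cue in CUES[p]) else 0.0
            for p in PRINCIPLES]


# Hypothetical hand-set weights: 6 inputs -> 2 hidden units -> 1 output.
W1 = [
    [2.0, 2.0, 1.5, 0.0, 0.0, 0.5],  # hidden unit 0: pressure-style cues
    [0.0, 0.0, 0.0, 2.0, 2.0, 0.0],  # hidden unit 1: rapport-style cues
]
B1 = [0.0, 0.0]
W2 = [1.5, -1.5]
B2 = -1.0


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def stage_two_probability(scores):
    """Stage two: dense layer with ReLU, then a sigmoid output unit."""
    hidden = [relu(sum(w * s for w, s in zip(row, scores)) + b)
              for row, b in zip(W1, B1)]
    logit = sum(w * h for w, h in zip(W2, hidden)) + B2
    return sigmoid(logit)


def classify(text, threshold=0.5):
    """Run both stages; return per-principle scores, probability, and label."""
    scores = stage_one_scores(text)
    prob = stage_two_probability(scores)
    return dict(zip(PRINCIPLES, scores)), prob, prob >= threshold


if __name__ == "__main__":
    email = "The IT department requires you to reset your password within 24 hours."
    scores, prob, flagged = classify(email)
    print(scores)
    print(f"phishing probability: {prob:.3f}, flagged: {flagged}")
```

Because the second stage sees only the six principle scores, each decision can be traced back to which principles were detected, which is the interpretability argument the abstract makes. In a real system the weights would be learned from labelled data rather than set by hand.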
