Phishing attacks remain a significant cybersecurity threat, aiming to steal sensitive information by exploiting human vulnerability. Traditional phishing email detection often struggles to keep up with attackers' evolving strategies, which results in high false positive rates and a limited contextual understanding of email content. To address these challenges, this research proposes an AI-powered threat-hunting model that integrates Natural Language Processing (NLP) techniques for detecting phishing emails written in English, delivered through the PhishGuard AI application. The application is developed as a web-based software solution designed to be accessible to users both with and without technical expertise. The model leverages Word2Vec with TF-IDF weighting for feature extraction and uses an XGBoost classifier. A comprehensive testing process using various metrics evaluates the computational efficiency and effectiveness of the model. The model's robustness and generalisability were rigorously tested using two distinct datasets: CEAS_08.csv for in-distribution training and SpamAssasin.csv for out-of-distribution evaluation. The primary value of this model lies in its proactive threat-hunting capability, which distinguishes it from reactive systems that rely on known threat examples. The findings of the study aim to contribute to the domain of phishing email detection and to the development of a more robust cybersecurity solution that can help safeguard both individuals and organisations in our country.
Kumaresan et al. (Sun,) studied this question.
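The feature-extraction step named above, Word2Vec with TF-IDF weighting, can be illustrated with a minimal sketch. It is not the PhishGuard AI implementation: the two-dimensional `embeddings` table is a hypothetical stand-in for a trained Word2Vec model, the smoothed IDF formula is one common choice among several, and the function names are assumptions made for illustration. Each email is reduced to a single vector by averaging its word vectors, weighted by term frequency times inverse document frequency. Vectors of this kind would then be passed to a classifier such as XGBoost, which is not shown here.

```python
import math
from collections import Counter


def idf_weights(tokenized_docs):
    """Compute a smoothed inverse document frequency for every term."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def weighted_doc_vector(tokens, embeddings, idf, dim):
    """Average the word vectors of one document, weighted by TF * IDF."""
    total = [0.0] * dim
    weight_sum = 0.0
    for term, tf in Counter(tokens).items():
        if term not in embeddings or term not in idf:
            continue  # skip terms without a vector or an IDF value
        weight = tf * idf[term]
        for i, value in enumerate(embeddings[term]):
            total[i] += weight * value
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known terms: fall back to the zero vector
    return [x / weight_sum for x in total]


# Hypothetical toy corpus and embedding table (illustrative values only).
emails = [
    ["verify", "your", "account", "now"],
    ["meeting", "notes", "attached"],
    ["your", "account", "meeting"],
]
embeddings = {
    "verify": [1.0, 0.0],
    "account": [0.8, 0.2],
    "meeting": [0.0, 1.0],
    "notes": [0.1, 0.9],
}

idf = idf_weights(emails)
features = [weighted_doc_vector(doc, embeddings, idf, dim=2) for doc in emails]
```

Because rarer terms receive a larger IDF, a word such as "verify", which appears in only one email, pulls that email's vector more strongly than a word shared across several emails.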