ABSTRACT
Cross‐Site Scripting (XSS) continues to pose serious risks to modern web applications as attackers increasingly employ sophisticated obfuscation and adversarial manipulations. Traditional rule‐based and machine learning detectors often fail to model multi‐level semantic patterns, making them vulnerable to well‐crafted evasive payloads. To address this challenge, this paper proposes an adversarially robust CNN–BiLSTM–Multi‐Head Attention model that enhances contextual feature extraction and strengthens resistance to gradient‐driven perturbations. Convolutional layers capture local token interactions, BiLSTM modules learn long‐range dependencies, and the attention mechanism provides token‐level interpretability and prioritized feature weighting. FGSM‐based adversarial training further exposes the model to worst‐case perturbation directions, improving its generalization to evolving XSS variants. Experiments conducted on a composite dataset of 29 000 real‐world, benchmark, and adversarial samples demonstrate the effectiveness of the proposed framework. The system achieves 97.8% accuracy, 96.5% precision, 97.1% recall, and a 98.3% AUC score, surpassing existing detectors such as FusionXSS and XSShield. Robustness evaluations show gains of up to 17% under FGSM attacks. With low inference latency and moderate memory usage, the model offers a practical and scalable solution for real‐time XSS defense in modern web environments.
Bharath et al. (Sun,) studied this question.
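For readers unfamiliar with the FGSM step named in the abstract, the following is a minimal sketch of the perturbation rule x_adv = x + ε · sign(∇ₓ loss). It is not the paper's implementation: it assumes a toy logistic-regression classifier over a three-dimensional feature vector in place of the CNN–BiLSTM–attention network, and every name and value in it (`fgsm`, `w`, `b`, `epsilon`, the sample input) is hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    # Probability of the positive ("malicious") class under a toy linear model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def bce_loss(p, y):
    # Binary cross-entropy, with a small constant to avoid log(0).
    tiny = 1e-12
    return -(y * math.log(p + tiny) + (1 - y) * math.log(1 - p + tiny))


def input_gradient(w, b, x, y):
    # For logistic regression, d(BCE)/dx = (p - y) * w.
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, epsilon):
    # Single-step FGSM: move each feature by epsilon in the direction
    # that increases the loss.
    grad = input_gradient(w, b, x, y)
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]


# Hypothetical weights and a hypothetical "malicious" sample (label 1).
w = [1.5, -2.0, 0.5]
b = 0.1
x = [0.8, -0.4, 0.3]
y = 1
epsilon = 0.1

x_adv = fgsm(w, b, x, y, epsilon)
clean_loss = bce_loss(predict(w, b, x), y)
adv_loss = bce_loss(predict(w, b, x_adv), y)
print("clean loss:", clean_loss)
print("adversarial loss:", adv_loss)
```

In adversarial training, perturbed samples such as `x_adv` are typically fed back into the training loss alongside the clean samples, so the model learns to classify both correctly. For a sequence model, the perturbation would normally be applied to continuous representations such as token embeddings rather than to discrete tokens; whether the paper does so is not stated in the abstract.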