Abstract
As deep neural networks are rapidly adopted in wireless communications, applications such as signal modulation recognition and target detection face threats from adversarial example attacks. Adversarial example detection complements conventional adversarial defense methods in improving system robustness against such attacks. This paper investigates the differences between clean and adversarial signal examples in the spatial and frequency domains and proposes a joint spatial-frequency domain adversarial example detection method for signal modulation recognition networks. In the frequency domain, we extract time-shifted autocorrelation features that capture the difference in peak width between clean and adversarial examples; adversarial perturbations exhibit wider autocorrelation peaks due to their signal-like energy distribution. In the spatial domain, we characterize how features propagate through the DNN layers by computing cosine similarities between layer-wise activations and class centers, which reveals that adversarial examples deviate progressively from their true class in deeper layers. These complementary dual-domain features are fused and classified by a Random Forest ensemble to detect adversarial examples. Experimental results show that the proposed method achieves an adversarial detection rate of 90.32% with an AUC of 0.9475 under PGD attacks, outperforming autoencoder-based and KL-divergence-based baseline detectors by 22.20% and 4.36%, respectively. The detector also remains effective across attack types, achieving detection rates of 98.82% against FGSM and 99.36% against CW attacks. These results indicate that the proposed method can serve as an effective frontline defense for improving the adversarial robustness of signal modulation recognition networks.
Liu et al. (Mon,) studied this question.
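The two feature families named in the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names, the half-maximum threshold used to measure peak width, and the synthetic white and smoothed noise in the demo are all hypothetical choices made here. The abstract does not specify how peak width is measured or how class centers are obtained.

```python
import numpy as np


def autocorr_peak_width(iq, level=0.5):
    """Number of contiguous lags around zero lag whose normalized
    autocorrelation magnitude is at least `level` (assumed threshold)."""
    x = np.asarray(iq, dtype=complex)
    x = x - x.mean()
    # np.correlate conjugates its second argument, so this is the
    # autocorrelation; zero lag sits at index len(x) - 1 in "full" mode.
    r = np.abs(np.correlate(x, x, mode="full"))
    center = len(x) - 1
    if r[center] == 0:
        return 0  # all-zero input has no peak
    r = r / r[center]
    left = right = center
    while right + 1 < len(r) and r[right + 1] >= level:
        right += 1
    while left - 1 >= 0 and r[left - 1] >= level:
        left -= 1
    return right - left + 1


def layer_center_similarities(activations, centers):
    """Cosine similarity between each layer's activation vector and the
    class center for that layer (one pair per layer)."""
    sims = []
    for act, cen in zip(activations, centers):
        a, c = np.ravel(act), np.ravel(cen)
        denom = np.linalg.norm(a) * np.linalg.norm(c)
        sims.append(float(a @ c) / denom if denom > 0 else 0.0)
    return np.array(sims)


def fuse_features(iq, activations, centers):
    """Concatenate the frequency-domain and spatial-domain features
    into one vector for a downstream classifier."""
    width = autocorr_peak_width(iq)
    sims = layer_center_similarities(activations, centers)
    return np.concatenate(([float(width)], sims))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins only: white noise vs. band-limited (smoothed) noise.
    white = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
    smooth = np.convolve(white, np.ones(8) / 8, mode="same")
    print(autocorr_peak_width(white), autocorr_peak_width(smooth))
```

In a full pipeline, the fused vectors from labeled clean and adversarial examples would be used to train an ensemble classifier such as a Random Forest, as the abstract describes. That step is omitted here because its hyperparameters are not given.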