September 10, 2025

Explainability-Driven Defense: Grad-CAM-Guided Model Refinement Against Adversarial Threats

Key points

Grad-CAM-guided masking strategies make convolutional neural networks more resilient to adversarial attacks.
Gaussian-blurred masking yields the largest accuracy gain against Projected Gradient Descent (PGD) attacks, particularly at higher perturbation strengths.
Grad-CAM insights guide model refinement by focusing the network on high-activation regions.
Binary and difference-based masking improve accuracy against the Fast Gradient Sign Method (FGSM) consistently across perturbation levels.

Abstract

Deep learning models have excelled in tasks like image recognition and autonomous systems but remain vulnerable to adversarial attacks and spurious correlations, limiting their reliability in real-world and safety-critical settings. To address these challenges, we propose a novel framework that leverages explainable Artificial Intelligence (XAI) to enhance the robustness of Convolutional Neural Networks. Our approach integrates Grad-CAM insights into the model refinement process, guiding feature masking to reduce reliance on irrelevant or misleading features. We introduce three masking strategies: (1) binary masking to retain high-activation regions, (2) Gaussian-blurred masking to preserve contextual information while reducing noise, and (3) difference-based masking to remove unstable features unique to the baseline model. We evaluate these strategies against two common adversarial attack methods—Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). Results show that all three strategies improve FGSM accuracy, with binary and difference-based masking providing consistent gains across perturbation levels. Gaussian-blurred masking delivers the highest improvement in PGD accuracy, particularly at higher perturbation strengths.
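
The three masking strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no thresholds, kernel sizes, or formulas, so the 0.5 threshold, the `sigma` value, the helper names, and the use of a second `reference_cam` heatmap for the difference-based mask are all assumptions made for illustration. The input is assumed to be a Grad-CAM heatmap that has already been computed as a 2D array.

```python
import numpy as np

def normalize(cam):
    """Scale a heatmap to [0, 1]; a constant map becomes all zeros."""
    cam = np.asarray(cam, dtype=np.float64)
    span = cam.max() - cam.min()
    if span == 0:
        return np.zeros_like(cam)
    return (cam - cam.min()) / span

def binary_mask(cam, threshold=0.5):
    """Keep only pixels whose normalized activation reaches the threshold.
    The threshold value is an assumption, not taken from the paper."""
    return (normalize(cam) >= threshold).astype(np.float64)

def gaussian_blurred_mask(cam, threshold=0.5, sigma=1.0):
    """Soften the binary mask so context near salient regions is
    attenuated rather than removed. sigma is an assumed parameter."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    # Pad with edge values so the 'valid' convolution returns the input size.
    padded = np.pad(binary_mask(cam, threshold), radius, mode="edge")
    # Separable blur: convolve along rows, then along columns.
    blurred = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="valid"), 0, blurred)
    return np.clip(blurred, 0.0, 1.0)

def difference_mask(baseline_cam, reference_cam, threshold=0.5):
    """Zero out regions that are salient only for the baseline model.
    Comparing against a second heatmap (reference_cam) is one assumed
    reading of 'features unique to the baseline model'."""
    baseline_only = binary_mask(baseline_cam, threshold) * (
        1.0 - binary_mask(reference_cam, threshold))
    return 1.0 - baseline_only

def apply_mask(image, mask):
    """Multiply an H x W x C image by an H x W mask."""
    return np.asarray(image, dtype=np.float64) * mask[..., None]
```

In this sketch the binary mask discards everything outside the salient region, the blurred mask lets its borders fade out gradually, and the difference mask leaves the image intact except where the baseline heatmap is active and the reference heatmap is not.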
