What type of study is this?

This is a quantitative study.

October 3, 2025 · Open Access

Adversarial Attacks Detection Method for Tabular Data

Key Points

The method detects both known and unknown adversarial attacks with balanced accuracy above 0.94.
Analysis on 22 datasets revealed low false negative rates between 0.02 and 0.10 in binary detection.
Using a surrogate model helps to improve detection of subtle adversarial attacks on machine learning models.
The approach emphasizes the need for effective defenses against threats to the integrity of machine learning.

Abstract

Adversarial attacks involve malicious actors introducing intentional perturbations to machine learning (ML) models, causing unintended behavior. This poses a significant threat to the integrity and trustworthiness of ML models and calls for robust detection techniques. The paper proposes a new approach for detecting adversarial attacks using a surrogate model and diagnostic attributes. The method was tested on 22 tabular datasets on which four different ML models were trained. Various attacks were then conducted to obtain perturbed data. The proposed approach detects both known and unknown attacks with high efficiency: balanced accuracy was above 0.94, with very low false negative rates (0.02 to 0.10) for binary detection. Sensitivity analysis shows that classifiers trained on the diagnostic attributes can detect even very subtle adversarial attacks.
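The two metrics reported in the abstract can be made concrete with a short sketch. It assumes the usual definitions for binary detection (label 1 = adversarial sample, 0 = clean sample): balanced accuracy is the mean of the true positive rate and the true negative rate, and the false negative rate is the share of adversarial samples the detector misses. The function name and the toy labels are illustrative assumptions, not the paper's code or data.

```python
def binary_detection_metrics(y_true, y_pred):
    """Balanced accuracy and false negative rate for a binary detector.

    Assumed label convention: 1 = adversarial sample, 0 = clean sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean flagged

    # Guard against a class being absent from y_true.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    fnr = fn / (tp + fn) if (tp + fn) else 0.0

    return {
        "balanced_accuracy": (tpr + tnr) / 2,
        "false_negative_rate": fnr,
    }


# Toy labels, invented for illustration only (not results from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_detection_metrics(y_true, y_pred)
print(metrics)
```

In the toy data one of four adversarial samples is missed, so the false negative rate is 0.25. A detector in the range the abstract reports would miss between 2 and 10 of every 100 adversarial samples.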
