What question did this study set out to answer?

The study evaluates how effective synonym-based adversarial attacks are against BERT and aims to show that BERT is robust to such adversarial examples.

September 15, 2021 · Open Access

BERT is Robust! A Case Against Synonym-Based Adversarial Examples in Text Classification

Key Points

Evaluated four word substitution-based attacks on BERT.
Combined human evaluation with probabilistic analysis to assess semantic preservation.
Implemented a data augmentation technique and post-processing step to reduce adversarial attack success rates.
Between 96% and 99% of the analyzed attacks failed to preserve semantics, indicating that their success rests mainly on feeding poor data to the model.
Post-processing reduced attack success rates to below 5%.
With more reasonable thresholds on the constraints for word substitutions, BERT proves considerably more robust than research on attacks suggests.

Abstract

Deep Neural Networks have taken Natural Language Processing by storm. While this led to incredible improvements across many tasks, it also initiated a new research field, questioning the robustness of these neural networks by attacking them. In this paper, we investigate four word substitution-based attacks on BERT. We combine a human evaluation of individual word substitutions and a probabilistic analysis to show that between 96% and 99% of the analyzed attacks do not preserve semantics, indicating that their success is mainly based on feeding poor data to the model. To further confirm that, we introduce an efficient data augmentation procedure and show that many adversarial examples can be prevented by including data similar to the attacks during training. An additional post-processing step reduces the success rates of state-of-the-art attacks below 5%. Finally, by looking at more reasonable thresholds on constraints for word substitutions, we conclude that BERT is a lot more robust than research on attacks suggests.
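
To make the idea of combining per-substitution judgments with a probabilistic analysis more concrete, here is a minimal sketch in Python. It is not the paper's actual procedure: it assumes that each word substitution is judged valid independently with the same probability, so an attack with several substitutions preserves semantics only if every one of them is valid. The function name `prob_attack_preserves_semantics` and the numbers in the usage example are hypothetical and chosen only for illustration.

```python
def prob_attack_preserves_semantics(p_valid: float, n_substitutions: int) -> float:
    """Probability that every substitution in an attack is semantically valid.

    Assumption (illustrative, not taken from the paper): substitutions are
    independent and each is valid with the same probability p_valid.
    """
    if not 0.0 <= p_valid <= 1.0:
        raise ValueError("p_valid must lie in [0, 1]")
    if n_substitutions < 0:
        raise ValueError("n_substitutions must be non-negative")
    # Independence: all substitutions are valid with probability p ** n.
    return p_valid ** n_substitutions


if __name__ == "__main__":
    # Hypothetical validity rate and substitution counts, for illustration only.
    for n in (1, 3, 5):
        p_all_valid = prob_attack_preserves_semantics(0.5, n)
        print(f"{n} substitutions: P(all valid) = {p_all_valid:.3f}")
```

Under this simplified model, even a moderate per-substitution validity rate leaves little chance that an attack with many substitutions preserves the meaning of the original text, which matches the direction of the abstract's claim.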
