Natural Language Processing (NLP) models are used more widely than ever and have performed remarkably well in applications such as sentiment classification, text generation, and generative AI. The deep learning models behind these applications are what allow them to capture complex relationships between variables and apply that learning to NLP tasks. Like all other applications of deep learning, however, NLP models are susceptible to external attacks that can degrade their accuracy and credibility, which ultimately calls the reliability of these models into question. A textual adversarial attack can push a model from a correct answer to an incorrect one with only a small perturbation of the input. Such perturbations often involve nothing more than replacing a word with a synonym, yet they still succeed in fooling these models. In this context, this paper proposes BEACOMP, a novel textual adversarial attack that uses composite transformation functions from the TextAttack library, in a black-box setting, to fool the model. The accuracy and success rate of the attack are validated on three models (DistilBERT, BERT uncased, and RoBERTa) and four datasets (IMDB, SST-2, Rotten Tomatoes, and AG News). The proposed attack showed an increase in attack success rate, which can ultimately help improve a model's preparedness for adversarial attacks. On average, the accuracy of the models tested in this research dropped from 97.16% to 1.91% under attack. This demonstrates the effectiveness of the TextAttack library in generating attack patterns for text classification models while adhering to certain constraints.
Sharma et al. studied this question.
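
To make the idea of a composite transformation in a black-box setting concrete, the following is a minimal pure-Python sketch. It is not BEACOMP and does not use the TextAttack API. The synonym table, the word lists, the toy classifier `toy_proba`, and all function names are hypothetical and exist only for illustration. The attack sees nothing but the classifier's output score, pools candidates from two transformations (synonym swap and adjacent-character swap), and greedily keeps the edit that most reduces confidence in the original label. The success rate is computed under one common definition: successful attacks divided by attempted attacks, skipping inputs the model already misclassifies.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical toy resources; a real attack would use embeddings or a thesaurus.
SYNONYMS: Dict[str, List[str]] = {"great": ["fine", "decent"], "love": ["like"], "awful": ["poor"]}
POSITIVE = {"great", "love"}
NEGATIVE = {"awful", "poor"}


def toy_proba(text: str) -> float:
    """Toy stand-in for a victim model: returns P(positive) from word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1.0 / (1.0 + math.exp(-score))


def synonym_swap(words: List[str], i: int) -> List[List[str]]:
    """Replace word i with each of its listed synonyms."""
    return [words[:i] + [s] + words[i + 1:] for s in SYNONYMS.get(words[i].lower(), [])]


def char_swap(words: List[str], i: int) -> List[List[str]]:
    """Swap the second and third characters of word i (words of length >= 4 only)."""
    w = words[i]
    if len(w) < 4 or w[1] == w[2]:
        return []
    return [words[:i] + [w[0] + w[2] + w[1] + w[3:]] + words[i + 1:]]


def composite(words: List[str], i: int) -> List[List[str]]:
    """Composite transformation: the union of candidates from both transformations."""
    return synonym_swap(words, i) + char_swap(words, i)


def attack(text: str, proba: Callable[[str], float], max_edits: int = 2) -> Optional[str]:
    """Greedy black-box attack: only proba() outputs are used, never model internals."""
    words = text.split()
    label = proba(text) > 0.5

    def confidence(ws: List[str]) -> float:
        p = proba(" ".join(ws))
        return p if label else 1.0 - p

    for _ in range(max_edits):
        candidates = [c for i in range(len(words)) for c in composite(words, i)]
        if not candidates:
            return None
        best = min(candidates, key=confidence)
        if confidence(best) >= confidence(words):
            return None  # no candidate lowers confidence in the original label
        words = best
        if (proba(" ".join(words)) > 0.5) != label:
            return " ".join(words)  # predicted label flipped: attack succeeded
    return None


def attack_success_rate(texts: Sequence[str], labels: Sequence[bool],
                        proba: Callable[[str], float], max_edits: int = 2) -> float:
    """Successful attacks / attempted attacks, skipping already-misclassified inputs."""
    attempted = succeeded = 0
    for text, y in zip(texts, labels):
        if (proba(text) > 0.5) != y:
            continue
        attempted += 1
        succeeded += attack(text, proba, max_edits) is not None
    return succeeded / attempted if attempted else 0.0
```

Calling `attack("I love this great movie", toy_proba)` on this toy model replaces two sentiment-bearing words with weaker synonyms until the predicted label changes. In a real evaluation, `toy_proba` would be replaced by queries to the fine-tuned classifier, and constraints (for example, a limit on the fraction of modified words) would filter the candidate list.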