March 29, 2024 (Open Access)

A Comparative Analysis of Large Language Models to Evaluate Robustness and Reliability in Adversarial Conditions

Abstract

This study presents a comprehensive evaluation of four prominent Large Language Models (LLMs), namely Google Gemini, Mistral 8x7B, ChatGPT-4, and Microsoft Phi-1.5, to assess their robustness and reliability under a variety of adversarial conditions. Using the Microsoft PromptBench dataset, the research investigates each model's performance against syntactic manipulations, semantic alterations, and contextually misleading cues. The findings reveal notable differences in model resilience, highlighting the distinct strengths and weaknesses of each LLM in responding to adversarial challenges. The comparative analysis underscores the need for multifaceted evaluation approaches to enhance model resilience and suggests future research directions, including augmenting training datasets with adversarial examples and exploring advanced natural language understanding algorithms. This study contributes to the ongoing discourse in LLM research by providing insights into model vulnerabilities and advocating for comprehensive strategies to bolster LLM robustness against the evolving landscape of adversarial threats.
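To make the idea of a syntactic manipulation concrete, the following is a minimal sketch, not the authors' evaluation code. It applies a simple character-swap perturbation to prompts and reports the relative drop in accuracy between clean and perturbed inputs. All names here (`perturb_typos`, `accuracy`, `performance_drop`, `toy_model`) are hypothetical, the keyword classifier stands in for a real LLM call, and the two-example dataset is invented purely for illustration.

```python
import random


def perturb_typos(prompt: str, rate: float, seed: int = 0) -> str:
    """Swap adjacent letters with probability `rate` (a simple typo-style attack)."""
    rng = random.Random(seed)
    chars = list(prompt)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so it is not swapped back
        else:
            i += 1
    return "".join(chars)


def accuracy(model, examples, transform=lambda p: p) -> float:
    """Fraction of (prompt, label) pairs the model answers correctly."""
    correct = sum(model(transform(prompt)) == label for prompt, label in examples)
    return correct / len(examples)


def performance_drop(clean_acc: float, adv_acc: float) -> float:
    """Relative accuracy loss under attack; 0.0 means no degradation."""
    if clean_acc == 0:
        return 0.0
    return 1.0 - adv_acc / clean_acc


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: a brittle keyword classifier."""
    return "positive" if "good" in prompt.lower() else "negative"


if __name__ == "__main__":
    # Invented examples, used only to exercise the functions above.
    examples = [("good movie", "positive"), ("bad movie", "negative")]
    clean = accuracy(toy_model, examples)
    attacked = accuracy(toy_model, examples, lambda p: perturb_typos(p, rate=1.0))
    print(f"clean={clean:.2f} attacked={attacked:.2f} "
          f"drop={performance_drop(clean, attacked):.2f}")
```

Replacing `toy_model` with a function that queries an actual model, and `perturb_typos` with semantic or contextual perturbations, would follow the same pattern. The relative drop is only one possible summary of robustness and is an assumption of this sketch, not a metric stated in the abstract.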

Cite this study

Goto et al. (2024) studied this question.

www.synapsesocial.com/papers/68e71ba3b6db643587695736 — DOI: https://doi.org/10.36227/techrxiv.171173447.70655950/v1

Authors

Takeshi Goto

Kensuke Ono

Akira Morita
