What type of study is this?

This is a Systematic Review study.

October 3, 2025 · Open Access

Attack and defense techniques in large language models: A survey and new perspectives

Key Points

The survey reveals vulnerabilities in large language models and highlights critical security challenges.
Attacks are classified into adversarial prompt attacks and model theft, outlining their mechanisms and implications.
Defense strategies include prevention-based and detection-based methods, emphasizing the importance of adaptability.
Challenges in implementing defenses are discussed, including the need for explainable security techniques and standardized evaluations.

Abstract

Large Language Models (LLMs) have become central to numerous natural language processing tasks, but their vulnerabilities present significant security and ethical challenges. This systematic survey explores the evolving landscape of attack and defense techniques in LLMs. We classify attacks into adversarial prompt attacks, optimized attacks, model theft, and attacks on applications of LLMs, detailing their mechanisms and implications. We then analyze defense strategies, including prevention-based and detection-based methods. Although advances have been made, challenges remain: adapting to the dynamic threat landscape, balancing usability with robustness, and addressing resource constraints in defense implementation. We highlight open problems, including the need for adaptive, scalable defenses, explainable security techniques, and standardized evaluation frameworks. This survey provides actionable insights and directions for developing secure and resilient LLMs, emphasizing the importance of interdisciplinary collaboration and ethical considerations to mitigate risks in real-world applications.

Cite this study

Liao et al. (2025) studied this question.

synapsesocial.com/papers/68e03501f0e39f13e7fa39a5 — DOI: https://doi.org/10.48550/arxiv.2505.00976

Authors

Zhongxing Liao

The University of Texas MD Anderson Cancer Center

Kang Chen

Southwest Petroleum University

Yuanjie Lin

University of Shanghai for Science and Technology
