What question did this study set out to answer?

This review aims to assess the effectiveness of automated methods in evaluating LLM persuasiveness in comparison to human judgments.

June 19, 2026 · Open Access

LLM Persuasiveness Evaluation: A Structured Review of Automated Methods

Key Points

Reviewed thirty automated evaluation methods from twenty-seven papers.
Examined validation of automated methods against human judgment.
Discussed limitations and ethical challenges related to LLM evaluation.
Strong alignment with human judgment for argument assessment, but mixed results for belief and behavioral change metrics.
Automated methods are valuable for preliminary screening and testing high-risk scenarios.
Identified fragmentation in the field and inconsistent human validation of automated approaches.

Abstract

Large Language Models (LLMs) are becoming increasingly capable of persuading and even manipulating humans, with the potential to shape beliefs, behaviour, and public discourse at scale. These capabilities have been highlighted as malicious-use risks in the International AI Safety Report (Bengio et al. 2025), and increasingly impactful AI regulations now aim to assess and mitigate them while preserving potential benefits. It is therefore critical that LLM persuasiveness and related capabilities are thoroughly evaluated and well understood. To date, dozens of empirical studies have assessed LLM persuasion using human participants. Such evaluations are costly, logistically complex, hard to scale, and constrained by ethical challenges, making them impractical for the systematic evaluation of rapidly evolving LLMs. As an alternative, a growing body of work has proposed fully automated evaluation approaches that require no human involvement. In this structured review, we provide a systematic taxonomy of such automated approaches across 30 methods from 27 papers, examine how they are validated against human judgement, and discuss their limitations and risks. We find the field fragmented, and human validation limited and inconsistent: alignment with human judgement is strong for argument assessment but mixed for belief and behavioural change metrics, raising concerns about reliance on synthetic proxies for safety-critical assessment. Despite these limitations, automated methods offer value beyond fully replacing human studies, for applications such as preliminary screening and high-risk scenario testing. We conclude by outlining directions for future research in this fast-moving field, and accompany the review with a living open-access resource hub.
