What type of study is this?

This is a qualitative study.

What question did this study set out to answer?

Can a large language model mediate governance-style decisions among multiple stakeholders when their preferences are expressed only in categorical natural language?

January 6, 2026 | Open Access

Natural-Language Mediation Versus Numerical Aggregation in Multi-Stakeholder AI Governance: Capability Boundaries and Architectural Requirements

Key Points

Tests whether a large language model can mediate among multiple stakeholders whose preferences are expressed only in categorical natural language.
Controlled experiment comparing a language-based mediator with a numerical baseline (Borda count).
1024 synthetic stakeholder scenarios, each executed ten times (10,240 paired decisions).
Qualitative analysis of the mediator's decision-making process.
Only 31% agreement with the Borda count.
Equity-biased outcomes: 68% improved fairness, ~25% Gini reduction, and 38% higher minimum utility.
Efficiency cost: 14–20% lower mean utility.
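
The outcome measures named in the key points (mean utility, minimum utility, and the Gini coefficient) can be illustrated with a minimal sketch. The function names and the sample utility vectors below are hypothetical and are not taken from the study; the Gini formula is the standard rank-weighted form, and the study's exact computation may differ.

```python
def gini(values):
    """Gini coefficient of non-negative values (0.0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula applied to the sorted values.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def outcome_metrics(utilities):
    """Summarize one decision by efficiency (mean) and equity (min, Gini)."""
    return {
        "mean": sum(utilities) / len(utilities),
        "min": min(utilities),
        "gini": gini(utilities),
    }


# Hypothetical per-stakeholder utilities for two candidate decisions.
efficient = outcome_metrics([9, 8, 1])   # higher mean, one stakeholder left behind
equitable = outcome_metrics([5, 5, 4])   # lower mean, higher minimum, lower Gini
```

In this toy comparison the second decision trades mean utility for a higher minimum and a lower Gini coefficient, which is the kind of equity-versus-efficiency trade-off the key points report.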

Abstract

This study investigates whether a large language model (LLM) can perform governance-style mediation among multiple stakeholders when preferences are expressed only in categorical natural language. Building on prior conceptual work proposing an advisory governance layer for AI systems, we designed a controlled experiment comparing a language-based mediator with a numerical baseline (Borda count) across 1024 synthetic stakeholder scenarios, each executed ten times (10,240 paired decisions). Results show only 31% agreement with Borda, revealing distinct decision logic that produces equity-biased outcomes (68% improved fairness, ~25% Gini reduction, 38% higher minimum utility) at the cost of efficiency (14–20% lower mean utility). Stability analysis identified three reliability zones—stable (39%), middle (28%), and knife-edge (33%)—enabling risk-proportionate oversight. Qualitative analysis revealed that equity bias emerges from opaque pattern-matching followed by post hoc rationalization rather than systematic application of governance principles, with frequent semantic-grounding failures even in stable cases. These findings demonstrate that language-based mediation diverges fundamentally from numerical aggregation, suitable for advisory deliberation but requiring human oversight for value verification and factual accuracy.
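
For readers unfamiliar with the numerical baseline, the following is a minimal sketch of a Borda count and of an agreement rate between two decision procedures. It assumes each stakeholder's preferences have already been converted into a full ranking over the same options; how the study maps categorical natural-language preferences to rankings is not stated here. The function names, the alphabetical tie-break, and the sample data are all hypothetical.

```python
from collections import defaultdict


def borda_winner(rankings):
    """Return the Borda winner.

    rankings: list of rankings, each ordered from most to least preferred
    and covering the same set of options.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, option in enumerate(ranking):
            # Top choice earns m - 1 points, last choice earns 0.
            scores[option] += m - 1 - position
    # Assumption: ties are broken alphabetically for determinism.
    return min(scores, key=lambda option: (-scores[option], option))


def agreement_rate(paired_decisions):
    """Fraction of paired decisions in which both procedures chose the same option."""
    matches = sum(1 for a, b in paired_decisions if a == b)
    return matches / len(paired_decisions)


# Hypothetical scenario with three stakeholders and three options.
rankings = [["A", "B", "C"], ["B", "C", "A"], ["B", "A", "C"]]
baseline_choice = borda_winner(rankings)

# Hypothetical (mediator choice, Borda choice) pairs across four runs.
pairs = [("A", "A"), ("B", "A"), ("C", "C"), ("B", "B")]
rate = agreement_rate(pairs)
```

A rate computed this way over all paired decisions is the kind of figure the abstract reports as 31% agreement with Borda.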
