This study investigates whether a large language model (LLM) can perform governance-style mediation among multiple stakeholders when preferences are expressed only in categorical natural language. Building on prior conceptual work proposing an advisory governance layer for AI systems, we designed a controlled experiment comparing a language-based mediator with a numerical baseline (Borda count) across 1,024 synthetic stakeholder scenarios, each executed ten times (10,240 paired decisions). The results show only 31% agreement with Borda, revealing a distinct decision logic that produces equity-biased outcomes (68% improved fairness, ~25% Gini reduction, 38% higher minimum utility) at the cost of efficiency (14–20% lower mean utility). Stability analysis identified three reliability zones, stable (39%), middle (28%), and knife-edge (33%), which enable risk-proportionate oversight. Qualitative analysis revealed that the equity bias emerges from opaque pattern-matching followed by post hoc rationalization rather than from systematic application of governance principles, with frequent semantic-grounding failures even in stable cases. These findings demonstrate that language-based mediation diverges fundamentally from numerical aggregation: it is suitable for advisory deliberation but requires human oversight for value verification and factual accuracy.
Uchoa et al. studied this question.
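For readers unfamiliar with the quantities named above, the following is a minimal sketch of a Borda count baseline, a Gini coefficient over stakeholder utilities, and an agreement rate between two decision sequences. It is illustrative only and is not the study's implementation: the function names, the alphabetical tie-break, and the assumption that every stakeholder supplies a full strict ranking are all hypothetical choices made for this example. How the study maps categorical language to ranks or utilities is not specified here.

```python
from collections import defaultdict


def borda_winner(rankings):
    """Return (winner, scores) under a standard Borda count.

    rankings: list of lists, each one stakeholder's options ordered best to worst.
    Assumption: every ranking is a full strict ordering of the same options.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, option in enumerate(ranking):
            # Top choice earns n-1 points, last choice earns 0.
            scores[option] += n - 1 - position
    # Assumed tie-break: highest score first, then alphabetical order.
    winner = min(scores, key=lambda o: (-scores[o], o))
    return winner, dict(scores)


def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum formulation over the sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def agreement_rate(decisions_a, decisions_b):
    """Fraction of paired decisions on which two methods pick the same option."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("decision lists must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return matches / len(decisions_a)


if __name__ == "__main__":
    # Three hypothetical stakeholders ranking options A, B, C.
    rankings = [["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]]
    winner, scores = borda_winner(rankings)
    print(winner, scores)
    print(gini([0, 0, 0, 1]))
    print(agreement_rate(["A", "B", "C"], ["A", "C", "C"]))
```

Under these assumptions, a lower Gini value for the utilities produced by one method than by another corresponds to the kind of equity gain reported above, while the agreement rate corresponds to the share of paired decisions on which the two methods coincide.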