We test whether populations of large language model (LLM) agents reproduce the quantitative predictions of classical evolutionary game theory in three canonical 2 × 2 games (Hawk–Dove, Stag Hunt, and a pure coordination game) across five network topologies (complete, Erdős–Rényi, Barabási–Albert, Watts–Strogatz, and a 2D lattice). The Hawk–Dove game provides the headline test: classical theory (Maynard Smith, 1982) predicts an evolutionarily stable strategy (ESS) at Hawk frequency x*_H = V/C, a prediction confirmed for biological populations and for reinforcement learning agents on networks but not yet, to our knowledge, for frontier LLMs. We complement this with a Stag Hunt experiment that probes whether the Pareto-selection rule observed in dyadic LLM bargaining (Drakos, 2026) survives at population scale, and with a pure coordination experiment that benchmarks convention emergence. We formulate four falsifiable propositions and report empirical findings from ~37,500 game decisions across 150 independent runs. Three findings emerge. First, frontier LLM populations approximate the Hawk–Dove ESS at V/C = 2/3 on the complete graph K_25 (5-seed mean 0.643; 95% CI [0.612, 0.674], which contains the theoretical 0.667), but the equilibrium Hawk frequency declines systematically as network sparseness or clustering increases (one-way ANOVA across the five topologies: F(4, 20) = 7.19, p = 0.0009). Second, a V/C parameter sweep across five additional ratios reveals an attenuated parametric response: the empirical Hawk frequency tracks the theoretical V/C with slope 0.50 ± 0.04 (95% CI [0.43, 0.58]; R² = 0.83), strongly rejecting the slope-1 null (t(38) = −13.35, p < 10⁻¹⁵). The LLM population compresses toward a 0.5 action balance regardless of payoff parameters, over-Hawking below V/C = 0.5 and under-Hawking above. Third, in Stag Hunt all twenty-five runs reach 100% Stag (perfect Pareto-selection), and in pure coordination the population deterministically selects whichever action is listed first in the prompt enumeration (29/29 runs).
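To make the theoretical benchmark x*_H = V/C concrete, the following is a minimal sketch of the classical well-mixed prediction, not the experimental pipeline used with LLM agents. It assumes the standard textbook Hawk–Dove payoffs (Hawk vs. Hawk: (V − C)/2; Hawk vs. Dove: V; Dove vs. Hawk: 0; Dove vs. Dove: V/2), which the text above does not spell out, and integrates the replicator equation with a simple Euler step. The function names, step size, and starting frequencies are illustrative choices.

```python
def hawk_dove_payoffs(V, C):
    """Row-player payoffs for the textbook Hawk-Dove game (assumed form)."""
    return {
        ("H", "H"): (V - C) / 2.0,  # two Hawks split V and pay the fight cost C
        ("H", "D"): V,              # Hawk takes the whole resource from a Dove
        ("D", "H"): 0.0,            # Dove retreats and gets nothing
        ("D", "D"): V / 2.0,        # two Doves share the resource
    }


def replicator_hawk_frequency(V, C, x0=0.1, dt=0.01, steps=20000):
    """Euler-integrate dx/dt = x (1 - x) (f_H - f_D) in a well-mixed population.

    x is the Hawk frequency; f_H and f_D are the expected payoffs of Hawk and
    Dove against a randomly drawn opponent.
    """
    p = hawk_dove_payoffs(V, C)
    x = x0
    for _ in range(steps):
        f_hawk = x * p[("H", "H")] + (1.0 - x) * p[("H", "D")]
        f_dove = x * p[("D", "H")] + (1.0 - x) * p[("D", "D")]
        x += dt * x * (1.0 - x) * (f_hawk - f_dove)
    return x


if __name__ == "__main__":
    V, C = 2.0, 3.0  # V/C = 2/3, the ratio used in the headline test
    for start in (0.1, 0.9):
        x_final = replicator_hawk_frequency(V, C, x0=start)
        print(f"start={start:.1f}  final Hawk frequency={x_final:.4f}  V/C={V / C:.4f}")
```

Under these assumptions the payoff difference reduces to f_H − f_D = (V − xC)/2, which vanishes at x = V/C and is negative above it, so the interior rest point is stable whenever V < C. The empirical Hawk frequencies reported above are compared against this benchmark.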
Stefanos Drakos (Thu,) studied this question.