What question did this study set out to answer?

To investigate how organisational structure affects the performance of multi-agent AI systems in a controlled setting.

April 11, 2026 · Open Access

Same City, Different Architects: How Organisational Structure Shapes Collective AI Output Quality, Process, and Emergent Behaviour

Key Points

Investigated how organisational structure affects the performance of multi-agent AI systems in a controlled setting.
Conducted a controlled experiment with five teams of AI agents, each with different organisational structures.
Teams were tasked with designing a city on a 10×10 grid using the same tools and token budget.
Measured output quality, communication volume, and the emergence of specialisation in team dynamics.
Aggregate quality scores varied from 70.9 to 79.4 across teams.
The self-organised team achieved the highest quality score and the fewest merge conflicts (2, versus 31 for the competitive team).
Communication volume varied roughly sixfold across structures (14 to 81 messages), with no correlation to output quality.
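The experimental design described in these points can be sketched as a small configuration structure. This is only an illustration, not the study's released code: the names `TeamConfig`, `STRUCTURES`, and `varying_fields` are hypothetical, and the `model` and `token_budget` values are placeholders because the summary does not state them. The sketch shows the key property of the design, namely that organisational structure is the only field that differs between teams.

```python
from dataclasses import dataclass, fields

# The five organisational structures named in the abstract.
STRUCTURES = [
    "collaborative",
    "competitive",
    "hierarchical",
    "meritocratic",
    "self-organised",
]


@dataclass(frozen=True)
class TeamConfig:
    """Hypothetical container for one team's setup (not the authors' schema)."""
    structure: str                 # the only field that varies between teams
    n_agents: int = 4              # four agents per team, per the abstract
    grid_size: tuple = (10, 10)    # the 10x10 city grid
    model: str = "unspecified"     # placeholder: the summary does not name the model
    token_budget: object = None    # placeholder: the budget value is not given


teams = [TeamConfig(structure=s) for s in STRUCTURES]


def varying_fields(configs):
    """Return the names of fields whose values differ across the configs."""
    return [
        f.name
        for f in fields(configs[0])
        if len({getattr(c, f.name) for c in configs}) > 1
    ]


print(varying_fields(teams))  # ['structure']
```

A check like `varying_fields` is one way to confirm that a comparison is controlled: if it returned more than one field name, differences in outcome could not be attributed to structure alone.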

Abstract

Multi-agent AI frameworks have proliferated rapidly, yet the field has largely treated coordination strategy as a fixed engineering choice — not a variable to be measured. What if the team structure itself determines the output? We present the first controlled experiment systematically measuring the effect of organisational structure on multi-agent AI performance. Five teams of four AI agents — each configured with a different organisational structure (collaborative, competitive, hierarchical, meritocratic, and self-organised) — were given an identical open-ended design task: build a city on a 10×10 grid. Everything was held constant except the organisational configuration: the same model, same tools, same token budget, same task. The results are unambiguous. The five teams produced visually distinct cities with aggregate quality scores ranging from 70.9 to 79.4. The self-organised team — which designed its own structure through proposals and votes — achieved the highest score, the fewest merge conflicts (2 vs. 31 for competitive), and the highest emergent specialisation. Communication volume varied 6× across configurations (14 to 81 messages) with no correlation to output quality. Agents spontaneously invented evaluation tools, developed trust relationships that decayed differently under each structure, and in one case voted to abandon their assigned competitive structure when it proved dysfunctional. These findings establish organisational configuration as a measurable, optimisable variable for multi-agent AI — one whose implications extend from 4-agent teams to AGI-scale collectives. We release all code, configurations, data, and agent conversation logs as open-source infrastructure for Collective Intelligence Engineering.
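The headline figures in the abstract can be checked with simple arithmetic. The sketch below uses only the numbers reported above; the variable names are illustrative. It shows that the "6×" communication ratio is a rounding of roughly 5.8, and makes the quality spread and merge-conflict ratio explicit.

```python
# Figures reported in the abstract.
quality_min, quality_max = 70.9, 79.4                    # aggregate quality scores
messages_min, messages_max = 14, 81                      # messages per configuration
conflicts_self_organised, conflicts_competitive = 2, 31  # merge conflicts

# Spread between the best and worst aggregate quality scores.
quality_spread = round(quality_max - quality_min, 1)

# Ratio of the most talkative to the least talkative configuration.
message_ratio = messages_max / messages_min

# Ratio of competitive to self-organised merge conflicts.
conflict_ratio = conflicts_competitive / conflicts_self_organised

print(f"quality spread: {quality_spread} points")   # 8.5 points
print(f"message ratio:  {message_ratio:.2f}x")      # 5.79x
print(f"conflict ratio: {conflict_ratio:.1f}x")     # 15.5x
```

The per-team scores and message counts are not given in this summary, so the reported lack of correlation between communication volume and quality cannot be recomputed from these figures alone.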

Cite This Study

Mark E. Mala studied this question.

synapsesocial.com/papers/69d9e6b078050d08c1b76f73 https://doi.org/10.5281/zenodo.19479947
