As enterprise adoption of large language model (LLM) -based agents accelerates, practitioners face a critical gap: no systematic framework exists for characterizing how domain specialization, coordination topology, context persistence, authority boundaries, and escalation protocols interact across production deployments. This paper makes three contributions. First, we propose a five- dimensional taxonomy for domain-specialized agent systems derived from foundational multi- agent systems literature. Second, we apply this taxonomy to a corpus of 26 real-world systems — cspanning open-source frameworks, commercial platforms, and documented production deployments — observed between 2024 and 2026. Third, we derive nine design principles from cross-cutting empirical patterns, including the finding that 96. 2% of production systems implement formal escalation protocols, while the single documented system lacking escalation machinery suffered a 3. 2M fraud incident. Our analysis reveals that functional specialization (50. 0%) and hierarchical coordination (42. 3%) dominate current enterprise deployments, that long-horizon context persistence is increasingly standard (53. 8%), and that advisory authority levels represent a deliberate governance constraint in high-stakes regulated domains. These findings carry direct implications for practitioners designing enterprise agent systems and for researchers characterizing the emerging production landscape.
Sebastian Kirsch (Sun,) studied this question.