What question did this study set out to answer?

The aim is to develop a taxonomy for domain-specialized agent systems and evaluate their production characteristics.

February 24, 2026 · Open Access

Domain-Specialized Agent Systems in Enterprise AI: A Taxonomy and Empirical Analysis of 26 Production Systems

Key Points

Developed a five-dimensional taxonomy for agent systems.
Analyzed 26 real-world production systems including open-source and commercial platforms.
Derived nine design principles from cross-cutting empirical patterns across the systems.
96.2% of systems implement formal escalation protocols.
50.0% of systems exhibit functional specialization.
42.3% show hierarchical coordination.
53.8% use long-horizon context persistence.
The single documented system lacking escalation protocols suffered a $3.2M fraud incident.

Abstract

As enterprise adoption of large language model (LLM)-based agents accelerates, practitioners face a critical gap: no systematic framework exists for characterizing how domain specialization, coordination topology, context persistence, authority boundaries, and escalation protocols interact across production deployments. This paper makes three contributions. First, we propose a five-dimensional taxonomy for domain-specialized agent systems derived from foundational multi-agent systems literature. Second, we apply this taxonomy to a corpus of 26 real-world systems, spanning open-source frameworks, commercial platforms, and documented production deployments, observed between 2024 and 2026. Third, we derive nine design principles from cross-cutting empirical patterns, including the finding that 96.2% of production systems implement formal escalation protocols, while the single documented system lacking escalation machinery suffered a $3.2M fraud incident. Our analysis reveals that functional specialization (50.0%) and hierarchical coordination (42.3%) dominate current enterprise deployments, that long-horizon context persistence is increasingly standard (53.8%), and that advisory authority levels represent a deliberate governance constraint in high-stakes regulated domains. These findings carry direct implications for practitioners designing enterprise agent systems and for researchers characterizing the emerging production landscape.
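The percentages reported above can be read as system counts. The sketch below backs out those counts under the assumption that every percentage is a share of the full 26-system corpus; the abstract does not state the denominators explicitly, and the names `CORPUS_SIZE`, `reported_percentages`, and `implied_count` are illustrative, not taken from the paper.

```python
# Back out the system counts implied by the reported percentages.
# Assumption (not stated in the abstract): each percentage is a share
# of the full 26-system corpus.
CORPUS_SIZE = 26

reported_percentages = {
    "formal escalation protocols": 96.2,
    "functional specialization": 50.0,
    "hierarchical coordination": 42.3,
    "long-horizon context persistence": 53.8,
}


def implied_count(percentage: float, total: int = CORPUS_SIZE) -> int:
    """Return the whole number of systems closest to the reported share."""
    return round(percentage / 100 * total)


implied_counts = {
    name: implied_count(pct) for name, pct in reported_percentages.items()
}

for name, count in implied_counts.items():
    # Recompute the share from the count to check it matches the report.
    share = 100 * count / CORPUS_SIZE
    print(f"{name}: {count}/{CORPUS_SIZE} = {share:.1f}%")
```

Under that assumption, the figures correspond to 25, 13, 11, and 14 of the 26 systems, which is consistent with exactly one system lacking formal escalation protocols.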
