June 17, 2026 · Open Access

The Systems Architecture of LLM Multi-Agent Systems: Routing, Memory, and Resource Optimization

Abstract

Large Language Model (LLM)-based multi-agent systems have emerged as a promising paradigm for solving complex reasoning and decision-making tasks through coordinated agent collaboration. However, scaling such systems introduces significant challenges related to communication overhead, token consumption, memory management, orchestration efficiency, and operational cost. This survey presents a systematic architectural analysis of modern LLM multi-agent systems, focusing on five interconnected dimensions: orchestration topologies, routing mechanisms, distributed state management, resource optimization, and failure attribution. The paper synthesizes recent research on dynamic Directed Acyclic Graph (DAG) routing, semantic communication pathways, decentralized agent swarms, context virtualization techniques, tiered memory architectures, budget-aware routing strategies, and systems-level benchmarking methodologies. Additionally, the survey discusses emerging research directions including hardware-aware agent routing and automated failure attribution, highlighting the growing importance of infrastructure-aware orchestration for production-scale agentic AI systems. This work provides a structured taxonomy and comprehensive review intended for researchers, practitioners, and students working on large-scale multi-agent AI architectures and agentic systems.
