What does this research mean for the field?

In this exploratory study, MCP layering (L0 to L3) reduced token usage by 47% and improved accuracy by 37% in AI agent workflows on CPU-only commodity hardware. Novelty: novel finding. Consensus alignment: establishes a new direction.

What question did this study set out to answer?

This research aims to establish Pylon-7 as a systematic reference model for analyzing AI agent workflows.

March 1, 2026 · Open Access

Pylon-7: A 7-Layer Reference Model for AI Agent Workflows -- Exploratory Study on MCP Layering for Efficiency, Accuracy, and Safety in Commodity Hardware Environments

Key Points

Proposed a 7-layer reference model for AI agents inspired by the OSI model.
Conducted 615 experimental runs across 10 infrastructure operation scenarios using the Qwen 2.5 7B and GPT-OSS 20B models.
Explored efficiency, accuracy, and safety at 5 MCP depth levels (L0-L4) on CPU-only systems.
MCP layering (L0 to L3) reduced token usage by 47% while increasing accuracy by 37%.
MCP depth level L3 (structured output plus candidate actions) was the sweet spot.
A 7B model combined with MCP was 3.5x cheaper and 14.4 percentage points more accurate than a 20B model alone.
Tasks that were practically impossible without MCP achieved perfect performance with MCP.
Above L3, both the 7B and 20B models achieved accuracy of 0.93 or higher, suggesting that MCP structure may matter more than model size.

Abstract

Large Language Model (LLM)-based AI agents have advanced beyond simple text generation to perform real-world tasks using external tools. However, there remains no systematic reference model for analyzing agent workflows. This paper proposes Pylon-7, a 7-layer reference model inspired by the OSI model that decomposes AI agent workflows into seven independent layers, positioning the Model Context Protocol (MCP) as the inter-layer gateway. We conducted 615 experimental runs (465 main + 50 L2.5 ablation + 100 L3 component decomposition ablation) across 10 infrastructure operation scenarios at 5 MCP depth levels (L0-L4) using Qwen 2.5 7B and GPT-OSS 20B models on commodity hardware (CPU-only, no GPU) to explore the impact of MCP layering on efficiency (token cost), accuracy (task quality), and safety (privilege control). Key findings: (1) MCP layering (L0→L3) reduced tokens by 47% while improving accuracy by 37%. (2) L3 (structured output + candidate actions) is the sweet spot. (3) A small model (7B) combined with MCP was 3.5x cheaper and 14.4 percentage points more accurate than a large model (20B) alone. (4) Tasks practically impossible without MCP achieved perfect performance with MCP. (5) Above L3, both 7B and 20B models achieved 0.93+ accuracy, suggesting MCP structure may dominate over model size.
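
The run breakdown and the percentage-point comparison in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' tooling: the run counts come from the abstract, while the function names and the two example accuracies (0.90 and 0.75) are hypothetical values chosen only to illustrate how a percentage-point gap differs from a relative gain.

```python
# Run counts as stated in the abstract (main runs plus two ablations).
RUN_COUNTS = {
    "main": 465,
    "l2_5_ablation": 50,
    "l3_component_ablation": 100,
}


def total_runs(counts):
    """Sum the per-group run counts."""
    return sum(counts.values())


def percentage_point_gap(acc_a, acc_b):
    """Absolute gap between two accuracies in [0, 1], in percentage points."""
    return (acc_a - acc_b) * 100


def relative_gain_percent(acc_a, acc_b):
    """Gain of acc_a over acc_b, as a percentage of acc_b."""
    return (acc_a - acc_b) / acc_b * 100


if __name__ == "__main__":
    # The three groups should add up to the 615 runs reported.
    print("total runs:", total_runs(RUN_COUNTS))

    # Hypothetical accuracies, NOT taken from the paper.
    small_with_mcp, large_alone = 0.90, 0.75
    print("gap (percentage points):",
          round(percentage_point_gap(small_with_mcp, large_alone), 1))
    print("relative gain (%):",
          round(relative_gain_percent(small_with_mcp, large_alone), 1))
```

The distinction matters when reading finding (3): a gap stated in percentage points is an absolute difference between two accuracy scores, which is generally not the same number as the relative improvement.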
