Modern cloud infrastructure engineering demands simultaneous mastery of heterogeneous Infrastructure-as-Code (IaC) frameworks, complex security credential models, and multi-platform orchestration, a burden that conventional CI/CD tooling addresses only partially. We present the AI Infrastructure Control Plane (AICP), a novel system that enables intent-driven, policy-gated, and autonomously self-healing infrastructure orchestration across Amazon Web Services (AWS), Databricks, Apache Spark, Kubernetes (Amazon EKS), and Snowflake. AICP introduces five interlocking components: (1) a Natural Language Intent Compiler (NLIC) that transforms free-form operator requests into a formally defined Deployment Intent Graph (DIG), a directed acyclic graph (DAG) encoding platform-annotated resource specifications and dependency constraints; (2) a Model Context Protocol (MCP) Agent Mesh of platform-specialized Large Language Model (LLM) agents that communicate over standardized JSON-RPC 2.0, so that new platforms can be added without modifying the orchestration layer; (3) a Policy-Gated Execution Engine (PGEE) that enforces mandatory pre-deployment validation, including AWS STS-scoped ephemeral credential acquisition, structured IaC plan generation, evaluation against an organizational policy corpus, and human-in-the-loop approval gating; (4) a Cross-Platform Deployment Orchestrator (CPDO) that traverses the DIG in topological order and automatically binds outputs across platforms; and (5) an Autonomous Drift Detection and Self-Healing Engine (ADSHE) that runs a continuous reconciliation loop with LLM-based root cause analysis and confidence-scored remediation plan generation. AICP demonstrates that LLM agents operating over a protocol-standardized tool mesh, combined with declarative intent compilation and autonomous drift correction, constitute a viable, production-grade paradigm for enterprise infrastructure lifecycle management.
Harish Babu Guttikonda studied this question.
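To make the Deployment Intent Graph and its topological-sort-based traversal with cross-platform output binding more concrete, the following is a minimal sketch, not the AICP implementation. The `Node` fields, the `bindings` mapping, and the `deploy` stub are hypothetical names introduced only for illustration; the abstract does not specify the DIG schema or the agent interface. The ordering relies on Python's standard-library `graphlib.TopologicalSorter`.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Node:
    """One resource in a hypothetical Deployment Intent Graph (DIG)."""
    name: str
    platform: str  # illustrative label such as "aws" or "snowflake"
    depends_on: list = field(default_factory=list)
    # Maps an input parameter to (upstream node name, upstream output key).
    bindings: dict = field(default_factory=dict)


def deploy(node, inputs):
    """Stand-in for a platform agent call (assumption): provisions nothing,
    only returns fake outputs so that bindings can be demonstrated."""
    return {"id": f"{node.platform}:{node.name}", "inputs": inputs}


def run(nodes):
    """Visit nodes in dependency order, binding upstream outputs to inputs."""
    by_name = {n.name: n for n in nodes}
    sorter = TopologicalSorter({n.name: n.depends_on for n in nodes})
    outputs, order = {}, []
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    for name in sorter.static_order():
        node = by_name[name]
        inputs = {
            param: outputs[src][key]
            for param, (src, key) in node.bindings.items()
        }
        outputs[name] = deploy(node, inputs)
        order.append(name)
    return order, outputs


# Declared out of order on purpose; the sorter recovers a valid order.
dig = [
    Node("warehouse", "snowflake", depends_on=["bucket"],
         bindings={"stage_id": ("bucket", "id")}),
    Node("bucket", "aws"),
    Node("job", "databricks", depends_on=["bucket", "warehouse"],
         bindings={"sink_id": ("warehouse", "id")}),
]
order, outputs = run(dig)
print(order)
```

In this toy graph the only valid order is `bucket`, `warehouse`, `job`, and each downstream node receives the identifier produced by its upstream dependency. A real orchestrator would additionally need policy gating, credential handling, and failure recovery, none of which are modeled here.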