Industrial processes must operate robustly in unpredictable environments, where errors are costly and difficult to detect. AI-based control systems offer a path forward but typically rely on large, labeled datasets, which limits their generalization to variable, data-scarce settings. Foundation models promise broader reasoning and knowledge integration, yet they struggle to deliver the quantitative precision required in engineering. Here, we introduce Control and Interpretation of Production via Hybrid Expertise and Reasoning (CIPHER): a systems-level vision-language-action (VLA) framework designed for industrial perception, explanation, and control. CIPHER integrates a process expert for quantitative characterization of system states with retrieval-augmented reasoning grounded in process physics and knowledge. This hybrid design enables strong generalization to out-of-distribution tasks, allowing the agent to interpret textual or visual inputs, explain its decisions, and autonomously generate precise machine instructions without explicit supervision. In this work, CIPHER is deployed within multiple manufacturing systems, where it demonstrates precise, context-aware, and transparent control, indicating its potential for deployment in real industrial environments.
Pattinson et al. studied this question.