What question did this study set out to answer?

This research aims to assess the effectiveness of multi-agent large language models in various Process Systems Engineering tasks.

March 18, 2026 · Open Access

Improving process systems engineering with specialized multi-agent large language models

Key Points

Assessed the effectiveness of multi-agent large language models (LLMs) on Process Systems Engineering (PSE) tasks.
Compared two multi-agent systems (Dyad 2.1 and Claude Opus 4.6) with single-agent baselines (ChatGPT 5.2 and Google Gemini Pro 3).
Conducted three case studies focusing on soft sensing, mechanistic modeling, and nonlinear model predictive control.
Developed models for ATR–FTIR calibration, population balance, and NMPC for batch crystallization.
All LLMs produced ATR–FTIR calibration models with near-linear parity and $R^2$ close to unity across training, validation, and test sets.
Multi-agent systems showed stronger validation performance in population balance models for potassium sulfate dissolution.
NMPC formulations regulated mean crystal size and crystal mass, achieving near set-point tracking with moderate control effort over five scenarios.

Abstract

Multi-agent large language models (LLMs) were evaluated for Process Systems Engineering (PSE) tasks. Two multi-agent systems (Dyad 2.1 and Claude Opus 4.6) were compared with two single-agent baselines (ChatGPT 5.2 and Google Gemini Pro 3) in three crystallization-centered case studies spanning soft sensing, mechanistic modeling, and nonlinear model predictive control (NMPC). In Case Study 1, ATR–FTIR calibration models were built to predict paracetamol mole fraction from spectra, temperature, and solvent composition. All LLMs converged to latent-variable linear chemometric models and achieved near-linear parity with $R^2$ close to unity across training, validation, and test sets. In Case Study 2, all systems reconstructed a moment-based population balance model (PBM) coupled to a solute mass balance for potassium sulfate dissolution and seeded crystallization, but robustness differed. For dissolution, multi-agent workflows maintained stronger validation performance, while the single-agent baseline showed the largest degradation, including strongly reduced $R^2$ for the moments of the crystal size distribution (CSD). For crystallization, only the multi-agent PBMs reproduced the dynamics consistently and were retained for model-updating tests under dataset shift. In Case Study 3, LLMs proposed NMPC formulations for batch crystallization of potassium sulfate, in which temperature was manipulated to regulate mean size $\bar{L}_{10}$ and crystal mass $m$ under constraints. All NMPCs were implementable at a 1 min sampling time and achieved near set-point tracking with moderate control effort over five scenarios; Dyad provided the best overall closed-loop performance.

• Specialized multi-agent LLMs were evaluated on chemical engineering tasks.
• LLMs were applied to develop solutions for sensing, modeling, and nonlinear control.
• ATR–FTIR calibration models achieve $R^2 > 0.97$ with automated feature selection.
• Population balance models are iteratively refined and recover correct equilibrium behavior.
• NMPC formulations reach set-points with computation times below 20 s per control move.
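To make the Case Study 2 model class concrete, the following is a minimal sketch of a moment-based PBM coupled to a solute mass balance. It is not the model produced in the study. The power-law kinetics, the constant (isothermal) solubility, the neglect of nucleus size, and every parameter value and name (`Params`, `rhs`, `simulate`, `mean_size`) are assumptions chosen for illustration. The mean size is taken here as the ratio of the first to the zeroth moment, which is one common reading of $\bar{L}_{10}$. The sketch also does not handle the complete disappearance of crystals during dissolution.

```python
from dataclasses import dataclass


@dataclass
class Params:
    # All values are hypothetical placeholders, not fitted parameters from the study.
    kg: float = 1.0        # growth rate constant [um/min]
    g: float = 1.5         # growth order
    kb: float = 10.0       # nucleation rate constant [#/(g solvent * min)]
    b: float = 2.0         # nucleation order
    kd: float = 2.0        # dissolution rate constant [um/min]
    d: float = 1.0         # dissolution order
    rho_kv: float = 1e-12  # crystal density times volume shape factor [g/um^3]
    c_sat: float = 0.12    # solubility [g solute/g solvent], held constant here


def rhs(state, p):
    """Time derivatives of [mu0, mu1, mu2, mu3, c]."""
    mu0, mu1, mu2, mu3, c = state
    s = (c - p.c_sat) / p.c_sat        # relative supersaturation
    if s >= 0.0:                       # crystallization: growth plus nucleation
        rate = p.kg * s ** p.g
        birth = p.kb * s ** p.b
    else:                              # dissolution: crystals shrink, no nucleation
        rate = -p.kd * (-s) ** p.d
        birth = 0.0
    return [
        birth,                          # d(mu0)/dt
        rate * mu0,                     # d(mu1)/dt
        2.0 * rate * mu1,               # d(mu2)/dt
        3.0 * rate * mu2,               # d(mu3)/dt
        -3.0 * p.rho_kv * rate * mu2,   # solute lost equals crystal mass gained
    ]


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return [x + h * kx for x, kx in zip(state, k)]

    k1 = rhs(state, p)
    k2 = rhs(shifted(k1, dt / 2), p)
    k3 = rhs(shifted(k2, dt / 2), p)
    k4 = rhs(shifted(k3, dt), p)
    return [
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def simulate(state0, p, dt, n_steps):
    """Integrate the moment equations for n_steps steps of size dt [min]."""
    state = list(state0)
    for _ in range(n_steps):
        state = rk4_step(state, p, dt)
    return state


def mean_size(state):
    """Number-weighted mean crystal size, mu1 / mu0 [um]."""
    return state[1] / state[0]


# Seeded crystallization from a supersaturated solution (hypothetical initial state).
seeded = [1e3, 1e5, 1e7, 1e9, 0.15]
final = simulate(seeded, Params(), dt=0.1, n_steps=600)
print("mean size [um]:", mean_size(final), "concentration:", final[4])
```

A useful sanity check for any model of this form is that the total solute, `c + rho_kv * mu3`, stays constant over the batch, since the solute balance is written as the exact negative of the crystal mass gain.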
