Buildings account for a substantial share of global energy use, yet advanced optimal control strategies are still rarely adopted because of their high computational cost and the difficulty of deploying them safely. This paper presents a fully Python-based, data-driven deep reinforcement learning (DRL) supervisory control framework that uses gray-box surrogate modeling and imitation learning to overcome these barriers. The novelty of this work lies in integrating an ontology-based Twin4Build surrogate model with imitation learning and DRL, so that building control policies can be trained efficiently in a low-cost environment before being transferred to a high-fidelity BOPTEST emulator. Results show that accepting a lower-accuracy surrogate accelerates training by a factor of 11 compared with high-fidelity models. Furthermore, the DRL agent learned load-shifting and peak-shaving strategies, eliminating start-up power spikes and achieving energy savings of up to 28.9%. Beyond these energy savings, the pipeline yields a calibrated digital twin suitable for ongoing building services such as anomaly detection, offering a scalable path toward real-world smart building optimization.
Cubides et al. (Fri,) studied this question.