Data-driven methods are the most studied type of fault detection and diagnostics (FDD) for building HVAC systems. However, most studies rely on labeled data for specific faults, which are hard to find and collect in real systems. Fault-free data is easier to collect, but labeling it is still time consuming for large systems. Moreover, most studies use supervised learning algorithms, which do not generalize well beyond the training data and therefore struggle to detect unseen faults. In this paper, we define a methodology for HVAC FDD based on self-supervised learning with a Transformer encoder, and we test it on a real case study. Portions of the multivariate time-series data are masked using a two-state Markov chain, and the model is trained to predict the concealed segments. Because it does not depend on labeled data, this approach offers a scalable solution for practical HVAC applications. Anomalies are labeled using the Peak Over Threshold (POT) method, which determines thresholds dynamically by fitting reconstruction errors to a generalized Pareto distribution. Subsequent fault diagnostics focus on the features with the largest reconstruction errors, pinpointing potential HVAC malfunctions. The methodology reduces dependence on labeled datasets and improves the model's generalization, facilitating the detection of previously unobserved faults. The approach was applied to data from a real building. As a result, multiple faults were detected, mainly caused by malfunctions of the monitoring system. The model demonstrates the ability to detect both sequential and individual faults. The period from October 19th to December 23rd was detected as a fault period because the monitoring system caused a change in the trend of the data.

• Introduced a self-supervised learning method for HVAC fault detection using a Transformer encoder.
• Employed strategic data masking with a Markov chain approach, eliminating the need for labeled data.
• Utilized the Peak Over Threshold method for dynamic, data-driven anomaly threshold determination.
• Demonstrated the model's efficacy in a real-world case study, detecting multiple HVAC faults.
• The proposed method enhances fault detection accuracy and generalizability, reducing reliance on labeled datasets.
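
The two-state Markov chain masking mentioned above can be illustrated with a minimal sketch. The abstract does not give the transition probabilities, so the parameters here (`mask_ratio`, `mean_mask_len`) and the function name `markov_mask` are assumptions chosen for illustration: each variable alternates between a "visible" and a "masked" state, with transition probabilities set so that the long-run share of masked samples equals `mask_ratio` and masked runs last `mean_mask_len` steps on average.

```python
import numpy as np

def markov_mask(n_steps, n_features, mask_ratio=0.15, mean_mask_len=3.0, seed=0):
    """Return a boolean array (n_steps, n_features); True marks a masked value.

    Hypothetical parameterization: the paper only states that a two-state
    Markov chain is used, not how its probabilities are chosen.
    """
    rng = np.random.default_rng(seed)
    # P(masked -> visible): masked runs are geometric with mean mean_mask_len.
    p_leave_mask = 1.0 / mean_mask_len
    # P(visible -> masked): chosen so the stationary masked share is mask_ratio.
    p_enter_mask = p_leave_mask * mask_ratio / (1.0 - mask_ratio)

    mask = np.zeros((n_steps, n_features), dtype=bool)
    for j in range(n_features):
        # Start each variable from the stationary distribution of the chain.
        masked = bool(rng.random() < mask_ratio)
        for t in range(n_steps):
            mask[t, j] = masked
            p_switch = p_leave_mask if masked else p_enter_mask
            if rng.random() < p_switch:
                masked = not masked
    return mask

# Usage: hide the masked entries of a (time, feature) window before encoding.
window = np.random.default_rng(1).normal(size=(96, 5))
mask = markov_mask(*window.shape)
model_input = np.where(mask, 0.0, window)  # masked values replaced by zeros
```

During training, the reconstruction loss would typically be evaluated only at the positions where `mask` is true, so the model learns to infer the hidden segments from their context.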
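
The Peak Over Threshold step can be sketched in a similar way. This is an illustrative simplification, not the authors' implementation: the excesses over an initial high quantile are fitted to a generalized Pareto distribution by the method of moments (POT implementations commonly use maximum likelihood instead), and the final threshold is the level expected to be exceeded with probability `risk`. The names `pot_threshold`, `init_quantile`, and `risk`, and their default values, are assumptions.

```python
import numpy as np

def pot_threshold(errors, init_quantile=0.98, risk=1e-3):
    """Anomaly threshold from a generalized Pareto fit to the error tail.

    Assumption: method-of-moments estimates, which are valid only when the
    shape parameter is below 0.5.
    """
    errors = np.asarray(errors, dtype=float)
    t = np.quantile(errors, init_quantile)   # initial (empirical) threshold
    excess = errors[errors > t] - t          # peaks over the initial threshold
    if excess.size < 2 or excess.var(ddof=1) == 0.0:
        return float(t)                      # too little tail data to fit

    m, v = excess.mean(), excess.var(ddof=1)
    gamma = 0.5 * (1.0 - m * m / v)          # shape estimate
    sigma = 0.5 * m * (m * m / v + 1.0)      # scale estimate

    # Solve P(error > z) = risk under the fitted tail model.
    r = risk * errors.size / excess.size
    if abs(gamma) < 1e-8:                    # exponential-tail limit
        return float(t - sigma * np.log(r))
    return float(t + sigma / gamma * (r ** (-gamma) - 1.0))

# Usage with synthetic reconstruction errors (placeholder data, not from the paper).
errors = np.random.default_rng(0).exponential(scale=1.0, size=20000)
z = pot_threshold(errors)
is_anomaly = errors > z
```

Recomputing the threshold on a sliding or growing window of errors is one way to make it adapt over time; the abstract does not specify which variant is used.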
Abdollah et al. (Wed,) studied this question.