Predictive maintenance (PdM) is a key application of the industrial IoT, using sensor-acquired time-series data to forecast equipment failures that cost billions of dollars annually. Time-series anomaly detection (TSAD) offers a promising route to PdM, yet prior work is constrained by: i) reliance on private datasets or case studies, requiring extensive domain-specific engineering; ii) evaluation of only a small number of algorithms, limiting comparative insight; iii) use of a single implementation strategy per algorithm, hindering potential improvements; iv) emphasis on supervised solutions, necessitating annotated data that are costly or unavailable in practice; and v) neglect of runtime and online applicability, raising questions about deployability. To address these limitations, we present an extensive experimental study with rigorous statistical analysis conducted within a common evaluation framework. Our objectives are to: i) provide insights into the accuracy, robustness, and behavior of TSAD techniques in an online PdM setting across four implementation strategies; ii) analyze effectiveness–runtime trade-offs; and iii) rate dataset difficulty and characterize the forms exhibited by anomalies preceding machine breakdowns. Our findings indicate that most industrial cases benefit from an initial calibration phase for operational data collection. Well-established traditional TSAD methods deliver the best trade-off among effectiveness, runtime efficiency, and the ability to predict individual failure types; pre-trained LLMs are statistically outperformed in this context, suggesting that there is still room for improvement in that direction. Our study serves as a significant step towards disseminating TSAD for PdM, and we open-source our work to support future research.
Papadopoulos et al. (Mon,) studied this question.