Modern software applications generate a wide range of runtime metrics, which are vital to many quality assurance activities. These data are often recorded and aggregated as time series to observe patterns and trends of various runtime aspects over time. In this context, Time Series Forecasting (TSF) offers unique opportunities for predicting software runtime behavior and identifying potential anomalies. Although TSF models have been successfully applied in fields such as economics and climatology, their capabilities for forecasting software runtime metrics remain relatively underexplored. In this paper, we conduct a comprehensive empirical evaluation of 8 TSF models on 110 real-world software runtime metrics recorded over the course of about one year. Our evaluation encompasses three classical statistical models, three neural network models, and two time series foundation models. Results show that the foundation models achieve state-of-the-art performance on TSF of software runtime metrics, outperforming other models with strong statistical significance. Our findings indicate that foundation models, despite being trained exclusively on time series data from other domains, can effectively generalize to software runtime metrics in a zero-shot setting. This makes them a convenient plug-and-play solution for practitioners and researchers aiming to integrate TSF into their software quality assurance processes. Yet, their performance is not uniformly superior across all the time series, underscoring the absence of a “silver bullet” solution.
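To make the idea of comparing forecasters on a runtime metric concrete, the following is a minimal sketch of a rolling-origin evaluation. It is not the paper's protocol: the two baseline forecasters (naive and seasonal naive), the mean absolute error metric, the window sizes, and the synthetic series are all assumptions chosen for illustration, and none of the eight models evaluated in the paper is reproduced here.

```python
from statistics import mean


def naive_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season):
    """Repeat the most recent full season of observations."""
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def rolling_origin_mae(series, forecaster, horizon, min_train):
    """Average MAE over successive, non-overlapping forecast windows.

    At each origin the forecaster sees only the values before that origin.
    """
    errors = []
    for origin in range(min_train, len(series) - horizon + 1, horizon):
        history = series[:origin]
        actual = series[origin:origin + horizon]
        errors.append(mean_absolute_error(actual, forecaster(history, horizon)))
    return mean(errors)


# Hypothetical runtime metric (assumption): a repeating pattern of period 4
# with a slow upward drift, standing in for something like a latency series.
pattern = [10.0, 20.0, 30.0, 20.0]
series = [pattern[t % 4] + 0.1 * (t // 4) for t in range(40)]

naive_mae = rolling_origin_mae(series, naive_forecast, horizon=4, min_train=8)
seasonal_mae = rolling_origin_mae(
    series,
    lambda history, horizon: seasonal_naive_forecast(history, horizon, season=4),
    horizon=4,
    min_train=8,
)

print(f"naive MAE:          {naive_mae:.3f}")
print(f"seasonal naive MAE: {seasonal_mae:.3f}")
```

On this synthetic series the seasonal baseline wins because the data is strongly periodic. On a metric without a stable period the ranking could flip, which is the kind of per-series variation the abstract describes when it notes that no model is uniformly superior.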
Federico Di Menna
Luca Traini
Vittorio Cortellessa
Journal of Systems and Software
University of L'Aquila
Di Menna et al. studied this question.
www.synapsesocial.com/papers/6a0808ffa487c87a6a40b0b4 — DOI: https://doi.org/10.1016/j.jss.2026.112937