Time series forecasting has evolved rapidly as data have become increasingly available in domains such as energy, healthcare, transportation, and finance. Despite advances in forecasting models, including statistical approaches, deep neural networks, and foundation models, benchmarking remains constrained by limited dataset variety, irregular structural complexity, and the lack of a unified evaluation framework. This research addresses these limitations by presenting a large-scale, unified, and heterogeneous time series data archive that spans 12 domains, encompasses up to 2 billion time points, and offers multiple temporal resolutions. It aims to design and implement a unified testbed that incorporates more than 100 datasets, balances univariate and multivariate series, and explicitly quantifies structural components such as trend, seasonality, stationarity, shifting, and transition. Beyond reflecting the scale and structural diversity of real-world time series, the testbed enables fair and reproducible evaluation of forecasting models. By providing both a full-scale version and a computationally efficient down-sampled version, this work lays the foundation for a new era of benchmarking in time series forecasting. It offers the research community a scalable and extensible infrastructure designed to accelerate the development of robust, adaptable, and generalizable prediction models aligned with real-world complexities.
Κυριακή Ε. Ποταμοπούλου (Wed,) studied this question.