What question did this study set out to answer?

This research aims to create a comprehensive framework for benchmarking time-series forecasting models across diverse datasets.

March 22, 2026 | Open Access

A unified, large-scale, heterogeneous testbed for time-series forecasting

Key Points

Implemented a unified testbed for time series containing over 100 datasets
Incorporated both univariate and multivariate series
Quantified explicit structural components such as trend and seasonality
Provided multiple temporal resolutions for extensive datasets
Introduced a data archive with up to 2 billion time points
Demonstrated a scalable framework for evaluating forecasting methods
Highlighted the importance of structural diversity in benchmarking

Abstract

Time series forecasting has seen rapid methodological evolution, driven by increasing data availability in domains such as energy, healthcare, transportation, and finance. Despite advances in forecasting models, including statistical approaches, deep neural networks, and foundation models, benchmarking remains constrained by limited dataset variety, irregular structural complexity, and the lack of a unified evaluation framework. This research addresses these limitations by presenting a large-scale, unified, and heterogeneous time series data archive that spans 12 domains, encompasses up to 2 billion time points, and offers multiple temporal resolutions. It designs and implements a unified testbed that incorporates more than 100 datasets, balances univariate and multivariate series, and explicitly quantifies structural components such as trend, seasonality, stationarity, shifting, and transition. Beyond reflecting the sheer scale and structural diversity of real-world time series, the archive enables fair and reproducible evaluation of forecasting models. By providing both a full-scale version and a computationally efficient down-sampled version, this work lays the foundation for a new era of benchmarking in time series forecasting. It offers the research community a scalable and extensible infrastructure designed to accelerate the development of robust, adaptable, and generalizable prediction models aligned with real-world complexities.
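The abstract says the testbed quantifies structural components such as trend and seasonality, but it does not say which measures are used. As an illustration only, the sketch below computes the widely used variance-based strength scores, 1 - Var(R) / Var(T + R) for trend and 1 - Var(R) / Var(S + R) for seasonality, on top of a classical additive moving-average decomposition. The decomposition choice and the function names `centered_moving_average` and `structural_strengths` are assumptions made for this example, not the authors' method.

```python
import math
from statistics import pvariance


def centered_moving_average(values, period):
    """Centered moving average; positions where the window does not fit stay None."""
    n = len(values)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        if period % 2 == 1:
            trend[i] = sum(window) / period
        else:
            # Even period: 2 x m average, the two end points get half weight.
            trend[i] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def structural_strengths(values, period):
    """Return (trend_strength, seasonal_strength), each in [0, 1].

    Assumes an additive model: value = trend + seasonal + remainder.
    """
    if len(values) < 2 * period:
        raise ValueError("need at least two full seasonal cycles")

    trend = centered_moving_average(values, period)
    idx = [i for i, t in enumerate(trend) if t is not None]
    detrended = [values[i] - trend[i] for i in idx]

    # Seasonal component: mean detrended value per phase, centered to zero mean.
    sums = [0.0] * period
    counts = [0] * period
    for i, d in zip(idx, detrended):
        sums[i % period] += d
        counts[i % period] += 1
    phase_means = [s / c for s, c in zip(sums, counts)]
    offset = sum(phase_means) / period
    seasonal = [phase_means[i % period] - offset for i in idx]

    trend_part = [trend[i] for i in idx]
    remainder = [d - s for d, s in zip(detrended, seasonal)]

    def strength(component):
        # 1 - Var(R) / Var(component + R), clipped at 0; 0 if there is no variance.
        total = pvariance([c + r for c, r in zip(component, remainder)])
        if total == 0:
            return 0.0
        return max(0.0, 1.0 - pvariance(remainder) / total)

    return strength(trend_part), strength(seasonal)


if __name__ == "__main__":
    # Synthetic monthly-style series: linear trend plus a 12-step seasonal cycle.
    series = [0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(120)]
    ft, fs = structural_strengths(series, period=12)
    print(f"trend strength={ft:.3f}, seasonal strength={fs:.3f}")
```

Scores near 1 indicate that the component dominates the remainder, and scores near 0 indicate that it is weak or absent. Computing such scores per dataset is one way a benchmark could report structural diversity alongside raw size.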
