Patch-based Transformer models have been widely adopted and achieve state-of-the-art performance across domains that involve multi-dimensional spatiotemporal data, such as vision tasks. Recently, they have emerged as a promising alternative for multivariate time-series forecasting, where each univariate series is treated as a separate channel while sharing the same embedding and Transformer weights. In this work, we further explore the capabilities of patch-based Transformers for forecasting a single time series, focusing specifically on energy consumption prediction. Our primary interest lies in long-term forecasting, a relatively under-explored area in the literature. To this end, we evaluate Transformer-based models on two energy consumption datasets, one public and one private. We argue that leveraging patches or patching-like techniques can significantly enhance model efficiency. Lastly, we discuss the current limitations of Transformer-based architectures and propose potential solutions.
Karpontinis et al. (Fri,) studied this question.