May 17, 2026 | Open Access

Transformer-Based Models for Time Series Forecasting: Are They Better Than Classical Methods?

Key points

The study asks whether transformer-based models outperform well-tuned classical methods in time series forecasting.
It compares four transformer variants with three classical methods and an LSTM baseline on six publicly available datasets.
Performance is evaluated across short, medium, and long forecasting horizons.
Models are assessed under controlled conditions to separate genuine gains from benchmark choices and reporting conventions.
PatchTST and the Temporal Fusion Transformer show significant improvements on long-horizon and multivariate problems (p < 0.05).
Classical methods remain competitive, and in several cases superior, on short horizons, small datasets, and series dominated by seasonality and trend, especially under calibrated probabilistic metrics.
The finding that simple linear baselines can match or exceed several transformer variants is partially reproduced.

Abstract

The rapid success of transformer architectures in natural language processing has spurred a wave of research adapting them to time series forecasting, with new variants — Informer, Autoformer, FEDformer, PatchTST, and the Temporal Fusion Transformer — appearing almost every quarter. This paper asks a deliberately narrow question: under controlled conditions, do these transformer-based models actually outperform well-tuned classical forecasting methods such as ARIMA, exponential smoothing, and Prophet, or does the apparent advantage rest on benchmark choices and reporting conventions? Using six publicly available datasets spanning electricity, traffic, weather, retail demand, and finance, the study compares four transformer variants with three classical methods and an LSTM baseline across short, medium, and long forecasting horizons. Results indicate a nuanced picture. Transformer models, especially PatchTST and the Temporal Fusion Transformer, deliver meaningful improvements on long-horizon and multivariate problems with strong cross-series patterns, where attention mechanisms can exploit long-range dependencies that classical recursive models struggle to encode. On short horizons, on small datasets, and on series dominated by clear seasonality and trend, classical methods remain competitive and, in several cases, superior, especially when evaluated with calibrated probabilistic metrics rather than point error alone. The recently reported finding that simple linear baselines can match or exceed several transformer variants is partially reproduced. The paper argues that the question is not which family of models is better in the abstract, but under which conditions each approach is preferable. Implications for stock market prediction, energy and weather forecasting, demand planning, and IoT-driven analytics are discussed, along with practical recommendations for model selection and a set of open research questions.
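To make the kind of comparison described above concrete, the following is a minimal sketch of a multi-horizon evaluation, not the paper's actual protocol. It pits a simple linear baseline (a least-squares map from a lookback window to every forecast step) against a seasonal-naive forecast on a synthetic series and reports mean absolute error per horizon. The function names, the synthetic data, the lookback of 48, the period of 24, and the 70/30 split are all illustrative assumptions; none of them come from the study.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (past window, future window) pairs."""
    n = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def fit_linear(X, Y):
    """Least-squares linear map (with intercept) from past window to all horizon steps."""
    A = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def predict_linear(W, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ W

def seasonal_naive(X, horizon, period):
    """Repeat the last observed season forward; requires lookback >= period."""
    lookback = X.shape[1]
    cols = [lookback - period + (h % period) for h in range(horizon)]
    return X[:, cols]

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def compare(series, lookback, period, horizons, train_frac=0.7):
    """Fit on the first part of the series, score both baselines on the rest."""
    split = int(len(series) * train_frac)
    results = {}
    for h in horizons:
        X_tr, Y_tr = make_windows(series[:split], lookback, h)
        # Test windows may look back into the training region, but every
        # test target lies at or after the split point.
        X_te, Y_te = make_windows(series[split - lookback:], lookback, h)
        W = fit_linear(X_tr, Y_tr)
        results[h] = {
            "linear": mae(Y_te, predict_linear(W, X_te)),
            "seasonal_naive": mae(Y_te, seasonal_naive(X_te, h, period)),
        }
    return results

# Hypothetical data: trend + daily-style seasonality + noise (not from the paper).
rng = np.random.default_rng(0)
t = np.arange(1000)
series = 10 + 0.01 * t + np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.1, t.size)
for h, scores in compare(series, lookback=48, period=24, horizons=(1, 24, 96)).items():
    print(h, scores)
```

A point-error metric such as MAE is only one lens. The abstract notes that conclusions can shift under calibrated probabilistic metrics, so a fuller comparison would also score predictive distributions or quantiles.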
