What question did this study set out to answer?

This research investigates whether large pre-trained models can outperform smaller, traditional models in time series forecasting tasks.

January 14, 2026 · Open Access

Scaling transformers for time series forecasting: do pretrained large models outperform small-scale alternatives?

Key points

Empirical comparison of large-scale pre-trained models and small-scale transformers
Evaluation of forecasting accuracy and computational efficiency across multiple benchmarks
Ablation study examining different fine-tuning dataset sizes (10%, 25%, 75%)
Analysis of model explainability using feature ablation and integrated gradients
Theoretical and quantitative analysis of computational complexity, including parameter counts and training time
Pre-trained large models outperformed small-scale transformers in certain forecasting scenarios
Small models remained competitive, particularly in low-data situations
Large models showed advantages in few-shot and moderate-data adaptation under specific conditions
The analysis revealed distinct strengths and limitations for both model sizes

Abstract

Large pre-trained models have demonstrated remarkable capabilities across domains, but their effectiveness in time series forecasting relative to smaller, efficient models remains underexplored. This work empirically examines whether pre-trained large-scale time series models (LSTSMs) trained on diverse datasets can outperform traditional non-pretrained small-scale transformers in forecasting tasks. We specifically compare large models trained from scratch against those benefiting from pretraining to measure the direct impact of transfer learning on forecasting performance. We analyze state-of-the-art (SOTA) pre-trained universal time series models (e.g., Moirai, GPT4TS, Timer, CALF, LLM4TS) alongside conventional small-scale transformers, evaluating accuracy and computational efficiency across multiple benchmarks. We further conduct an extensive ablation study across varying fine-tuning data sizes (10%, 25%, and 75%) to assess few-shot, moderate, and near-full-data adaptation capabilities. Additionally, the explainability of large time series models is assessed through comprehensiveness scores computed with feature ablation, occlusion, integrated gradients, and Gradient SHAP. The interpretability of pretraining and fine-tuning strategies is examined using spectral metrics from WeightWatcher, which quantify layer-wise generalization and representation quality. Theoretical and quantitative computational complexity analyses, covering parameter counts, training time, model size, and inference latency, highlight the trade-offs between predictive performance and resource efficiency. Our findings reveal the strengths and limitations of pre-trained large-scale models, providing insights into their suitability for time series tasks compared to task-specific small-scale architectures. The results highlight scenarios where pretraining offers advantages and where simpler models remain competitive.
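The comprehensiveness measure named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy forecaster, the zero baseline, the absolute-difference score, and the function names `feature_ablation` and `comprehensiveness` are all assumptions made for illustration. The idea shown is that each time step is scored by how much the forecast changes when that step is replaced by a baseline, and comprehensiveness is then the forecast change after masking the k highest-scoring steps.

```python
import numpy as np

def feature_ablation(model, x, baseline=0.0):
    """Score each time step by the forecast change when it is replaced by a baseline."""
    reference = model(x)
    scores = np.zeros(len(x))
    for t in range(len(x)):
        ablated = x.copy()
        ablated[t] = baseline  # ablate a single time step
        scores[t] = np.abs(reference - model(ablated)).sum()
    return scores

def comprehensiveness(model, x, scores, k, baseline=0.0):
    """Forecast change after masking the k time steps with the highest scores."""
    top_k = np.argsort(scores)[::-1][:k]
    masked = x.copy()
    masked[top_k] = baseline
    return float(np.abs(model(x) - model(masked)).sum())

# Hypothetical toy forecaster (assumption): a fixed weighted sum of a 4-step window.
weights = np.array([0.1, 0.2, 0.3, 0.4])

def toy_model(window):
    return np.array([weights @ window])

window = np.array([1.0, 2.0, 3.0, 4.0])
scores = feature_ablation(toy_model, window)
comp_top2 = comprehensiveness(toy_model, window, scores, k=2)
print(scores, comp_top2)
```

A larger comprehensiveness value means the steps ranked as important really do drive the forecast. In practice a real model would replace `toy_model`, and the baseline choice (zeros, window mean, or similar) would need to suit the data.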
