Time series forecasting remains a significant challenge across domains such as finance and climate science. Existing deep learning models often emphasize sequential dependencies while neglecting causal structure, which leads to spurious correlations and reduced interpretability. To address these limitations, we propose ARCausal, a time series forecasting framework that explicitly integrates transfer entropy–based causal discovery with Transformer attention modeling. Unlike prior causal inference approaches that rely on static assumptions or expensive graph construction, ARCausal introduces a sparse causal masking mechanism derived from transfer entropy and refined through vector autoregression (VAR)-guided estimation to efficiently capture dynamic causal interactions. This causal mask suppresses non-informative dependencies and enforces a clear separation between autocorrelation and true causal effects, enabling both high predictive accuracy and interpretability. Extensive experiments on nine benchmark datasets (including ETT, Weather, and Exchange) demonstrate that ARCausal achieves strong performance, reducing the overall MSE by 4% compared to baseline methods while maintaining computational efficiency. Visualization analyses further confirm that our causal masks yield interpretable structures. The implementation is publicly available at https://github.com/jancely/ARCausal .
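To make the idea of a transfer entropy-derived attention mask concrete, the following is a minimal sketch and not the ARCausal implementation. It assumes a linear-Gaussian setting, in which transfer entropy reduces to half the log ratio of residual variances between a restricted autoregression (the target's own past) and a full one (the target's past plus the source's past). The function names, the single lag, and the top-k sparsification rule are illustrative assumptions; the VAR-guided refinement described above is only loosely approximated by the least-squares fits.

```python
import numpy as np


def gaussian_transfer_entropy(x, lag=1):
    """Estimate te[j, i] = transfer entropy from series j to series i.

    x has shape (T, N). Assumes linear-Gaussian dynamics, so TE is
    0.5 * log(restricted residual power / full residual power).
    """
    x = x - x.mean(axis=0)
    past, future = x[:-lag], x[lag:]
    n = x.shape[1]
    te = np.zeros((n, n))
    for i in range(n):
        y = future[:, i]
        # Restricted model: predict the target from its own past only.
        own = past[:, [i]]
        res_r = y - own @ np.linalg.lstsq(own, y, rcond=None)[0]
        for j in range(n):
            if j == i:
                continue
            # Full model: add the candidate source's past.
            both = past[:, [i, j]]
            res_f = y - both @ np.linalg.lstsq(both, y, rcond=None)[0]
            te[j, i] = 0.5 * np.log(np.mean(res_r**2) / np.mean(res_f**2))
    return te


def sparse_causal_mask(te, keep=1):
    """Boolean mask[i, j]: may target (query) i attend to source (key) j?

    Keeps the diagonal (each series' own history) plus the `keep`
    highest-scoring sources per target. Top-k is an assumed rule.
    """
    n = te.shape[0]
    mask = np.eye(n, dtype=bool)
    for i in range(n):
        sources = np.argsort(te[:, i])[::-1][:keep]
        mask[i, sources] = True
    return mask


def masked_attention(q, k, v, mask):
    """Scaled dot-product attention with disallowed pairs set to -inf."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Here each variable is treated as one attention token, so the mask is an N-by-N matrix over series rather than over time steps. Keeping the diagonal separately from the selected off-diagonal entries is one simple way to keep a series' own history (autocorrelation) distinct from cross-series influence.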
Zhou et al. (Fri,) studied this question.