Multivariate time series forecasting faces two key challenges: capturing intra-series temporal dependencies and inter-series spatial dependencies. However, heterogeneous cross-scale correlations and noise from unrelated series may obscure temporal patterns if spatial information is introduced too early. To address these issues, we propose the Pre-trained Multi-scale Receptance Weighted Key Value with Graph Convolutional Network (PMSRWKV-GCN), a two-stage framework that first learns clean temporal representations and then exploits spatial structure. Before the self-supervised pre-training stage, we use the Fast Fourier Transform (FFT) to extract dominant periods. Guided by this analysis, we design a multi-scale time-mixing module derived from the Receptance Weighted Key Value (RWKV) model to align receptive fields with salient periodicities. During pre-training, we adopt a channel-independent (CI) strategy to prevent cross-channel interference and learn channel-specific temporal structure. During the fine-tuning stage, a multi-scale Graph Convolutional Network (GCN) captures inter-series dependencies through scale-aware aggregation. Experiments on eight real-world datasets demonstrate that PMSRWKV-GCN achieves consistent improvements over representative baseline models. Ablation studies further confirm that CI pre-training strengthens temporal modeling, while the multi-scale GCN is critical for capturing strong spatial correlations. This stage-decoupled design reconciles the trade-off between avoiding interference and leveraging spatial dependencies, yielding accurate and stable predictions.
Hao et al. studied this question.
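The FFT-based period extraction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so the following assumes a common approach: average the amplitude spectrum over channels, discard the zero-frequency term, and convert the strongest frequency bins to periods. The function name `dominant_periods` and the parameter `k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def dominant_periods(x, k=3):
    """Return up to k dominant periods (in time steps), strongest first.

    Hypothetical helper, not the authors' implementation.
    x: array of shape (length, channels).
    """
    x = np.asarray(x, dtype=float)
    length = x.shape[0]

    # Remove the per-channel mean so the zero-frequency term does not dominate.
    x = x - x.mean(axis=0, keepdims=True)

    # Amplitude spectrum along time, averaged across channels.
    amp = np.abs(np.fft.rfft(x, axis=0)).mean(axis=1)

    # Skip bin 0 (DC); the +1 maps positions back to true frequency indices.
    freq_idx = np.argsort(amp[1:])[::-1][:k] + 1

    # A frequency index f over a window of `length` steps has period length / f.
    return np.rint(length / freq_idx).astype(int)


# Synthetic example: two channels with a strong 24-step cycle
# and a weaker 96-step cycle in the first channel.
t = np.arange(480)
series = np.stack(
    [
        np.sin(2 * np.pi * t / 24) + 0.5 * np.sin(2 * np.pi * t / 96),
        2.0 * np.cos(2 * np.pi * t / 24),
    ],
    axis=1,
)
periods = dominant_periods(series, k=2)
```

Under this assumption, the returned periods could then be used to choose the scales of a multi-scale module, so that each scale's receptive field matches one salient periodicity.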