Photovoltaic (PV) forecasting across multiple temporal horizons is essential for grid operation, yet forecasts generated independently at different resolutions lack temporal coherence. This study examines whether temporal hierarchical reconciliation improves PV forecasts across heterogeneous modeling paradigms under a unified experimental protocol. We analyze Belgian PV generation series using representative models from four families: statistical (TBATS), machine learning (LightGBM), deep learning (KAN, NHITS, NBEATSx), and foundation models (TimeGPT). Weekly, daily, and hourly forecasts are generated independently and reconciled using bottom-up, ordinary least squares, variance-based, and covariance-based approaches. Performance is evaluated through 52-week walk-forward cross-validation covering a full annual cycle. Reconciliation effects depend strongly on model family and temporal level. LightGBM is the strongest baseline model and, when combined with cross-covariance reconciliation, achieves the best performance, with average error reductions of approximately 15% across all frequencies, reaching about 40% at the weekly level and 13% at the hourly level. Deep learning and foundation models benefit primarily from variance-based reconciliation, while simpler estimators often degrade performance. TBATS exhibits limited, method-specific improvements rather than systematic gains. At the daily level, accuracy improvements are limited and model-dependent. Overall, the benefits of temporal reconciliation vary across forecasting paradigms and frequencies.

• Temporal hierarchical reconciliation yields selective and frequency-dependent improvements in Belgian PV forecasting.
• LightGBM is the strongest baseline model and achieves the best global performance after reconciliation, with cross-covariance reconciliation reducing NRMSE by approximately 15%.
• Deep learning models improve mainly under variance-based reconciliation, with global error reductions of approximately 17% for KAN, 7% for NBEATSx, and 6% for NHITS.
• Statistical and foundation models show modest global improvements of approximately 6% for TBATS and 4% for TimeGPT, without changes in their overall ranking.
• Reconciliation tends to reduce the bias factor in most cases but does not eliminate heteroskedasticity, which remains intrinsic to PV generation.
Gonzalez et al. (Sun,) studied this question.
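To make the notion of temporal coherence concrete, the following is a minimal sketch of bottom-up and ordinary least squares reconciliation for a single week of forecasts. It is an illustration and not the study's implementation: the function names `summing_matrix` and `reconcile`, the one-week hierarchy (1 weekly, 7 daily, 168 hourly values), and the random base forecasts are all assumptions introduced here. The variance-based and covariance-based methods mentioned above would replace the identity weighting with error variance or covariance estimates, which this sketch does not attempt.

```python
import numpy as np


def summing_matrix(hours_per_day=24, days_per_week=7):
    """Build S mapping hourly values to [week, days, hours] (hypothetical helper)."""
    n_hours = hours_per_day * days_per_week
    week = np.ones((1, n_hours))                                       # weekly total
    day = np.kron(np.eye(days_per_week), np.ones((1, hours_per_day)))  # daily totals
    hour = np.eye(n_hours)                                             # hourly values
    return np.vstack([week, day, hour])


def reconcile(base, S, method="ols"):
    """Return coherent forecasts from independently produced base forecasts.

    base is stacked as [weekly, daily..., hourly...], matching the rows of S.
    """
    n_bottom = S.shape[1]
    if method == "bottom_up":
        # Keep the hourly forecasts and aggregate them upward.
        return S @ base[-n_bottom:]
    if method == "ols":
        # Orthogonal projection onto the coherent subspace: S (S'S)^{-1} S' base.
        G = np.linalg.solve(S.T @ S, S.T)
        return S @ (G @ base)
    raise ValueError(f"unknown method: {method}")


# Illustrative base forecasts (random numbers, not PV data).
rng = np.random.default_rng(0)
S = summing_matrix()
base = rng.uniform(0.0, 1.0, size=S.shape[0])
coherent = reconcile(base, S, method="ols")
```

After reconciliation, the weekly value equals the sum of the seven daily values, and each daily value equals the sum of its 24 hourly values. Base forecasts that are already coherent pass through the OLS projection unchanged.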