Abstract Big data offers policy makers a wealth of real-time information, but distinguishing valuable information from noise remains a significant challenge. Using a large set of provincial data from Guangdong, China, and municipal data from Shenzhen, we find that machine learning models consistently outperform benchmark models in out-of-sample forecasts of real GDP growth, particularly in smaller regions with higher economic volatility. Next, we test the performance of the models with additional national-level data and find that dimension reduction is critical for the models to make use of additional noisy data. Finally, a model-driven approach can effectively identify critical predictive variables that policy makers can use to monitor economic conditions. However, the set of informative data may shift over time, necessitating periodic reexamination using big data.
Shi et al. studied this question.