An ARIMAX model integrating Baidu Search Index data improved hand, foot, and mouth disease incidence prediction compared to a standard ARIMA model (MAPE 0.96% vs 2.10%).
Observational (n=24,856,589)
Does an ARIMAX model integrating Baidu Search Index improve HFMD incidence prediction compared to an ARIMA model?
Integrating internet search data into an ARIMAX model improved the accuracy of forecasting hand, foot, and mouth disease incidence relative to an ARIMA model alone (MAPE 0.96% vs 2.10%).
MAPE (ARIMAX vs ARIMA): 0.96% vs 2.10%
Hand, foot and mouth disease (HFMD) constitutes a global public health concern. Internet search data offers advantages including vast data volumes, provision of real-time information, and the potential for earlier infectious disease surveillance. The objective of this study is to utilise Baidu Search Index (BSI) to construct a predictive model for HFMD epidemiological surveillance, thereby enhancing HFMD incidence forecasting and early warning capabilities. HFMD cases reported by the National Health Commission of the People’s Republic of China from January 2011 to March 2025 were collected. Keywords highly correlated with HFMD were identified using Spearman’s rank correlation and cross-correlation analysis, and a comprehensive search index (CSI) for HFMD was constructed. Subsequently, based on the monthly number of HFMD cases and the CSI, autoregressive integrated moving average (ARIMA) and autoregressive integrated moving average with exogenous inputs (ARIMAX) models were established. Finally, the predictive accuracy of the models was evaluated using mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and standardised mean absolute percentage error (SMAPE). From January 2011 to March 2025 (total of 171 months), China reported a total of 24,856,589 HFMD cases, with an average of 145,360.2 cases per month. The highest incidence of HFMD occurred between May and July each year. After correlation analysis, five keywords highly associated with HFMD were ultimately included, with a potential time lag of 0 months. The CSI of HFMD exhibits a high Spearman rank correlation (r_s = 0.937) with monthly reported HFMD cases. The developed ARIMAX(2,0,1)(1,1,1)[12] + CSI(Lag = 0) performs better in terms of fitting and prediction compared to the ARIMA(2,0,1)(1,1,1)[12].
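The keyword screening step described in the abstract (Spearman's rank correlation plus a lagged cross-correlation check) can be illustrated with a minimal sketch. Everything below is hypothetical: the toy series, the keyword names, the 0.8 threshold, the maximum lag of 3 months, and the unweighted sum used for the composite index are assumptions made for illustration, since the abstract does not state how the authors selected keywords or weighted the CSI.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's r_s is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def lagged_spearman(search, cases, lag):
    """Correlate search volume in month t with cases in month t + lag."""
    if lag == 0:
        return spearman(search, cases)
    return spearman(search[:-lag], cases[lag:])


def best_lag(search, cases, max_lag=3):
    """Return the lag (0..max_lag) with the highest r_s, plus all scores."""
    scores = {lag: lagged_spearman(search, cases, lag) for lag in range(max_lag + 1)}
    return max(scores, key=scores.get), scores


# Toy monthly data (hypothetical, not from the study).
cases = [10, 12, 30, 55, 80, 60, 35, 20, 11, 9, 8, 10]
keywords = {
    "kw_a": [3 * c + 5 for c in cases],               # tracks cases closely
    "kw_b": [5, 3, 8, 1, 9, 2, 7, 4, 6, 10, 0, 11],   # unrelated noise
    "kw_c": [c ** 2 for c in cases],                  # monotone in cases
}

THRESHOLD = 0.8  # hypothetical cut-off for "highly correlated"
selected = [name for name, series in keywords.items()
            if spearman(series, cases) >= THRESHOLD]

# Composite index as an unweighted sum of the selected keyword series (assumption).
csi = [sum(month) for month in zip(*(keywords[name] for name in selected))]
lag, lag_scores = best_lag(csi, cases)
print(selected, lag, round(lag_scores[lag], 3))
```

In this toy setup the composite index is built only from the keywords that pass the threshold, and the lag search mirrors the abstract's finding that the search index and case counts line up best at a lag of 0 months.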
The MAE values for ARIMAX and ARIMA are 42,085.93 and 80,260.93, respectively; the RMSE values are 52,235.39 and 98,444.62, respectively; the MAPE values are 0.96% and 2.10%, respectively; and the SMAPE values are 0.86% and 0.82%, respectively. The ARIMAX(2,0,1)(1,1,1)[12] + CSI(Lag = 0) model constructed from extensive internet search data in this study has effectively enhanced the prediction of HFMD incidence. Furthermore, it can serve as an early warning system for HFMD. This research provides valuable support for HFMD surveillance.
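For readers who want to see how the four accuracy metrics are computed, here is a minimal sketch using their common textbook definitions. The abstract does not give the authors' formulas, so the SMAPE variant below (the widely used symmetric form, with the mean of |actual| and |predicted| in the denominator) is an assumption, and the toy numbers are hypothetical rather than taken from the study.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE, in percent (assumed variant: 2|a - p| / (|a| + |p|))."""
    return 100 * sum(
        2 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted)
    ) / len(actual)


# Toy hold-out data (hypothetical): monthly case counts and model forecasts.
actual = [100, 200, 400]
predicted = [110, 180, 400]

for name, fn in [("MAE", mae), ("RMSE", rmse), ("MAPE", mape), ("SMAPE", smape)]:
    print(f"{name}: {fn(actual, predicted):.4f}")
```

MAE and RMSE are in units of cases, while MAPE and SMAPE are scale-free percentages, which is why a model can rank differently across the four metrics.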
Chen et al. conducted an observational study of hand, foot and mouth disease (HFMD) (n=24,856,589 reported cases). An ARIMAX model integrating the Baidu Search Index was compared with an ARIMA model on predictive accuracy (mean absolute percentage error). The ARIMAX model improved HFMD incidence prediction relative to the standard ARIMA model (MAPE 0.96% vs 2.10%).