Brucellosis poses a persistent threat to livestock health in high-altitude pastoral regions of China, where harsh environments and semi-nomadic grazing increase transmission risk. Existing surveillance systems rely mainly on periodic serological testing and lack effective early warning capability. This study proposes an ensemble learning-based early warning framework that integrates veterinary epidemiological indicators with environmental and herd-movement data. A total of 4826 herd-level records collected over five years (2019–2024) were analyzed, with an overall positivity rate of 11.4%. Multi-source data, including serological, clinical, reproductive, vaccination, meteorological, pasture-management, and herd-movement information (from GPS tracking and structured surveys), were integrated through epidemiology-guided feature engineering. Synthetic Minority Over-sampling Technique (SMOTE) resampling was applied to address class imbalance, and sliding time-window features were used to capture temporal dynamics. The proposed ensemble combines Random Forest, XGBoost, and LightGBM through soft voting; logistic regression serves as the baseline for comparison. The ensemble outperforms the individual models, achieving an AUC of 0.86 and a PR-AUC of 0.65. After threshold optimization, sensitivity increased from 0.78 to 0.87. Under field conditions, the system provided herd-level early warnings with an average lead time of approximately 12 days before confirmed outbreaks, demonstrating its feasibility and practical value for proactive brucellosis surveillance in high-altitude pastoral systems.
Xi et al. (Mon,) studied this question.
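The soft-voting and threshold-optimization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three probability lists stand in for the outputs of Random Forest, XGBoost, and LightGBM, all numbers are invented toy values rather than study data, and the function names (`soft_vote`, `sensitivity`, `pick_threshold`) are hypothetical. The abstract does not say how the threshold was optimized, so the rule used here (the highest threshold that still reaches a target sensitivity) is an assumption.

```python
from statistics import fmean


def soft_vote(prob_by_model):
    """Soft voting: average the positive-class probabilities of all models, herd by herd."""
    return [fmean(herd_probs) for herd_probs in zip(*prob_by_model)]


def sensitivity(y_true, scores, threshold):
    """Fraction of truly positive herds whose score is at or above the threshold."""
    positives = sum(y_true)
    hits = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    return hits / positives if positives else 0.0


def pick_threshold(y_true, scores, target_sensitivity):
    """Assumed rule: highest candidate threshold that meets the target sensitivity."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity(y_true, scores, t) >= target_sensitivity:
            return t
    return min(scores)


# Toy labels (1 = herd later confirmed positive) and toy per-model probabilities.
y_true = [1, 0, 1, 0, 0, 1, 0, 1]
rf_probs = [0.9, 0.2, 0.4, 0.1, 0.3, 0.7, 0.6, 0.3]
xgb_probs = [0.8, 0.1, 0.5, 0.2, 0.4, 0.6, 0.5, 0.5]
lgbm_probs = [0.7, 0.3, 0.3, 0.3, 0.2, 0.8, 0.4, 0.4]

ensemble_scores = soft_vote([rf_probs, xgb_probs, lgbm_probs])

# Sensitivity at the default 0.5 cut-off versus at the tuned cut-off.
default_sens = sensitivity(y_true, ensemble_scores, 0.5)
tuned_threshold = pick_threshold(y_true, ensemble_scores, target_sensitivity=0.75)
tuned_sens = sensitivity(y_true, ensemble_scores, tuned_threshold)

print(f"default sensitivity: {default_sens:.2f}")
print(f"tuned threshold: {tuned_threshold:.2f}, sensitivity: {tuned_sens:.2f}")
```

Lowering the decision threshold raises sensitivity at the cost of more false alarms, which is the usual trade-off in an early warning setting where missed outbreaks are costlier than extra herd checks. In practice the threshold would be chosen on held-out validation data, not on the data used to report performance.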