Air pollution caused by fine particulate matter (PM2.5) poses a serious public health threat in many South Asian megacities where monitoring networks remain limited. Lahore, Pakistan, frequently ranked among the world’s most polluted cities, still lacks reliable short-term PM2.5 forecasting systems. This study develops a performance-weighted ensemble machine learning framework that integrates satellite observations, meteorological reanalysis data, and ground monitoring measurements to improve daily PM2.5 prediction. Eleven predictor variables were processed using a unified Google Earth Engine pipeline, including MODIS aerosol optical depth, Sentinel-5P trace gases (CO, NO₂, SO₂), and ERA5 meteorological parameters. Four tree-based machine learning algorithms (Random Forest, XGBoost, LightGBM, and CatBoost) were trained using daily observations from 2019 to 2023. Model evaluation using an independent 2024 dataset showed strong predictive capability, with Random Forest achieving R² = 0.77 (RMSE = 24.75 µg m⁻³), XGBoost R² = 0.76 (RMSE = 26.32 µg m⁻³), CatBoost R² = 0.73 (RMSE = 30.39 µg m⁻³), and LightGBM R² = 0.70 (RMSE = 32.75 µg m⁻³). To further enhance performance, the best models were combined into a weighted ensemble (RF 0.5, XGBoost 0.3, and CatBoost 0.2), which produced the highest validation accuracy (R² = 0.77; RMSE = 23.37 µg m⁻³). Statistical testing using paired t-tests and Diebold–Mariano tests confirmed that the ensemble significantly reduced forecast errors compared with individual models. Feature importance analysis revealed that surface pressure, temperature, CO, and NO₂ were the most influential predictors of PM2.5 variability. The proposed framework demonstrates that combining satellite data, reanalysis meteorology, and ground observations through ensemble learning can provide accurate and scalable air quality forecasting for data-limited urban environments.
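The weighted combination and the two reported metrics can be illustrated with a minimal sketch. It assumes the ensemble forecast is a plain weighted average of the member predictions, using the weights stated in the abstract (RF 0.5, XGBoost 0.3, CatBoost 0.2); the abstract does not say how those weights were derived. The function names, dictionary keys, and numeric values below are hypothetical and are not taken from the study.

```python
from math import sqrt

# Ensemble weights as reported in the abstract (keys are illustrative labels).
WEIGHTS = {"rf": 0.5, "xgb": 0.3, "cat": 0.2}


def weighted_ensemble(predictions, weights=WEIGHTS):
    """Combine per-model daily predictions into one weighted-average forecast."""
    total = sum(weights.values())  # normalise in case weights do not sum to 1
    n_days = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in weights) / total
        for i in range(n_days)
    ]


def rmse(observed, predicted):
    """Root mean squared error, in the units of the inputs."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up daily PM2.5 values (µg m^-3) for illustration only, not study data.
observed = [100.0, 150.0, 200.0]
predictions = {
    "rf": [110.0, 140.0, 190.0],
    "xgb": [90.0, 160.0, 210.0],
    "cat": [100.0, 150.0, 220.0],
}

ensemble = weighted_ensemble(predictions)
print("ensemble:", ensemble)
print("RMSE:", rmse(observed, ensemble))
print("R2:", r_squared(observed, ensemble))
```

In this toy case the members err in different directions, so their errors partly cancel in the average. This is the general motivation for ensembling, though it says nothing about the size of the gain reported in the study.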
Haseeb et al. studied this question.
synapsesocial.com/papers/69e865d76e0dea528ddea3d6 — DOI: https://doi.org/10.3390/atmos17040411
Muhammad Haseeb
University of the Punjab
Zainab Tahir
University of the Punjab
Syed Amer Mehmood
University of the Punjab
Atmosphere
Centre National de la Recherche Scientifique
Aix-Marseille Université
Institut de Recherche pour le Développement