Accurate prediction, forecasting and interpretation of air pollutant concentrations are important for sustainable environmental management and for protecting public health. An integrated artificial intelligence (AI) framework is proposed to predict, forecast and analyse six major air pollutants, namely particulate matter (PM2.5 and PM10), ground-level ozone (O3), carbon monoxide (CO), nitrogen dioxide (NO2), and sulphur dioxide (SO2), using a combination of ensemble and deep learning models. Five years of hourly air quality and meteorological data are analysed through correlation and Granger causality tests to uncover pollutant interdependencies and driving factors. The Pearson correlation analysis reveals strong positive associations among primary pollutants (PM2.5–PM10, CO–nitrogen oxides (NOx) and volatile organic compounds (VOCs)) and inverse correlations between O3 and NOx (NO and NO2), confirming typical photochemical behaviour. The Granger causality analysis further identifies NO2 and NO as key causal drivers of other pollutants, particularly of O3 formation. Among the 23 AI models tested for prediction, XGBoost, Random Forest, and Convolutional Neural Networks (CNNs) achieve the best performance for different pollutants. NO2 prediction using CNNs displays the highest accuracy in testing (R² = 0.999, RMSE = 0.66 µg/m³), followed by PM2.5 and PM10 with XGBoost (R² = 0.90 and 0.79 during testing, respectively). The Air Quality Index (AQI) analysis shows that SO2 and PM10 are the dominant contributors to poor air quality episodes, while ozone peaks occur during warm, high-radiation periods. The interpretability analysis based on SHapley Additive exPlanations (SHAP) highlights the key influence of relative humidity, temperature, solar brightness, and NOx species on pollutant concentrations, confirming their meteorological and chemical relevance. Finally, a deep-NARMAX model is applied to forecast future horizons for the six air pollutants studied.
Six forecasting equations are derived using input data at times (t, t − 1, t − 2, …, t − n) to forecast the concentration at hour (t + 1) for single-step forecasting. For multi-step forecasting, the forecast is extended iteratively to (t + 2) hours and beyond. A recursive strategy is adopted for this purpose, whereby the forecast at (t + 1) is fed back as an input to generate the forecast at (t + 2), and so forth. Overall, this integrated framework combines predictive accuracy with physical interpretability, offering a powerful data-driven tool for air quality assessment and policy support. The approach can be extended to real-time applications in sustainable environmental monitoring and decision-making systems.
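The recursive multi-step strategy described above can be illustrated with a minimal sketch. It is not the deep-NARMAX model of the paper: a plain linear autoregressive model fitted by least squares stands in for the one-step forecaster, and the function names `fit_ar` and `recursive_forecast` are hypothetical, introduced only for this illustration. What the sketch shows is the feedback loop, in which each forecast is appended to the lag window and reused as an input for the next step.

```python
import numpy as np


def fit_ar(series, n_lags):
    """Fit a linear autoregressive one-step model by least squares.

    Assumption: this simple model is only a stand-in for the paper's
    deep-NARMAX forecaster. Returns n_lags lag weights plus an intercept.
    """
    y = np.asarray(series, dtype=float)
    # The row for target y[t] holds [y[t-1], ..., y[t-n_lags], 1].
    rows = [np.append(y[t - n_lags:t][::-1], 1.0) for t in range(n_lags, len(y))]
    X = np.array(rows)
    coef, *_ = np.linalg.lstsq(X, y[n_lags:], rcond=None)
    return coef


def recursive_forecast(history, coef, steps):
    """Forecast `steps` values ahead, feeding each forecast back as an input."""
    n_lags = len(coef) - 1
    # Lag window in chronological order (oldest first).
    window = list(np.asarray(history, dtype=float)[-n_lags:])
    forecasts = []
    for _ in range(steps):
        lags = np.array(window[::-1])  # most recent value first, as in fit_ar
        y_next = float(lags @ coef[:-1] + coef[-1])
        forecasts.append(y_next)
        # Drop the oldest value and append the new forecast (t + 1 -> input for t + 2).
        window = window[1:] + [y_next]
    return forecasts
```

With an hourly concentration series `y`, a call such as `recursive_forecast(y, fit_ar(y, n_lags=3), steps=6)` would produce six successive hourly forecasts. Because every step after the first consumes earlier forecasts rather than observations, errors can accumulate as the horizon grows, which is the usual trade-off of the recursive strategy.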
Youness El Mghouchi
Mihaela Tinca Udristioiu
Sustainability
University of Craiova
Mghouchi et al. (Sun,) studied this question.
www.synapsesocial.com/papers/698434c0f1d9ada3c1fb3509 — DOI: https://doi.org/10.3390/su18031457