ABSTRACT Air pollution poses a significant threat to both environmental and public health, underscoring the need for automated and accurate air quality monitoring systems. This study uses five Machine Learning (ML) algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Gaussian Naïve Bayes (GNB), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), to classify Air Quality Index (AQI) levels. The analysis is based on a dataset comprising 29,531 records and 16 pollutant-related features, including City, Date, PM2.5, PM10, NO, NO2, NOx, NH3, CO, SO2, O3, Benzene, Toluene, Xylene, AQI, and AQIBucket. Among the models, RF demonstrated superior performance, achieving 99% accuracy, precision, recall, and F1-score, with low inference time (42 s) and minimal memory usage (17 MB). To enhance both accuracy and interpretability, two feature selection approaches were employed. A t-test identified statistically significant differences in pollutant concentrations between high and low AQI groups. In parallel, two Explainable AI (XAI) methods, Local Interpretable Model-Agnostic Explanations (LIME) and Shapley Additive Explanations (SHAP), were applied to interpret the models’ decision-making processes. These techniques consistently highlighted PM2.5, PM10, NO2, and NOx as the most critical features influencing AQI predictions. Model robustness was validated using 10-fold cross-validation, while a paired t-test confirmed the statistical significance of RF's superior performance. The integrated approach not only achieves high classification accuracy but also provides meaningful insights into the factors driving air pollution, thus supporting more transparent and reliable air quality assessment systems.
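The paired t-test on cross-validation results mentioned in the abstract can be illustrated with a minimal sketch. It assumes two models have been scored on the same 10 folds; the function name `paired_t_statistic` and the per-fold accuracies are hypothetical and are not taken from the study, whose exact procedure the abstract does not describe.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched fold scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same folds")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two folds")
    # Work on per-fold differences, since the folds are matched pairs.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical per-fold accuracies, for illustration only (not from the study).
rf_acc = [0.99, 0.98, 0.99, 0.99, 0.98, 0.99, 0.99, 0.98, 0.99, 0.99]
other_acc = [0.95, 0.94, 0.96, 0.95, 0.93, 0.95, 0.96, 0.94, 0.95, 0.94]

t_stat, dof = paired_t_statistic(rf_acc, other_acc)
print(f"t = {t_stat:.2f}, df = {dof}")
```

With 10 folds there are 9 degrees of freedom, so a two-sided test at the 0.05 level would compare |t| against a critical value of about 2.262. Fold scores from cross-validation share training data and are therefore not fully independent, so a result like this is better read as supporting evidence than as a strict guarantee.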
Farida Siddiqi Prity
Iftikhar Arefin
Mirza Raquib
Environmental Quality Management
International Islamic University Chittagong
Kona Medical (United States)
Shanto-Mariam University of Creative Technology
Prity et al. studied this question.
www.synapsesocial.com/papers/69ec5b2388ba6daa22daca74 — DOI: https://doi.org/10.1002/tqem.70355