What question did this study set out to answer?

February 26, 2026 · Open Access

Using machine learning algorithms to study the relationship between meteorological conditions and air quality parameters

Key Points

The research aims to quantify the relationship between meteorological conditions and air quality parameters using machine learning approaches.
Analyzed five years of observational data (2017–2021) on meteorological parameters and air pollutants.
Employed four machine learning algorithms: Neural Networks, Decision Trees, Random Forests, and Gradient Boosting.
Evaluated model performance using metrics such as mean squared error, root mean squared error, and the coefficient of determination.
The GB model achieved the highest predictive accuracy for NO2 (R2 ≈ 0.83), with humidity and dew point among the dominant predictors.
Moderate predictive performance for CO with R2 ≈ 0.46, indicating a mix of meteorological and emission effects.
PM10 showed weak correlations with meteorological variables and was influenced more by episodic dust events.

Abstract

Air quality degradation poses significant risks to human health and ecosystems, particularly in rapidly urbanizing and industrialized arid regions. Meteorological conditions strongly influence the formation, transport, and dispersion of air pollutants, yet their relationships are highly nonlinear and difficult to quantify using conventional statistical approaches. This study investigates the influence of meteorological parameters on key air pollutants in the Eastern Region of Saudi Arabia using machine learning techniques. Five years of observational data (2017–2021), including temperature, humidity, wind speed, wind direction, dew point, and atmospheric pressure, were analyzed alongside concentrations of nitrogen dioxide (NO2), carbon monoxide (CO), and particulate matter (PM10). Four machine learning algorithms, namely Neural Networks (NN), Decision Trees (DT), Random Forests (RF), and Gradient Boosting (GB), were evaluated using standard performance metrics: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R2). The results indicate that meteorological parameters exert pollutant-specific influences. The GB model achieved the highest predictive accuracy for NO2 (R2 ≈ 0.83), highlighting the dominant role of humidity, dew point, and seasonal variation. Moderate predictive performance was observed for CO (R2 ≈ 0.46), suggesting a combined influence of meteorology and emission-driven processes. In contrast, PM10 exhibited weak correlations with meteorological variables, reflecting the dominance of episodic dust events and non-meteorological factors in arid environments. These findings demonstrate the effectiveness of ensemble machine learning models in capturing nonlinear meteorological-pollutant relationships. The study provides valuable insights for air quality forecasting and supports data-driven environmental management in arid and semi-arid regions.
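
For readers unfamiliar with the four metrics named in the abstract, the following is a minimal sketch of how MSE, RMSE, MAE, and R2 are conventionally computed from paired observed and predicted values. It is not the authors' code: the function name `regression_metrics` and the sample concentrations are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE, and R2 for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Residual sum of squares, shared by MSE and R2.
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n

    # Total sum of squares around the observed mean.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R2 is undefined when the observations have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Hypothetical pollutant concentrations (illustrative, not from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

Under these definitions, an R2 near 0.83 means the model's residual variance is roughly 17% of the variance in the observed concentrations, while an R2 near 0.46 leaves a little over half of that variance unexplained.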
