High-quality data are foundational to reliable environmental monitoring and urban planning in smart cities, yet missing values and outliers in air pollution and meteorological time series remain critical barriers. This study developed and validated a dual-phase framework for improving data quality, using a 60-month gas and weather dataset from Jubail Industrial City, Saudi Arabia. In the first phase, outliers were identified with statistical methods (Interquartile Range and Z-Score) and with machine learning algorithms (Isolation Forest and Local Outlier Factor), the latter chosen for their robustness to non-normal data distributions; this step significantly improved subsequent imputation accuracy. In the second phase, missing values in both single and sequential gaps were imputed using linear interpolation, Piecewise Cubic Hermite Interpolating Polynomial (PCHIP), and Akima interpolation. Linear interpolation performed best for short gaps (R² up to 0.97), while PCHIP and Akima minimized errors in sequential gaps (R² up to 0.95, lowest MSE). By matching the imputation method to the gap characteristics, the framework handles real-world data complexities and significantly improves time series consistency and reliability, offering a replicable model for smart cities worldwide.
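The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the synthetic hourly series, and the thresholds (1.5 for the IQR fences, 3.0 for the Z-score) are assumptions chosen for illustration, and the machine learning detectors (Isolation Forest, Local Outlier Factor) are left out to keep the example short. It relies only on NumPy and SciPy's `PchipInterpolator` and `Akima1DInterpolator`.

```python
import numpy as np
from scipy.interpolate import Akima1DInterpolator, PchipInterpolator


def iqr_outliers(x, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]. NaNs are never flagged."""
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def zscore_outliers(x, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = np.nanmean(x), np.nanstd(x)
    return np.abs(x - mu) > threshold * sigma


def fill_gaps(t, x, method="linear"):
    """Impute NaNs in x (sampled at times t) from the surrounding known points.

    Only interior gaps are handled here: Akima returns NaN outside the range
    of the known samples, while linear and PCHIP would extrapolate or clamp.
    """
    known = ~np.isnan(x)
    if method == "linear":
        filled = np.interp(t, t[known], x[known])
    elif method == "pchip":
        filled = PchipInterpolator(t[known], x[known])(t)
    elif method == "akima":
        filled = Akima1DInterpolator(t[known], x[known])(t)
    else:
        raise ValueError(f"unknown method: {method}")
    # Keep observed values untouched; use interpolated values only in gaps.
    return np.where(known, x, filled)


def mse(truth, estimate):
    """Mean squared error between two equally sized arrays."""
    return float(np.mean((truth - estimate) ** 2))


if __name__ == "__main__":
    # Synthetic hourly series with a daily cycle (illustrative data only).
    t = np.arange(48.0)
    truth = 20.0 + 5.0 * np.sin(2.0 * np.pi * t / 24.0)

    # Phase 1: inject one spike, detect it, and mark it as missing.
    observed = truth.copy()
    observed[10] = 200.0
    flagged = iqr_outliers(observed) | zscore_outliers(observed)
    observed[flagged] = np.nan

    # Phase 2: add a sequential gap, then compare imputation methods.
    observed[30:34] = np.nan
    gap = np.isnan(observed)
    for method in ("linear", "pchip", "akima"):
        filled = fill_gaps(t, observed, method)
        print(method, "MSE on gaps:", mse(truth[gap], filled[gap]))
```

On real sensor data the comparison would be run by masking known values, imputing them, and scoring against the held-out truth, which is one way to decide per gap length which method to apply.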
Ali Suliman AlSalehy
Oregon State University
Mike Bailey
Oregon State University
Smart Cities
Oregon State University
Jubail Industrial College
AlSalehy et al. (Wed,) studied this question.
synapsesocial.com/papers/6a12b24bf7bd4f5c7da6bfab — DOI: https://doi.org/10.3390/smartcities8030082