Spatial data has distinctive properties that differentiate it from non-spatial data. One prominent characteristic is spatial autocorrelation (SA). When machine learning techniques are applied for spatial data modeling, they require spatially explicit consideration. If these inherent spatial structures are ignored, models may produce biased predictions. However, integrating this property into the model yields additional spatial insight, thereby enhancing learning and improving predictive accuracy. This study examines spatially explicit K-nearest neighbors (SE-KNN) by integrating SA as a spatially explicit property, λ, into the learning algorithm. The innovation of SE-KNN lies in its alignment with the principle of spatial autocorrelation, as KNN’s core learning assumption—that similar observations tend to have similar outcomes—naturally parallels spatial dependence. The proposed SE-KNN is applied to a house price prediction model using house sales data from Franklin County, Ohio to demonstrate a spatially dependent, data-rich, and real-world problem. The results show that SE-KNN achieved the best prediction accuracy compared to mean of absolute error (MAE) of three other machine learning approaches (i.e., standard KNN, linear regression, and artificial neural networks). The proposed method effectively captures the spatial structures in the housing market and leaves only a trace amount of SA in the residuals.
Chen et al. (Wed,) studied this question.