November 30, 2025Open Access

Compactly‐Supported Nonstationary Kernels for Computing Exact Gaussian Processes on Big Data

Key Points

Massive data set analysis yields improved performance using a novel Bayesian model with nonstationary kernels, outperforming existing methods.
Experimental results demonstrate enhancements in accuracy when applying these kernels to temperature data from over 1 million points.
Assessment of synthetic data showcases favorable performance in the Gaussian process framework with new kernels and high-performance computing support.
Implications suggest that scalable, exact GPs could compete with prevalent machine learning approaches in extensive data applications.

Abstract

ABSTRACT The Gaussian process (GP) is a widely used method for analyzing large‐scale data sets, including spatio‐temporal measurements of nonlinear processes that are now commonplace in the environmental sciences. Traditional implementations of GPs involve stationary kernels (also termed covariance functions) that limit their flexibility, and exact methods for inference that prevent application to data sets with more than about 10,000 points. Modern approaches to address stationarity assumptions generally fail to accommodate large data sets, while all attempts to address scalability focus on approximating the Gaussian likelihood, which can involve subjectivity and lead to inaccuracies. In this work, we explicitly derive an alternative kernel that can discover and encode both sparsity and nonstationarity. We embed the kernel within a fully Bayesian GP model and leverage high‐performance computing resources to enable the analysis of massive data sets. We demonstrate the favorable performance of our novel kernel relative to existing exact and approximate GP methods across a variety of synthetic data examples. Furthermore, we conduct space–time prediction based on more than 1 million measurements of daily maximum temperature and verify that our results outperform state‐of‐the‐art methods in the Earth sciences. More broadly, having access to exact GPs that use ultra‐scalable, sparsity‐discovering, nonstationary kernels allows GP methods to truly compete with a wide variety of machine learning methods.

Read Full Paperexternally

Bookmark

View Full Paper

Cite This Study

Risser et al. (Tue,) studied this question.

synapsesocial.com/papers/692b94261d383f2b2a37837a https://doi.org/https://doi.org/10.1002/env.70054

Also Consider

Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:

Bookmark

View Full Paper