New Algorithm Makes Complex Spatial Models Lightning Fast with Vecchia Magic
Authors:
(1) Mohamed A. Abba, Department of Statistics, North Carolina State University;
(2) Brian J. Reich, Department of Statistics, North Carolina State University;
(3) Reetam Majumder, Southeast Climate Adaptation Science Center, North Carolina State University;
(4) Brandon Feng, Department of Statistics, North Carolina State University.
Table of Links
Abstract and 1 Introduction
1.1 Methods to handle large spatial datasets
1.2 Review of stochastic gradient methods
2 Matérn Gaussian Process Model and its Approximations
2.1 The Vecchia approximation
3 The SG-MCMC Algorithm and 3.1 SG Langevin Dynamics
3.2 Derivation of gradients and Fisher information for SGRLD
4 Simulation Study and 4.1 Data generation
4.2 Competing methods and metrics
4.3 Results
5 Analysis of Global Ocean Temperature Data
6 Discussion, Acknowledgements, and References
Appendix A.1: Computational Details
Appendix A.2: Additional Results
2 Matérn Gaussian Process Model and its Approximations
The full log-likelihood then becomes
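For reference, if the response is assumed to be marginally Gaussian, Y ~ N(Xβ, Σ(θ)), the log-likelihood takes the standard multivariate normal form below. The design matrix X and the covariance matrix Σ(θ) are notation assumed here for illustration.

```latex
\ell(\beta, \theta)
  = -\frac{n}{2}\log(2\pi)
    - \frac{1}{2}\log\left|\Sigma(\theta)\right|
    - \frac{1}{2}\,(Y - X\beta)^{\top}\,\Sigma(\theta)^{-1}\,(Y - X\beta)
```

For a dense n × n covariance matrix, evaluating the log-determinant and the quadratic form costs on the order of n³ operations, which is the bottleneck that motivates approximations for large n.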
2.1 The Vecchia approximation
For any set of spatial locations, the joint distribution of Y can be written as a product of univariate conditional distributions, which can then be approximated by a Vecchia approximation (Vecchia, 1988; Stein et al., 2004; Datta et al., 2016; Katzfuss and Guinness, 2021):
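In its generic form, the factorization and its Vecchia approximation can be sketched as follows. The notation N(i) for the conditioning set of observation i and the bound m on its size are assumptions made for this illustration.

```latex
p(Y_1, \dots, Y_n)
  = \prod_{i=1}^{n} p\left(Y_i \mid Y_1, \dots, Y_{i-1}\right)
  \approx \prod_{i=1}^{n} p\left(Y_i \mid Y_{N(i)}\right),
\qquad N(i) \subseteq \{1, \dots, i-1\}, \quad |N(i)| \le m
```

The first equality is the exact chain rule for a given ordering of the locations. The approximation replaces each full conditioning set with a small subset, typically the nearest previously ordered neighbors.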
Let p(β, θ) be the prior distribution on the regression and covariance parameters. Using (5) we can write the posterior as (ignoring a constant that does not depend on the parameters)
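A generic sketch of this posterior, under the assumption that each observation Y_i is conditioned on a subset Y_{N(i)} of at most m previously ordered observations (N(i) is notation assumed here), is:

```latex
p(\beta, \theta \mid Y)
  \;\propto\; p(\beta, \theta) \prod_{i=1}^{n} p\left(Y_i \mid Y_{N(i)}, \beta, \theta\right),
\qquad
\log p(\beta, \theta \mid Y)
  = \log p(\beta, \theta)
    + \sum_{i=1}^{n} \log p\left(Y_i \mid Y_{N(i)}, \beta, \theta\right)
    + \text{const}
```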
Hence the log-likelihood and log-posterior of the parameters {β, θ} can be written as a sum of conditional normal log-densities, where the conditioning set is at most of size m. The cost of computing the log-posterior in (6) is linear in n and cubic in m.
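The sum-of-conditionals structure can be illustrated with a minimal NumPy sketch. It assumes an exponential covariance (a Matérn with smoothness 1/2), a nugget variance `tau2`, and conditioning sets built from the m nearest previously ordered points; the function names and parameter values are hypothetical and chosen only for illustration. Each term needs one solve against a matrix of size at most m × m, which is where the linear-in-n, cubic-in-m cost comes from.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, rho=0.3):
    """Exponential covariance (Matern, smoothness 1/2) between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / rho)

def full_loglik(y, X, beta, s, tau2=0.1):
    """Exact Gaussian log-likelihood; cubic cost in n."""
    n = len(y)
    Sigma = exp_cov(s, s) + tau2 * np.eye(n)
    r = y - X @ beta
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(Sigma, r))

def vecchia_loglik(y, X, beta, s, m, tau2=0.1):
    """Sum of univariate conditional normal log-densities (illustrative sketch)."""
    n = len(y)
    r = y - X @ beta
    total = 0.0
    for i in range(n):
        # Conditioning set: at most m nearest points among those ordered before i.
        prev = np.arange(i)
        if i > m:
            d = np.linalg.norm(s[prev] - s[i], axis=1)
            prev = prev[np.argsort(d)[:m]]
        var = exp_cov(s[i:i + 1], s[i:i + 1])[0, 0] + tau2
        mean = 0.0
        if prev.size:
            C_nn = exp_cov(s[prev], s[prev]) + tau2 * np.eye(prev.size)
            c_in = exp_cov(s[i:i + 1], s[prev])[0]
            b = np.linalg.solve(C_nn, c_in)  # solve of size at most m x m
            mean = b @ r[prev]               # conditional mean of the residual
            var = var - b @ c_in             # conditional variance
        total += -0.5 * (np.log(2 * np.pi * var) + (r[i] - mean) ** 2 / var)
    return total

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n = 60
s = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), s[:, 0]])
beta = np.array([1.0, -0.5])
Sigma = exp_cov(s, s) + 0.1 * np.eye(n)
y = X @ beta + np.linalg.cholesky(Sigma) @ rng.standard_normal(n)

exact = full_loglik(y, X, beta, s)
vecchia_all = vecchia_loglik(y, X, beta, s, m=n - 1)  # conditions on all previous points
vecchia_10 = vecchia_loglik(y, X, beta, s, m=10)      # truncated conditioning sets
```

With m = n - 1 every observation conditions on all earlier ones, so the sum reproduces the exact log-likelihood through the chain rule; smaller m trades accuracy for speed.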
Using (8), we can construct an unbiased estimate of the gradient of the Vecchia log-posterior based on a minibatch of the data:
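The unbiasedness argument is simple to check numerically. The sketch below assumes the usual estimator form: the prior gradient plus the sum of per-term gradients over the minibatch, rescaled by n divided by the batch size. The per-term gradients are random stand-ins rather than actual Vecchia gradients, and `minibatch_grad` is a hypothetical helper name. Averaging the estimator over every possible minibatch of a fixed size recovers the full gradient.

```python
import itertools
import numpy as np

def minibatch_grad(term_grads, prior_grad, batch):
    """Prior gradient plus rescaled sum of the per-term gradients in the batch."""
    n = term_grads.shape[0]
    return prior_grad + (n / len(batch)) * term_grads[list(batch)].sum(axis=0)

rng = np.random.default_rng(1)
n, p, batch_size = 8, 3, 3
term_grads = rng.standard_normal((n, p))  # stand-ins for gradients of each conditional term
prior_grad = rng.standard_normal(p)       # stand-in for the gradient of the log-prior
full_grad = prior_grad + term_grads.sum(axis=0)

# Enumerate all minibatches of the given size (sampling without replacement).
estimates = [
    minibatch_grad(term_grads, prior_grad, batch)
    for batch in itertools.combinations(range(n), batch_size)
]
mean_estimate = np.mean(estimates, axis=0)
```

Each index appears in a fraction `batch_size / n` of the minibatches, so the rescaling factor `n / batch_size` cancels it exactly and `mean_estimate` matches `full_grad` up to floating-point error.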
3 The SG-MCMC Algorithm
In this section we first review the general SG Langevin dynamics method and then present the proposed algorithm based on the Vecchia approximation.
3.1 SG Langevin Dynamics
To ensure convergence to the true posterior, the step sizes $\epsilon_t$ must satisfy $\sum_{t=1}^{\infty} \epsilon_t = \infty$ and $\sum_{t=1}^{\infty} \epsilon_t^2 < \infty$.
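As a minimal sketch of plain SG Langevin dynamics (without the Fisher-information preconditioning used by SGRLD), the toy example below samples the posterior of a normal mean. It assumes a polynomially decaying schedule a(b + t)^(-γ) with γ in (0.5, 1], a common choice whose step sizes sum to infinity while their squares have a finite sum. The model, prior variance, and constants are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, batch_size = 1000, 50
y = rng.normal(loc=2.0, scale=1.0, size=n)  # toy data: y_i ~ N(theta, 1)
prior_var = 100.0                           # assumed prior: theta ~ N(0, 100)

def step_size(t, a=2e-3, b=1.0, gamma=0.55):
    """Decaying step size a * (b + t)^(-gamma), with gamma in (0.5, 1]."""
    return a * (b + t) ** (-gamma)

def grad_log_post_estimate(theta, batch):
    """Unbiased minibatch estimate of the log-posterior gradient."""
    grad_prior = -theta / prior_var
    grad_lik = (n / len(batch)) * np.sum(y[batch] - theta)
    return grad_prior + grad_lik

theta = 0.0
samples = []
for t in range(5000):
    eps = step_size(t)
    batch = rng.choice(n, size=batch_size, replace=False)
    # Langevin update: half-step along the gradient plus N(0, eps) noise.
    theta += 0.5 * eps * grad_log_post_estimate(theta, batch)
    theta += np.sqrt(eps) * rng.standard_normal()
    samples.append(theta)

samples = np.array(samples[1000:])          # discard burn-in
post_mean = y.sum() / (n + 1.0 / prior_var) # closed-form posterior mean for comparison
```

Because the model is conjugate, the closed-form posterior mean gives a direct check on the sampler output.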
3.2 Derivation of gradients and Fisher information for SGRLD
In order to compute the log-likelihood, we need the following quantities
3.2.1 Mean parameters
The gradient of the minibatch log-likelihood with respect to the mean parameters β is
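One way to sanity-check such a gradient is against finite differences. The sketch below assumes each conditional term is a univariate normal whose mean is the regression mean plus a weighted combination of neighboring residuals, and whose variance does not depend on β. The names `neighbors`, `weights`, and `cond_var` are hypothetical; the weights and variances are random stand-ins for quantities that would be computed from the covariance function.

```python
import numpy as np

def minibatch_loglik(beta, y, X, neighbors, weights, cond_var, batch):
    """Rescaled sum of conditional normal log-densities over the minibatch."""
    n = len(y)
    r = y - X @ beta
    total = 0.0
    for i in batch:
        e = r[i] - weights[i] @ r[neighbors[i]]  # conditional residual
        total += -0.5 * (np.log(2 * np.pi * cond_var[i]) + e ** 2 / cond_var[i])
    return (n / len(batch)) * total

def minibatch_grad_beta(beta, y, X, neighbors, weights, cond_var, batch):
    """Analytic gradient of minibatch_loglik with respect to beta."""
    n = len(y)
    r = y - X @ beta
    grad = np.zeros_like(beta)
    for i in batch:
        e = r[i] - weights[i] @ r[neighbors[i]]
        x_tilde = X[i] - weights[i] @ X[neighbors[i]]  # covariates minus weighted neighbor covariates
        grad += x_tilde * e / cond_var[i]
    return (n / len(batch)) * grad

rng = np.random.default_rng(3)
n, p, m = 40, 3, 5
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
beta = rng.standard_normal(p)
neighbors = [np.arange(max(0, i - m), i) for i in range(n)]            # previous m indices
weights = [rng.normal(scale=0.3, size=len(nb)) for nb in neighbors]    # stand-in weights
cond_var = rng.uniform(0.5, 1.5, size=n)                               # stand-in variances
batch = rng.choice(n, size=10, replace=False)

grad = minibatch_grad_beta(beta, y, X, neighbors, weights, cond_var, batch)

# Central finite differences for comparison.
h = 1e-5
fd = np.zeros(p)
for j in range(p):
    step = np.zeros(p)
    step[j] = h
    fd[j] = (minibatch_loglik(beta + step, y, X, neighbors, weights, cond_var, batch)
             - minibatch_loglik(beta - step, y, X, neighbors, weights, cond_var, batch)) / (2 * h)
```

Since the minibatch log-likelihood is quadratic in β under these assumptions, central differences agree with the analytic gradient up to rounding error.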
3.2.2 Covariance parameters