Abstract
Background To write.
The solution to certain stochastic partial differential equations (SPDEs) is a Gaussian field with Matérn covariance (Whittle 1954). Lindgren, Lindström, and Rue (2010) propose to represent a Gaussian field with Matérn covariance as a GMRF by representing the solution of the SPDE using the finite element method. The Gaussian field \(f(x)\) with the Matérn covariance function is a solution to the linear fractional (since \(\alpha \neq 2\)) SPDE \[ (\kappa^2 - \Delta)^{\alpha/2} f(x) = \epsilon(x), \quad x \in \mathbb{R}^d, \quad \alpha = \nu + d/2, \quad \kappa > 0, \quad \nu > 0, \] where \((\kappa^2 - \Delta)^{\alpha/2}\) is a pseduo-differential operator, the innovation process \(\epsilon\) is spatial Gaussian white noise with unit variance and \(\Delta\) is the Laplacian \[ \Delta = \sum_{i=1}^d \frac{\delta^2}{\delta x_i^2}, \] and the marginal variance is \[ \sigma^2 = \frac{\Gamma(\nu)}{\Gamma(\nu + d/2)(4\pi)^{d/2}\kappa^{2\nu}} \]
SPDE equation:
\[Df = \epsilon\]
where \(D\) is a differential operator, \(f\) is a funciton and \(\epsilon\) is a stochastic process (often white noise). This corresponds to the integral form:
\[Df = \epsilon \iff \int Df(x) \phi(x) \text{d}x = \int \epsilon(x) \phi(x) \text{d}x \, \forall \phi\]
where \(\phi\) is known as a test function. Using the inner product notation \(\langle f, g \rangle = \int f(x) g(x) \text{d}x\) this is equivalent to:
\[\langle Df, \phi \rangle = \langle \epsilon, \phi \rangle\]
Solve for points on a mesh \(j = 1, \ldots, M\) and \(f(x) = \sum_{j = 1}^M \beta_j \psi_j (x)\) then:
\[ \langle D \sum_{j = 1}^M \beta_j \psi_j, \phi \rangle = \sum_{j = 1}^M \beta_j \langle D \psi_j, \phi \rangle = \langle \epsilon, \phi \rangle \]
Test only functions \(\phi\) of the form \(\sum_{j = 1}^M \beta_j \psi_j (x)\), equivalent to (showing it’s equal for all of the basis shows it’s equal for any linear combination of them) showing that:
\[ \sum_{j = 1}^M \beta_j \langle D \psi_j, \psi_i \rangle = \langle \epsilon, \psi_i \rangle \, \forall \psi_i \]
This can be written in matrix form as \(\mathbf{P} \mathbf{\beta} = \mathbf{e}\) where \(P_{i, j}\) is \(\langle D \psi_i, \psi_j \rangle\) and \(e_j = \langle \epsilon, \psi_j \rangle\). The random vector \(\mathbf{e} \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_e ^{-1})\) where the precision has \((i, j)\)th entry \(\langle \psi_i, \psi_j \rangle\). If \(\mathbf{P} \mathbf{\beta} \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_e ^{-1})\) then \(\mathbf{\beta} \sim \mathcal{N}(\mathbf{0}, \mathbf{Q})\) where \(\mathbf{Q}\) is such that \(\mathbf{Q}^{-1} = \mathbf{P} \mathbf{Q}_e^{-1} \mathbf{P}^\top \implies \mathbf{Q} = \mathbf{P}^\top \mathbf{Q}_e \mathbf{P}\) (does \(\mathbf{P}^\top \mathbf{P} = I\)?).
The steps are to use finite element method to compute \(\mathbf{Q}\), then simulate \(\tilde \beta\) from the Gaussian, then you have a sample \(\tilde f\) from the stochastic process which is a solution to the SPDE.
Need to choose a mesh, grid, triangulation to do the FEM.
R-INLA
uses regular grid in 1D, or constrained Delaunay
triangulation in 2D to produce a mesh. Piecewise linear basis functions
(linear B-spline).
Assume \(f\) to have the form \(f(x) = \sum_{j = 1}^M \beta_j \psi_j\). Observations give a log-likelihood of \(\ell(\beta)\). Point estimate \(\hat \beta\) leads to function \(\hat f\). Smoothing penalty \(J(\beta, \lambda)\) to give penalised log-likelihood \(\ell_p(\beta, \lambda) = \ell(\beta) - J(\beta, \lambda)\). Estimate both \(\lambda\) and \(\beta\) using REML.
Lots of choices for the penalty!
You can interpret penalised likelihood as just Bayesian. This gets you to the following (improper) prior:
\[ p(\beta) \propto \text{exp}( - \lambda \beta^\top S \beta) \]
This is just a Gaussian with precision \(\lambda S\).
Approximate precision matrix for SPDE \(Df = \epsilon\) is the same as \(\lambda S\) computed using the smoothing penalty \(\langle Df, Df, \rangle\).
What smoothing penalty does the Matérn correspond to? This one:
\[\tau \int (\kappa^2 f - \Delta f)^2 \text{d} x\]
Trade off between the value of \(f\) and the second derivative in each direction. Higher \(\kappa\) means that \(\Delta f\) can be higher without incurring penalty. The other parameter \(\tau\) controls the overall smoothness. Similar question about how distinuishable these are as with lengthscale and noise (probably not very).
library(INLA)
Generate fake data:
n <- 200
coo <- matrix(runif(2 * n), n)
plot(coo[,1], coo[,2])
s2u <- 0.5 # "sigma squared u"
k <- 10
r <- 2/k
R <- s2u * exp(-k * as.matrix(dist(coo)))
image(R)
Sample one multivariate Gaussian observation (note that
rnorm(n) %*% chol(R)
is equivalent to
MASS::mvrnorm
):
u <- drop(rnorm(n) %*% chol(R))
u2 <- MASS::mvrnorm(1, rep(0, n), Sigma = R)
plot(u)
points(u2, add = TRUE, col = "red")
## Warning in plot.xy(xy.coords(x, y), type = type, ...): "add" is not a graphical parameter
Add a covariate effect and noise (nugget) terms:
x <- runif(n)
beta <- 1:2
s2e <- 0.2 # "sigma squared e"
lin.pred <- beta[1] + beta[2] * x + u # "linear predictor"
e <- rnorm(n, 0, sqrt(s2e)) # "noise"
y <- lin.pred + e
Mesh:
mesh <- inla.mesh.2d(
coo,
cutoff = r/10,
max.edge = c(r/4, r/2),
offset = c(r/2, r)
)
plot(mesh, asp = 1)
points(coo, col = "red")
Question as to if all of the points lie at verticies? It doesn’t look like it, but most of them are.
Create the projection matrix A
:
A <- inla.spde.make.A(mesh = mesh, loc = coo)
class(A)
## [1] "dgCMatrix"
## attr(,"package")
## [1] "Matrix"
A
is a \(200 \times
1740\) sparse Matrix. Where do the dimensions come from? 200 is
the number of data points, what about the 1740? I’d guess the number of
mesh points but it’s not exactly right:
mesh$n
## [1] 1737
Set-up the SPDE:
spde <- inla.spde2.pcmatern(
mesh = mesh,
alpha = 1.5,
prior.range = c(0.2, 0.5), # P(range < 0.2) = 0.5
prior.sigma = c(1, 0.5) # P(sigma > 1) = 0.5
)
stk.e <- inla.stack(
tag = "est",
data = list(y = y),
A = list(1, A),
effects = list(
data.frame(b0 = 1, x = x),
idx.u = 1:spde$n.spde)
)
pcprec <- list(
hyper = list(theta = list(prior = "pc.prec", param = c(1, 0.1)))
)
mf <- y ~ 0 + b0 + x + f(idx.u, model = spde)
res <- inla(
mf,
control.family = pcprec,
data = inla.stack.data(stk.e),
control.predictor = list(compute = TRUE, A = inla.stack.A(stk.e))
)
round(res$summary.fixed, 4)
pmd.s2e <- inla.tmarginal(
function(x) sqrt(1/x),
res$marginals.hyperpar$`Precision for the Gaussian observations`
)
plot(pmd.s2e, type = "l", ylab = "Density", xlab = expression(sigma[e]))
Consider a regular 2D lattice with number of sites tending to infinity then \[ \mathbb{E}(u_{i, j} \, | \, u_{-i, j}) = \frac{1}{a}(u_{i-1, j} + u_{i + 1, j} + u_{i, j - 1} + u_{i, j + 1}) \] and \(\mathbb{V}(u_{i, j} \, | \, u_{-i, j}) = 1/a\) with \(|a| > 4\). For \(\nu = 1\) and \(\nu = 2\) the GMRF representation is a convolution of processes with… (missing diagram here but it’s a Besag precision). As the smoothness \(\nu\) increases the precision matrix in GMRF representation becomes less sparse, the conditional distributions depend on a wider neighbourhood.
q1 <- INLA:::inla.rw1(n = 5)
q1
## 5 x 5 sparse Matrix of class "dgTMatrix"
##
## [1,] 1 -1 . . .
## [2,] -1 2 -1 . .
## [3,] . -1 2 -1 .
## [4,] . . -1 2 -1
## [5,] . . . -1 1
crossprod(q1)
## 5 x 5 sparse Matrix of class "dsCMatrix"
##
## [1,] 2 -3 1 . .
## [2,] -3 6 -4 1 .
## [3,] 1 -4 6 -4 1
## [4,] . 1 -4 6 -3
## [5,] . . 1 -3 2
q2 <- INLA:::inla.rw2(n = 5)
q2
## 5 x 5 sparse Matrix of class "dgTMatrix"
##
## [1,] 1 -2 1 . .
## [2,] -2 5 -4 1 .
## [3,] 1 -4 6 -4 1
## [4,] . 1 -4 5 -2
## [5,] . . 1 -2 1
\[ u(s) = \sum_{k = 1}^m \psi_k(s) w_k \] where \(\psi_k\) are basis functions and \(w_k\) are Gaussian weights, \(m\) the number of points in the triangulation. The distribution of weights determines the full distribution in the continuous domain.
Projector matrix \(A\) works using barycentric coordinates of the point with respect to the coordinates of the triangle vertices.
data(SPDEtoy)
str(SPDEtoy)
## 'data.frame': 200 obs. of 3 variables:
## $ s1: num 0.0827 0.6123 0.162 0.7526 0.851 ...
## $ s2: num 0.0564 0.9168 0.357 0.2576 0.1541 ...
## $ y : num 11.52 5.28 6.9 13.18 14.6 ...
plot(x = SPDEtoy$s1, y = SPDEtoy$s2, cex = SPDEtoy$y / 10)
pl.dom <- cbind(c(0, 1, 1, 0.7, 0), c(0, 0, 0.7, 1, 1))
mesh5 <- inla.mesh.2d(loc.domain = pl.dom, max.e = c(0.092, 0.2))
plot(mesh5)
spde5 <- inla.spde2.pcmatern(
mesh = mesh5, alpha = 2, # Mesh and smoothness parameter
prior.range = c(0.3, 0.5), # P(practic.range < 0.3) = 0.5
prior.sigma = c(10, 0.01) # P(sigma > 10) = 0.01
)
Projector matrix is sparse (maximum three elements in each row are non-zero, and each row sums to one).
coords <- as.matrix(SPDEtoy[, 1:2])
A5 <- inla.spde.make.A(mesh5, loc = coords)
dim(A5)
## [1] 200 489
table(rowSums(A5 > 0)) # Three non-zero elements
##
## 3
## 200
table(rowSums(A5)) # Which add to one
##
## 1
## 200
table(colSums(A5) > 0) # Columns which correspond to triangles with no points inside can be dropped
##
## FALSE TRUE
## 237 252
# This is done automatically by inla.stack()
inla.stack()
organises the data, covariates, indicies
and projector matrices
stk5 <- inla.stack(
data = list(resp = SPDEtoy$y),
A = list(A5, 1),
effects = list(i = 1:spde5$n.spde, beta0 = rep(1, nrow(SPDEtoy))),
tag = 'est'
)
dim(inla.stack.A(stk5)) # Number of columns in A5 which have non-zero entries, plus one column for the intercept
## [1] 200 253
res5 <- inla(
resp ~ 0 + beta0 + f(i, model = spde5),
data = inla.stack.data(stk5),
control.predictor = list(A = inla.stack.A(stk5))
)
res5$summary.fixed
Now for predicting at new points.
pts3 <- rbind(c(0.1, 0.1), c(0.5, 0.55), c(0.7, 0.9))
A5pts3 <- inla.spde.make.A(mesh5, loc = pts3)
dim(A5pts3)
## [1] 3 489
jj3 <- which(colSums(A5pts3) > 0)
A5pts3[, jj3]
## 3 x 9 sparse Matrix of class "dgCMatrix"
##
## [1,] . . 0.05440876 . . . 0.5020259 0.4435654 .
## [2,] 0.2187568 . . 0.3236197 . 0.4576234 . . .
## [3,] . 0.1194153 . . 0.2682321 . . . 0.6123526
Interpolation of the posterior mean:
drop(A5pts3 %*% res5$summary.random$i$mean)
## [1] 0.3112111 3.1013403 -2.7231595
Projection onto a grid:
pgrid0 <- inla.mesh.projector(mesh5, xlim = 0:1, ylim = 0:1, dims = c(101, 101))
prd0.m <- inla.mesh.project(pgrid0, res5$summary.random$i$mean)
prd0.s <- inla.mesh.project(pgrid0, res5$summary.random$i$sd)
image(prd0.m)
image(prd0.s)
Prediction of new data:
# Create a new stack
stk5.pmu <- inla.stack(
data = list(resp = NA),
A = list(A5pts3, 1),
effects = list(i = 1:spde5$n.spde, beta0 = rep(1, 3)),
tag = 'prd5.mu'
)
# Append it
stk5.full <- inla.stack(stk5, stk5.pmu)
r5pmu <- inla(
resp ~ 0 + beta0 + f(i, model = spde5),
data = inla.stack.data(stk5.full),
control.mode = list(theta = res5$mode$theta, restart = FALSE),
control.predictor = list(A = inla.stack.A(stk5.full), compute = TRUE)
)
indd3r <- inla.stack.index(stk5.full, 'prd5.mu')$data
indd3r
## [1] 201 202 203
r5pmu$summary.fitted.values[indd3r, c(1:3, 5)]
sessionInfo()
## R version 4.2.0 (2022-04-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Monterey 12.6
##
## Matrix products: default
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] INLA_22.05.07 foreach_1.5.2 Matrix_1.5-1 knitr_1.39 rgdal_1.5-30
## [6] raster_3.5-15 sp_1.4-7 rstan_2.21.5 StanHeaders_2.21.0-7 sf_1.0-7
## [11] bsae_0.2.7 multi.utils_0.1.0 stringr_1.4.0 purrr_0.3.4 readr_2.1.2
## [16] tidyr_1.2.0 tibble_3.1.7 tidyverse_1.3.1 forcats_0.5.1 ggplot2_3.3.6
## [21] dplyr_1.0.9 rmarkdown_2.14
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-3 ellipsis_0.3.2 class_7.3-20 rprojroot_2.0.3 fs_1.5.2
## [6] rstudioapi_0.13 proxy_0.4-26 Deriv_4.1.3 MatrixModels_0.5-0 bit64_4.0.5
## [11] fansi_1.0.3 lubridate_1.8.0 xml2_1.3.3 codetools_0.2-18 splines_4.2.0
## [16] cachem_1.0.6 jsonlite_1.8.0 broom_0.8.0 dbplyr_2.1.1 compiler_4.2.0
## [21] httr_1.4.3 backports_1.4.1 assertthat_0.2.1 fastmap_1.1.0 cli_3.3.0
## [26] htmltools_0.5.2 prettyunits_1.1.1 tools_4.2.0 gtable_0.3.0 glue_1.6.2
## [31] Rcpp_1.0.8.3 cellranger_1.1.0 jquerylib_0.1.4 vctrs_0.4.1 iterators_1.0.14
## [36] xfun_0.31 ps_1.7.0 rvest_1.0.2 lifecycle_1.0.1 orderly_1.4.3
## [41] terra_1.5-21 ids_1.0.1 MASS_7.3-56 scales_1.2.0 hms_1.1.1
## [46] inline_0.3.19 rticles_0.23.6 yaml_2.3.5 memoise_2.0.1 gridExtra_2.3
## [51] loo_2.5.1 sass_0.4.1 stringi_1.7.6 RSQLite_2.2.14 highr_0.9
## [56] e1071_1.7-9 pkgbuild_1.3.1 rlang_1.0.2 pkgconfig_2.0.3 matrixStats_0.62.0
## [61] evaluate_0.15 lattice_0.20-45 bit_4.0.4 processx_3.5.3 tidyselect_1.1.2
## [66] magrittr_2.0.3 R6_2.5.1 generics_0.1.2 DBI_1.1.2 pillar_1.7.0
## [71] haven_2.5.0 withr_2.5.0 units_0.8-0 modelr_0.1.8 crayon_1.5.1
## [76] uuid_1.1-0 KernSmooth_2.23-20 utf8_1.2.2 tzdb_0.3.0 viridis_0.6.2
## [81] grid_4.2.0 readxl_1.4.0 blob_1.2.3 callr_3.7.0 reprex_2.0.1
## [86] digest_0.6.29 classInt_0.4-3 openssl_2.0.1 RcppParallel_5.1.5 stats4_4.2.0
## [91] munsell_0.5.0 viridisLite_0.4.0 bslib_0.3.1 askpass_1.1