The final stage of a classical geostatistical analysis is to predict the values of $$Z(\cdot)$$ at desired locations in $$D$$. Methods dedicated to this purpose are called kriging. Kriging originated in the mining industry in the early 1950’s (Krige, 1951) and was formalized by the French mathematician Georges Matheron in the 1960’s.

The standard version of kriging is called ordinary kriging (OK).

Here the predictions are based on the model:

$Z(s) = \mu + \epsilon(s)$

where $$\mu$$ is the constant stationary function (global mean) and $$\epsilon(s)$$ is the spatially correlated stochastic part of variation.

$\hat z_{OK}(\mathbf s_0) = \sum_{i=1}^nw_i(\mathbf s_0)\cdot z(\mathbf s_i) = \lambda_0^T \cdot \mathbf z$

where $$\lambda_0$$ is the vector of kriging weights $$(w_i)$$, $$z$$ is the vector of $$n$$ observations at primary locations. In other words, in OK the predicted value is some constant plus a weighted function of neighboring measurements.

The Universal Kriging (UK), also referred to as Kriging with External Drift or Regression Kriging, is a more general model of which OK is a special case. While both methods involve spatial autocorrelation modeling, they differ in the definition of the trend. In OK, the trend is a constant value, while the trend is a linear function of one or more covariates in UK.

\begin{align} \hat z_{UK}(\mathbf s_0) & = \hat \mu(\mathbf s_0)+\hat \epsilon(\mathbf s_0)\\ & = \sum_{k=0}^p\hat \beta_k \cdot q_k(\mathbf s_0)+\sum_{i=1}^n \lambda_i \cdot e(\mathbf s_i) \end{align}

where $$\hat\mu(\mathbf s_0)$$ is the fitted deterministic part, $$\hat \epsilon(\mathbf s_0)$$ is the interpolated residual, $$\hat \beta_k$$ are estimated deterministic, $$\lambda_i$$ are kriging weights determined by the spatial dependence structure of the residual and where $$e(\mathbf s_i)$$ is the residual at location $$\mathbf s_i$$. In other words, in UK the predicted value is based on covariates plus a weighted function of neighboring measurements.

In matrix notation, regression-kriging is commonly written as

$\hat z_{UK}(\mathbf s_0) =\mathbf q_0^T\cdot \hat \beta + \lambda_0^T \cdot(\mathbf z - \mathbf q \cdot \hat \beta)$

where $$\hat z(s_0)$$ is the predicted value at location $$\mathbf s_0$$, $$\mathbf q_0$$ is the vector of $$p + 1$$ predictors and $$\lambda_0$$ is the vector of $$n$$ kriging weights used to interpolate the residuals.

The regression coefficients $$\hat \beta_k$$ can be estimated from the sample by some fitting method, e.g. ordinary least squares (OLS). Once the deterministic part of variation has been estimated, the residual can be interpolated with kriging and added to the estimated trend.