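In geostatistics, spatial correlation is usually analyzed with the variogram rather than a correlogram or covariogram (Bivand et al. 2008). If the random function \(Z(\mathbf s)\) is intrinsically stationary, which means that it has a constant mean \((\mathbf E(Z(\mathbf s)) = \mu)\) and that the variance of the increments \(Z(\mathbf s)-Z(\mathbf s+\mathbf h)\) depends only on the separation \(\mathbf h\), then the variogram is defined as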

\[2\gamma(\mathbf h) = \mathbf E[(Z(\mathbf s_i)-Z(\mathbf s_i+\mathbf h))^2],\]

where \(Z(\mathbf s_i)\) is the value of the target variable at a sampled location \(\mathbf s_i\) and \(Z(\mathbf s_i+\mathbf h)\) is the value at the neighboring location \(\mathbf s_i+\mathbf h\), which is separated from \(\mathbf s_i\) by \(\mathbf h\). In other words, the variance of the difference between the values of \(Z(\mathbf s)\) at any two locations depends only on their relative position, that is, on their spatial separation \(\mathbf h\), also called the spatial lag. The quantity \(2\gamma(\mathbf h)\) is known as the variogram, and the function \(\gamma(\mathbf h)\) is called the semivariogram, which is simply written as

\[\gamma(\mathbf h) = \frac{1}{2}\mathbf E[(Z(\mathbf s_i)-Z(\mathbf s_i+\mathbf h))^2]\text{.}\]

Suppose that there are \(n\) point observations. These yield \(n \cdot (n-1)/2\) distinct pairs for which a semivariance can be calculated (Hengl 2009).
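To make the pair count concrete, the following minimal sketch computes the semivariance of every pair of observations (often called the variogram cloud). It uses only the Python standard library; the function name `variogram_cloud` and the toy coordinates and values are hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import dist


def variogram_cloud(coords, values):
    """Return (distance, semivariance) for every unordered pair of observations."""
    cloud = []
    for i, j in combinations(range(len(values)), 2):
        h = dist(coords[i], coords[j])               # separation distance of the pair
        gamma = 0.5 * (values[i] - values[j]) ** 2   # semivariance of the pair
        cloud.append((h, gamma))
    return cloud


# Hypothetical toy data: four observations on the corners of a unit square
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 4.0, 3.0]

cloud = variogram_cloud(coords, values)
print(len(cloud))  # 4 * (4 - 1) / 2 = 6 pairs
```

Because the number of pairs grows quadratically with \(n\), this brute-force loop is only practical for small to moderate data sets.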

When data locations are irregularly spaced, there is generally little to no replication of lags among the data locations. To achieve quasi-replication of lags, we divide the lag space into lag classes or bins. Then, \(N_h\) is the number of lags that fall into the bin \(h_j\). The more bins are used, the narrower they are and the better the lags in each bin are approximated by \(h_j\), but the fewer observed lags fall into each bin. One popular rule of thumb is to require \(N_h\) to be at least 30 and to require the length of \(h_j\) to be less than half the maximum lag length among data locations (Gelfand et al. 2010).
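The trade-off between bin width and the number of pairs per bin can be explored with a short sketch. It assumes equal-width distance bins and a cutoff at half the maximum pairwise distance; the function name `count_pairs_per_bin`, the choice of 10 bins, and the randomly generated locations are hypothetical.

```python
import random
from itertools import combinations
from math import dist


def count_pairs_per_bin(coords, bin_width, n_bins):
    """Count the point pairs N_h in n_bins equal-width distance bins."""
    counts = [0] * n_bins
    for p, q in combinations(coords, 2):
        k = int(dist(p, q) / bin_width)  # index of the bin this lag falls into
        if k < n_bins:                   # lags beyond the cutoff are ignored
            counts[k] += 1
    return counts


# Hypothetical irregularly spaced locations (seeded for reproducibility)
random.seed(1)
coords = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(50)]

# Cutoff at half the maximum pairwise distance, split into 10 bins (an assumption)
max_dist = max(dist(p, q) for p, q in combinations(coords, 2))
n_bins = 10
bin_width = (max_dist / 2) / n_bins

counts = count_pairs_per_bin(coords, bin_width, n_bins)
sparse = [j for j, n_h in enumerate(counts) if n_h < 30]
print("pairs per bin:", counts)
print("bins with fewer than 30 pairs:", sparse)
```

Increasing `n_bins` narrows the bins and typically lengthens the list of sparse bins, which is the trade-off described above.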

If we assume isotropy, that is, that the semivariance does not depend on direction, the variogram can be estimated from \(N_h\) sample data pairs \((z(s_i), z(s_i + h))\) for a number of distances (or distance intervals) \(\hat h_j\) by

\[\hat \gamma(\hat h_j) = \frac{1}{2N_h}\sum_{i=1}^{N_h}(Z(s_i)-Z(s_i+h))^2, \quad \forall h \in \hat h_j\]

This estimate is called the sample variogram or experimental variogram.
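A minimal sketch of this estimator, assuming isotropy and equal-width distance bins, is shown below. The function name `sample_variogram`, its return format, and the toy transect data are hypothetical; the code is an illustration of the formula, not a reference implementation.

```python
from itertools import combinations
from math import dist


def sample_variogram(coords, values, bin_width, n_bins):
    """Isotropic sample variogram.

    Returns one tuple (mean distance, gamma_hat, N_h) per non-empty bin.
    """
    sq_sums = [0.0] * n_bins   # sum of squared differences per bin
    d_sums = [0.0] * n_bins    # sum of pair distances per bin
    counts = [0] * n_bins      # number of pairs N_h per bin
    for i, j in combinations(range(len(values)), 2):
        d = dist(coords[i], coords[j])
        k = int(d / bin_width)
        if k < n_bins:
            sq_sums[k] += (values[i] - values[j]) ** 2
            d_sums[k] += d
            counts[k] += 1
    return [
        (d_sums[k] / counts[k], sq_sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


# Hypothetical toy data: four observations along a transect with unit spacing
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 3.0, 2.0, 5.0]

for mean_h, gamma_hat, n_h in sample_variogram(coords, values, bin_width=1.0, n_bins=4):
    print(f"h = {mean_h:.1f}  gamma = {gamma_hat:.3f}  N_h = {n_h}")
```

With only four points, every bin holds far fewer than 30 pairs, so these estimates only illustrate the arithmetic: for example, the two pairs at distance 2 have squared differences 1 and 4, giving \(\hat\gamma = (1 + 4)/(2 \cdot 2) = 1.25\).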


Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by email at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.