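In geostatistics, spatial correlation is usually analyzed with the variogram rather than a correlogram or covariogram (Bivand et al. 2008). If the random function $Z(\mathbf s)$ is intrinsically stationary, which means that it has a constant mean ($E[Z(\mathbf s)] = \mu$) and that the variance of the difference $Z(\mathbf s) - Z(\mathbf s + \mathbf h)$ depends only on the separation $\mathbf h$, then the variogram is defined as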
$$2\gamma(\mathbf h) = \mathbf E[(Z(\mathbf s_i)-Z(\mathbf s_i+\mathbf h))^2]\, ,$$where $Z(\mathbf s_i)$ is the value of a target variable at some sampled location and $Z(\mathbf s_i+\mathbf h)$ is the value of the neighbor at location $\mathbf s_i+\mathbf h$. In other words, the expected squared difference between values of $Z(\mathbf s)$ at any two locations depends only on their relative positions or, equivalently, on their spatial separation $\mathbf h$, also called the spatial lag. The quantity $2\gamma(\mathbf h)$ is known as the variogram, and the function $\gamma(\mathbf h)$ is called the semivariogram, which is simply written as
$$\gamma(\mathbf h) = \frac{1}{2}\mathbf E[(Z(\mathbf s_i)-Z(\mathbf s_i+\mathbf h))^2]\, .$$Suppose that there are $n$ point observations; this yields $n \cdot (n-1)/2$ pairs for which a semivariance can be calculated (Hengl 2007).
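The pair count can be checked with a minimal sketch that builds the so-called variogram cloud, i.e. one distance and one semivariance per unordered pair of observations. The toy coordinates and values below are made up for illustration, and the helper name `variogram_cloud` is hypothetical, not part of any library.

```python
import math
from itertools import combinations

# Hypothetical toy data: (x, y, z) triples, invented for illustration only.
observations = [
    (0.0, 0.0, 1.0),
    (1.0, 0.0, 2.0),
    (0.0, 1.0, 4.0),
    (1.0, 1.0, 3.0),
]


def variogram_cloud(obs):
    """Return one (distance, semivariance) tuple per unordered pair."""
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(obs, 2):
        h = math.hypot(x2 - x1, y2 - y1)  # Euclidean separation distance
        semivariance = 0.5 * (z1 - z2) ** 2  # half the squared difference
        cloud.append((h, semivariance))
    return cloud


cloud = variogram_cloud(observations)
n = len(observations)
print(len(cloud), n * (n - 1) // 2)  # both numbers should agree
```

Because the number of pairs grows quadratically with $n$, the full cloud quickly becomes large, which is one practical reason for summarizing it by lag classes.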
When data locations are irregularly spaced, there is generally little to no replication of lags among the data locations. To obtain quasi-replication of lags, we partition the lag space into lag classes or bins. Then, $N_h$ is the number of lags that fall into the bin $h_j$. The more bins we use, the smaller they are and the better the lags are approximated by $h_j$, but the fewer observed lags fall into each $h_j$. One popular rule of thumb is to require $N_h$ to be at least 30 and to require the length of $h_j$ to be less than half the maximum lag length among data locations (Gelfand et al. 2010).
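The trade-off between bin width and pair counts can be illustrated with a small sketch. It assumes equal-width distance bins up to a cutoff of half the maximum lag and flags bins with fewer than 30 pairs. The random point locations and the helper `bin_lags` are hypothetical and serve only as an illustration of the rule of thumb.

```python
import math
import random
from itertools import combinations

random.seed(0)
# Hypothetical irregularly spaced locations in a 10 x 10 square.
points = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(40)]


def bin_lags(points, n_bins):
    """Count point pairs per equal-width distance bin below half the maximum lag."""
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    cutoff = max(dists) / 2.0  # rule of thumb: ignore lags beyond half the maximum
    width = cutoff / n_bins
    counts = [0] * n_bins
    for d in dists:
        if d < cutoff:
            # min() guards against floating-point rounding at the upper edge
            counts[min(int(d / width), n_bins - 1)] += 1
    return cutoff, width, counts


cutoff, width, counts = bin_lags(points, 8)
sparse_bins = [j for j, c in enumerate(counts) if c < 30]
print(f"cutoff={cutoff:.2f}, bin width={width:.2f}")
print("pairs per bin:", counts)
print("bins with fewer than 30 pairs:", sparse_bins)
```

Calling `bin_lags` with a larger `n_bins` spreads the same pairs over more, narrower bins, so individual counts can only stay equal or shrink.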
If we assume isotropy, that is, that the semivariance does not depend on direction, the variogram can be estimated from $N_h$ sample data pairs $z(s_i), z(s_i + h)$ for a number of distances (or distance intervals) $\hat h_j$ by
$$\hat \gamma(\hat h_j) = \frac{1}{2N_h}\sum_{i=1}^{N_h}(Z(s_i)-Z(s_i+h))^2, \quad \forall h \in \hat h_j\, .$$This estimate is called the sample variogram or experimental variogram.
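As a minimal sketch of this estimator, the following pure-Python function averages half the squared differences of all pairs whose distance falls into each bin. It assumes isotropy and equal-width bins starting at zero. The function name `experimental_semivariogram`, the bin layout, and the toy transect data are assumptions made for illustration.

```python
import math
from itertools import combinations


def experimental_semivariogram(obs, width, n_bins):
    """Return (bin_center, gamma_hat, n_pairs) for each distance bin.

    obs    : list of (x, y, z) tuples
    width  : width of each distance bin
    n_bins : number of bins; pairs farther apart than width * n_bins are ignored
    """
    sq_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (x1, y1, z1), (x2, y2, z2) in combinations(obs, 2):
        h = math.hypot(x2 - x1, y2 - y1)
        j = int(h // width)  # index of the bin containing h
        if j < n_bins:
            sq_sums[j] += (z1 - z2) ** 2
            counts[j] += 1
    result = []
    for j in range(n_bins):
        # gamma_hat = sum of squared differences / (2 * N_h); undefined for empty bins
        gamma = sq_sums[j] / (2 * counts[j]) if counts[j] else None
        result.append(((j + 0.5) * width, gamma, counts[j]))
    return result


# Hypothetical toy transect: four points on a line with unit spacing.
transect = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0), (3.0, 0.0, 3.0)]
for center, gamma, n_pairs in experimental_semivariogram(transect, width=1.5, n_bins=2):
    print(center, gamma, n_pairs)
```

With only a handful of pairs per bin, as in this toy example, the estimates are far too noisy for real use; the rule of thumb of at least 30 pairs per bin applies here as well.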
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.