In this section we use two data sets:

- DWD mean annual rainfall data for Germany (
`dwd_sp`

) and

- SRTM DEM of Germany (
`srtm_germany_masked_ETRS89`

)

First, we load these data sets into memory:

```
### LOAD DWD DATA ###
load(url("https://userpage.fu-berlin.de/soga/data/r-data/dwd_sp.RData"))
####################################
### LOAD RASTER DATA ###
## build a temporary folder on disk
temp <- tempfile()
download_url <- "https://userpage.fu-berlin.de/soga/data/raw-data/spatial/"
zipfile <- "srtm_germany_ETRS89.zip"
## download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
## unzip the file
unzip(temp)
unlink(temp)
## read in the data
library(raster)
srtm_germany_masked_ETRS89 <- raster("srtm_germany_ETRS89.tif")
####################################
```

```
plot(srtm_germany_masked_ETRS89,
main = "SRTM DEM and DWD weather stations"
)
plot(dwd_sp,
add = TRUE,
pch = 16, cex = 0.2
)
```

Before we proceed we make sure that all of our data shares the same coordinate reference system.

```
library(rgdal)
newProj <- proj4string(srtm_germany_masked_ETRS89)
# re-project spatial data sets
dwd_sp <- spTransform(dwd_sp,
CRSobj = newProj
)
# Retrieve Federal States by the the getData() function from the raster package
germany <- getData(country = "Germany", level = 1)
germany <- spTransform(germany,
CRSobj = newProj
)
```

Further, we cast the `RasterLayer`

object
`srtm_germany_masked_ETRS89`

to a
`SpatialPixelsDataFrame`

object and rename the column in the
attribute table to `Altitude`

.

```
# cast to SpatialPixelsDataFrame
srtm_germany_masked_ETRS89_spdf <- as(
srtm_germany_masked_ETRS89,
"SpatialPixelsDataFrame"
)
```

`colnames(srtm_germany_masked_ETRS89_spdf@data) <- "Altitude"`

We can explore the spatial correlation of our predictor variable by
plotting a *variogram cloud*. The variogram cloud is obtained by
plotting all possible squared differences of observation pairs \((Z(s_i)−Z(s_j))^2\) against their
separation distance \(h_{ij}\). As any
point in the variogram cloud refers to a pair of points in the data set,
the variogram cloud is used to point us to areas with unusually high or
low variability (Bivand et al. 2008).

```
library(gstat)
ev <- variogram(mean_annual_rainfall ~ 1, dwd_sp, cloud = TRUE)
plot(ev)
```

The sample variogram plot below is nothing but a plot of averages of
semivariogram cloud values over distance intervals \(h\). The `variogram()`

is an
extremely versatile function and accepts a ton of additional arguments.
For the purpose of this tutorial we keep the defaults. Type
`?variogram`

into your console to review a detailed
description of the `variogram()`

function.

```
ev <- variogram(mean_annual_rainfall ~ 1, dwd_sp)
plot(ev)
```

There are different approaches to select a variogram model:

- Visual and interactive fitting

The `eyefit()`

function from the `geoR`

package offers the possibility to
fit a function to the empirical variogram by interactively varying the
function type and its parameters. Here, we do not cover this approach,
however the interested reader may visit the online documentation for
more details (here).

- Setting the function and its parameters by hand

The `gstat`

package ships with the `vgm()`

function, which allows us to specify a wide range of potential variogram
models and specific parameters. A list of available models is printed by
using the `vgm()`

command.

`vgm()`

```
## short long
## 1 Nug Nug (nugget)
## 2 Exp Exp (exponential)
## 3 Sph Sph (spherical)
## 4 Gau Gau (gaussian)
## 5 Exc Exclass (Exponential class/stable)
## 6 Mat Mat (Matern)
## 7 Ste Mat (Matern, M. Stein's parameterization)
## 8 Cir Cir (circular)
## 9 Lin Lin (linear)
## 10 Bes Bes (bessel)
## 11 Pen Pen (pentaspherical)
## 12 Per Per (periodic)
## 13 Wav Wav (wave)
## 14 Hol Hol (hole)
## 15 Log Log (logarithmic)
## 16 Pow Pow (power)
## 17 Spl Spl (spline)
## 18 Leg Leg (Legendre)
## 19 Err Err (Measurement error)
## 20 Int Int (Intercept)
```

The `vgm()`

function is specified by a range of additional
arguments such as `psill`

, which corresponds to the (partial)
sill of the variogram model, `model`

, which specifies the
model type, e.g. “Exp”, “Sph”, “Gau”, “Mat” (see above),
`range`

, which specifies the range parameter of the variogram
model, `kappa`

, which is a smoothness parameter for the
`Matern`

class of variogram models, `nugget`

,
which adds the nugget component to the model, `anis`

, a set
of parameters to account for anisotropy, among others. Depending on the
data set and on the model of choice we have to specify initial values
for the different parameters. If we choose initial values too far off
from reasonable values, the fit will not succeed.

If we consider the sample variogram from above, an exponential model
appears to be a good fit. By eyeballing the sample variogram we set
`psill = 30000`

, `model = "Exp"`

,
`range = 100000`

, and `nugget= 25000`

.

```
vm <- vgm(psill = 30000, model = "Exp", range = 100000, nugget = 25000)
vm
```

```
## model psill range
## 1 Nug 25000 0e+00
## 2 Exp 30000 1e+05
```

To verify our parameter choice we plot the sample variogram and overlay it with the variogram model as specified above.

`plot(ev, vm)`

The variogram model looks reasonable, but for sure not perfect.

- Selection of the function and parameters based on a goodness-of-fit criterion

We may select the variogram model and its parameters statically, by
minimizing some goodness-of-fit criterion. For example the
`fit.variogram()`

function from the `gstat`

package allows us to find an appropriate variogram model.

The `fit.variogram()`

function need as arguments at least
a sample variogram (`object`

), e.g. the output of the
`variogram()`

function, and a `model`

, which
specifies the model type, e.g. “Exp”, “Sph”, “Gau”, “Mat”, among others
(see output of `vgm()`

). The function
`fit.variogram()`

returns an object of class
`variogramModel`

extending `data.frame`

. The
`data.frame`

object has a logical attribute
`singular`

, which indicates whether the non-linear fit
converged (`FALSE`

).

Let us apply the `fit.variogram()`

to obtain the
parameters of an exponential variogram model.

```
vm_exp <- fit.variogram(ev, vgm("Exp"))
vm_exp
```

```
## model psill range
## 1 Nug 26296.91 0.0
## 2 Exp 31955.10 113131.6
```

OK, compare the result with our eyeballed parametrization. We have not been too much off.

Let us check whether the fit converges.

`attr(vm_exp, "singular")`

`## [1] FALSE`

Perfect, recall that `FALSE`

indicates model
convergence.

Note that we may specify the fitting method, used by
`gstat`

by setting the additional arguments
`fit.method`

. There are four choices implemented:

\[ \begin{array}{c|c} \mathtt{fit.method} & \text{Weight} \\ \hline 1 & N_j \\ 2 & N_j\{\gamma(h_j)\}^2 \\ 6 & 1 \\ 7 & N_j/h_j^2 \end{array} \]

The default value chosen by `gstat`

is
`fit.method = 7`

.

Exercise:Apply the`fit.variogram()`

fuction to obtain the parameters for the spherical and the Gaussian variogram model and check each of them for convergence.

```
## Your code here...
vm_sph <- NULL
vm_gau <- NULL
```

```
vm_sph <- fit.variogram(ev, vgm("Sph"))
attr(vm_exp, "singular")
```

`## [1] FALSE`

```
vm_gau <- fit.variogram(ev, vgm("Gau"))
attr(vm_exp, "singular")
```

`## [1] FALSE`

Nice! Both models converge.

Now, let us visualize our three different variogram models we built above.

`plot(ev, vm_exp, main = "Exponential variogram model")`

Exercise:Visualize the spherical and the Gaussian variogram model. Which of the three models seems to be the best fit by visual inspection?

`## Your code here...`

`plot(ev, vm_sph, main = "Spherical variogram model")`

`plot(ev, vm_gau, main = "Gaussian variogram model")`

Based on the visual inspection, the exponential variogram model
(`vm_exp`

) appears to be the best fit.

In addition to the `fit.variogram()`

function the
`automap`

package provides the
`autofitVariogram()`

. The function accepts a formula
(`formula`

) and a data argument (`input_data`

).
The following expression fits a variogram model to our data:

```
library(automap)
vm_auto <- autofitVariogram(
formula = mean_annual_rainfall ~ 1,
input_data = dwd_sp
)
plot(vm_auto)
```

In the plot above the points show the sample empirical variogram. The
numbers indicate the number of pairs assigned to the bin, and the
continuous line is the fitted variogram model. In this case, the
selected model was *Matérn, M. Stein’s parameterization* (code
name “Ste”), with the four parameter values stated in the bottom-right
corner of the image.

The function returns an object of type `autofitVariogram`

.
This object contains the fitted variogram model, which can be accessed
with the `$`

operator.

`vm_auto$var_model`

```
## model psill range kappa
## 1 Nug 15442.86 0.0 0.0
## 2 Ste 50425.68 283865.8 0.2
```

So far we assumed that the spatial correlation structure is the same
in all directions (isotropic). Hence, the covariance function depends
only on the magnitude of the lag vector \(h\) and not on the direction. In many
cases, however, the autocorrelation shows a structure that is different
in different directions. This property of a spatial process is referred
to as **anisotropy**.

The most commonly employed model for anisotropy is *geometric
anisotropy*, with the semivariogram reaching the same sill in all
directions, but at different ranges. To check for directional
dependence, we use directional empirical semivariograms, which include
only those values for data pairs falling within certain directional
bands as well as falling within the prescribed lag limits.

With the `gstat`

package we can account for anisotropy by
specifying directional bands by an azimuthal direction. The
`variogram()`

function accepts an `alpha`

parameter, which indicates the direction in plane in positive degrees
clockwise. For example, we may specify the main directions to be \(0^\circ\), \(45^\circ\), \(90^\circ\), and \(135^\circ\) with angular tolerance of \(22.5^\circ\).

```
alpha <- (0:3) * 45
alpha
```

`## [1] 0 45 90 135`

If we plug the additional argument `alpha`

into the
`variogram()`

function call and plot it, we see a panel plot,
where each plot corresponds to a certain directional band. Note that
directional semivariograms are often noisier due to the reduced number
of data pairs used for the estimation.

```
ev_anis <- variogram(mean_annual_rainfall ~ 1, dwd_sp, alpha = alpha)
plot(ev_anis)
```

The plot does not show overwhelming evidence of anisotropy. Let us
add our exponential variogram model we fitted in the previous section
(`vm_exp`

) to see if it fits the data in all four directional
bands.

`plot(ev_anis, vm_exp)`