First, we load the required data sets and packages:

load(url("https://userpage.fu-berlin.de/soga/data/r-data/East_Germany.RData"))
library(raster)
library(gstat)

In this section, we apply the inverse distance weighting (IDW) approach to interpolate weather data based on measurements from 71 DWD weather stations located in East Germany. The data set we use (dwd_east_sp) icludes the mean annual rainfall and the mean annual temperature over the period 1981–2010.

plot(dwd_east_sp)
plot(east_germany_states_sp, add = TRUE, border = "red")


Meshgrid for interpolation

In most real-world applications, we do not interpolate one particular point in space, but we compute the values of our predictor variable on a regular grid. In our example, we build a regular grid with a cell size of 10 km covering the area of East Germany.

In order to set the geographical extend of our grid, we take the bounding box of the east.germany.sp spatial data set.

extent_east_germany <- extent(east_germany_states_sp)
extent_east_germany
## class      : Extent 
## xmin       : 137852.8 
## xmax       : 502916.9 
## ymin       : 5561109 
## ymax       : 6060829

Based on the extent of the boundary box, we use the expand.grid() function to generate a data frame with x and y coordinates.

# set the grid cell size in meter
xh <- 10000

grid_east_germany <- expand.grid(
  x = seq(
    from = round(extent_east_germany@xmin),
    to = round(extent_east_germany@xmax),
    by = xh
  ),
  y = seq(
    from = round(extent_east_germany@ymin),
    to = round(extent_east_germany@ymax),
    by = xh
  )
)

head(grid_east_germany, 10)
##         x       y
## 1  137853 5561109
## 2  147853 5561109
## 3  157853 5561109
## 4  167853 5561109
## 5  177853 5561109
## 6  187853 5561109
## 7  197853 5561109
## 8  207853 5561109
## 9  217853 5561109
## 10 227853 5561109
tail(grid_east_germany, 10)
##           x       y
## 1841 407853 6051109
## 1842 417853 6051109
## 1843 427853 6051109
## 1844 437853 6051109
## 1845 447853 6051109
## 1846 457853 6051109
## 1847 467853 6051109
## 1848 477853 6051109
## 1849 487853 6051109
## 1850 497853 6051109

For better understating, we plot the generated grid points.

plot(grid_east_germany, asp = 1)

Since our grid is an object of the class data.frame,

class(grid_east_germany)
## [1] "data.frame"

we convert it into a SpatialPoints object using the coordinates() function.

coordinates(grid_east_germany) <- ~ x + y
class(grid_east_germany)
## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"

In the next step, we assign a spatial reference system to our grid. For this, we use the proj4string() function. We may either provide a Proj4 format string or simply provide the EPSG identifier. While proj4string() can return the projection string of a spatial object, it also allows us to modify the current spatial reference system. This makes it is possible to extract the Proj4 format string from an object and assign it to another object.

Here, we extract the spatial reference system from the dwd_east_sp and assign it to our grid grid_east_germany by applying the proj4string() function on both objects.

proj4string(grid_east_germany) <- proj4string(dwd_east_sp)
head(grid_east_germany)
## class       : SpatialPoints 
## features    : 1 
## extent      : 137853, 137853, 5561109, 5561109  (xmin, xmax, ymin, ymax)
## crs         : +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs

Then, we apply the gridded() function to convert the SpatialPoints to a SpatialPixels object.

gridded(grid_east_germany) <- TRUE
class(grid_east_germany)
## [1] "SpatialPixels"
## attr(,"package")
## [1] "sp"

Finally, we plot the regular 10 km grid together with the administrative borders of East Germany’s federal states and the locations of the DWD weather stations.

plot(grid_east_germany,
  main = paste("Weather Stations in East Germany\n and Interpolation Grid"),
  col = "grey",
  cex.main = 0.9
)

plot(east_germany_states_sp, add = TRUE, border = "red")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "blue")


Interpolation of temperature data using IDW

In this section, we interpolate the temperature data given in dwd_east_sp using the IDW method. To apply this method, we use the gstat package. This package provides extensive capability for univariate and multivariate spatial and spatio-temporal geostatistical modelling, prediction and simulation. For an overview of the gstat package, see the official introductory document here.

The main function in the gstat package is the equivocally named function gstat(). The function returns an object of class gstat that holds all information to perform spatial interpolation. The gstat() function accepts numerous parameters (type ?gstat into your console for more details). In order to implement the IDW approach with the gstat() function we have to specify a formula (an object of the formula class) that defines the dependent and independent variables. In addition, we have to provide the data corresponding to the measured point data (e.g. a SpatialPointsDataFrame object), a list with the \(\beta\) parameter (set) and optionally, the number of nearest point observations (nmax) that should be used for the model. Since IDW is an intercept-only model (a model without independent variables), the formula in R is onventionally written as \(\sim 1\).

Consequently, we specify our IDW model as follows:

neighbors <- length(dwd_east_sp)
beta <- 2

idw_temp <- gstat(
  formula = Temperature ~ 1, # intercept-only model
  data = dwd_east_sp,
  nmax = neighbors,
  set = list(idp = beta)
)

Having specified the gstat object, we use the generic predict() function to interpolate our predictor variable for each pixel in our SpatialPixelsDataFrame object grid_east_germany.

grid_east_germany_temp <- predict(
  object = idw_temp,
  newdata = grid_east_germany
)
## [inverse distance weighted interpolation]

The resulting SpatialPixelsDataFrame object can be plotted using the generic plot() command. We add additional layers by calling the plot() command again, but now adding the optional argument add = TRUE to the function call.

library(colorRamps)
plot(grid_east_germany_temp,
  col = matlab.like(700),
  main = "IDW Temperatur (°C)"
)
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")

In order to clip the SpatialPixelsDataFrame object grid_east_germany_temp, we first create a RasterLayer object using the raster() function, and then we apply the mask() function.

# masking
grid_east_germany_temp <- mask(
  raster(grid_east_germany_temp),
  east_germany_states_sp
)

We plot the clipped interpolation raster.

# plotting
main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")

Exercise: Repeat the interpolation of the temperature data with \(\beta = 1\) and compare the results. Which choice of \(\beta\) leads to more plausible results?

## Your code here...
Show code
# set parameters
neighbors <- length(dwd_east_sp)
beta_2 <- 1

# build model
idw_temp_2 <- gstat(
  formula = Temperature ~ 1, # intercept-only model
  data = dwd_east_sp,
  nmax = neighbors,
  set = list(idp = beta_2)
)

# interpolation
grid_east_germany_temp_2 <- predict(
  object = idw_temp_2,
  newdata = grid_east_germany
)
## [inverse distance weighted interpolation]
# masking
grid_east_germany_temp_2 <- mask(
  raster(grid_east_germany_temp_2),
  east_germany_states_sp
)

# plotting
par(mfrow = c(1, 2))

main_2 <- expression(paste("IDW Temperatur (°C), ", beta, " = 1"))
plot(grid_east_germany_temp_2, main = main_2, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")

main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")



Interpolation of rainfall data using IDW

In the previous section we learned all necessary ingredients to apply the IDW approach. Let us advance and compute the IDW interpolation for the mean annual rainfall in East Germany, using the dwd_east_sp data set.

Exercise: Calculate the IDW interpolation for the mean annual rainfall in East Germany.

## Your code here...
Show code
# set parameters
neighbors <- length(dwd_east_sp)
beta <- 2

# build model
idw_rain <- gstat(
  formula = Rainfall ~ 1, # intercept only model
  data = dwd_east_sp,
  nmax = neighbors,
  set = list(idp = beta)
)

# interpolation
grid_east_germany_rain <- predict(
  object = idw_rain,
  newdata = grid_east_germany,
  debug.level = 0
)
# masking
grid_east_germany_rain <- mask(raster(grid_east_germany_rain), east_germany_states_sp)

# plotting
library(RColorBrewer)
main <- expression(paste("IDW Rainfall (mm)"))
plot(grid_east_germany_rain, main = main, col = brewer.pal(n = 7, name = "YlGnBu"))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "red")



Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follow: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.