First, we load the required data sets and packages:

load(url("https://userpage.fu-berlin.de/soga/data/r-data/East_Germany.RData"))
library(raster)
library(gstat)

In this section we apply the inverse distance weighting (IDW) approach to interpolate weather data on the basis of measurements taken at 71 DWD weather stations in East Germany. The data set we use (dwd_east_sp) contains the mean annual rainfall and the mean annual temperature for the period from 1981–2010.

plot(dwd_east_sp)
## Please note that rgdal will be retired during October 2023,
## plan transition to sf/stars/terra functions using GDAL and PROJ
## at your earliest convenience.
## See https://r-spatial.org/r/2023/05/15/evolution4.html and https://github.com/r-spatial/evolution
## rgdal: version: 1.6-7, (SVN revision 1203)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.5.2, released 2022/09/02
## Path to GDAL shared files: C:/Users/johan/AppData/Local/R/cache/R/renv/cache/v5/R-4.2/x86_64-w64-mingw32/rgdal/1.6-7/10b777236c9e7855bc9dea8e347e30b7/rgdal/gdal
## GDAL binary built with GEOS: TRUE 
## Loaded PROJ runtime: Rel. 8.2.1, January 1st, 2022, [PJ_VERSION: 821]
## Path to PROJ shared files: C:/Users/johan/AppData/Local/R/cache/R/renv/cache/v5/R-4.2/x86_64-w64-mingw32/rgdal/1.6-7/10b777236c9e7855bc9dea8e347e30b7/rgdal/proj
## PROJ CDN enabled: FALSE
## Linking to sp version:1.6-1
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
plot(east_germany_states_sp, add = TRUE, border = "red")


Meshgrid for interpolation

In most real life use cases we do not interpolate one particular point in space, but we compute the values of our predictor variable on a regular grid. In our case we build a regular grid with a cell size of 10 km covering the area of East Germany.

In order to set the geographical extend of our grid we take the bounding box of the east.germany.sp spatial data set.

extent_east_germany <- extent(east_germany_states_sp)
extent_east_germany
## class      : Extent 
## xmin       : 137852.8 
## xmax       : 502916.9 
## ymin       : 5561109 
## ymax       : 6060829

Based on the extent of the boundary box we use the expand.grid() function to generate a data frame with x and y coordinates.

# set the grid cell size in meter
xh <- 10000

grid_east_germany <- expand.grid(
  x = seq(
    from = round(extent_east_germany@xmin),
    to = round(extent_east_germany@xmax),
    by = xh
  ),
  y = seq(
    from = round(extent_east_germany@ymin),
    to = round(extent_east_germany@ymax),
    by = xh
  )
)

head(grid_east_germany, 10)
##         x       y
## 1  137853 5561109
## 2  147853 5561109
## 3  157853 5561109
## 4  167853 5561109
## 5  177853 5561109
## 6  187853 5561109
## 7  197853 5561109
## 8  207853 5561109
## 9  217853 5561109
## 10 227853 5561109
tail(grid_east_germany, 10)
##           x       y
## 1841 407853 6051109
## 1842 417853 6051109
## 1843 427853 6051109
## 1844 437853 6051109
## 1845 447853 6051109
## 1846 457853 6051109
## 1847 467853 6051109
## 1848 477853 6051109
## 1849 487853 6051109
## 1850 497853 6051109

For a better understating we plot the generated grid points.

plot(grid_east_germany, asp = 1)

As our grid is an object of class data.frame,

class(grid_east_germany)
## [1] "data.frame"

we cast it into a SpatialPoints object by applying the coordinates() function.

coordinates(grid_east_germany) <- ~ x + y
class(grid_east_germany)
## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"

In the next step we assign a spatial reference system to our grid. Therefore we use the proj4string() function. We may either provide a Proj4 format string or simply provide the EPSG identifier. The proj4string() function returns the projection string of a spatial object, but it also allows us to alter the current spatial reference system. Hence, it is possible to extract the Proj4 format string from an object and assign it to another object.

Here, we extract the spatial reference system from the dwd_east_sp and assign it to our grid grid_east_germany by applying the proj4string() function on both objects.

proj4string(grid_east_germany) <- proj4string(dwd_east_sp)
head(grid_east_germany)
## class       : SpatialPoints 
## features    : 1 
## extent      : 137853, 137853, 5561109, 5561109  (xmin, xmax, ymin, ymax)
## crs         : +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs

Then we apply the gridded() function to cast the SpatialPoints to a SpatialPixels object.

gridded(grid_east_germany) <- TRUE
class(grid_east_germany)
## [1] "SpatialPixels"
## attr(,"package")
## [1] "sp"

Finally, we plot our regular grid of cell size 10 km, the administrative borders of the federal states within East Germany and the DWD weather stations.

plot(grid_east_germany,
  main = paste("Weather Stations in East Germany\n and Interpolation Grid"),
  col = "grey",
  cex.main = 0.9
)

plot(east_germany_states_sp, add = TRUE, border = "red")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "blue")


Interpolation of temperature data using IDW

In this section, we interpolate the temperature data given in dwd_east_sp using the IDW method. To apply this method, we use the gstat package. This package provides extensive capability for univariate and multivariate spatial and spatio-temporal geostatistical modelling, prediction and simulation. For an overview of the gstat package, see the official introductory document here.

The main function in the gstat package is the equivocally named function gstat(). The function returns an object of class gstat that holds all information to perform spatial interpolation. The gstat() function accepts numerous parameters (type ?gstat into your console for more details). In order to implement the IDW approach with the gstat() function we have to specify a formula (an object of the formula class) that defines the dependent and independent variables. In addition, we have to provide the data corresponding to the measured point data (e.g. a SpatialPointsDataFrame object), a list with the \(\beta\) parameter (set) and optionally, the number of nearest point observations (nmax) that should be used for the model. As an IDW model is an intercept-only model (that is a model with no independent variables), the formula in R is by convention specified by \(\sim 1\).

Consequently, we specify our IDW model as follows:

neighbors <- length(dwd_east_sp)
beta <- 2

idw_temp <- gstat(
  formula = Temperature ~ 1, # intercept-only model
  data = dwd_east_sp,
  nmax = neighbors,
  set = list(idp = beta)
)

Now that we have specified a gstat object, we use the generic predict() function to interpolate our predictor variable for each pixel in our SpatialPixelsDataFrame object grid_east_germany.

grid_east_germany_temp <- predict(
  object = idw_temp,
  newdata = grid_east_germany
)
## [inverse distance weighted interpolation]

The resulting SpatialPixelsDataFrame object can be plotted using the generic plot() command. We add additional layers by calling the plot() command again, but now adding the optional argument add = TRUE to the function call.

library(colorRamps)
plot(grid_east_germany_temp,
  col = matlab.like(700),
  main = "IDW Temperatur (°C)"
)
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")

In order to clip the SpatialPixelsDataFrame object grid_east_germany_temp, we first create a RasterLayer object, using the raster() function, and then we apply the mask() function.

# masking
grid_east_germany_temp <- mask(
  raster(grid_east_germany_temp),
  east_germany_states_sp
)

We plot the clipped interpolation raster.

# plotting
main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")

Exercise: Repeat the interpolation of the temperature data with \(\beta = 1\) and compare the results. Which choice of \(\beta\) leads to more plausible results?

## Your code here...
Show code
# set parameters
neighbors <- length(dwd_east_sp)
beta_2 <- 1

# build model
idw_temp_2 <- gstat(
  formula = Temperature ~ 1, # intercept-only model
  data = dwd_east_sp,
  nmax = neighbors,
  set = list(idp = beta_2)
)

# interpolation
grid_east_germany_temp_2 <- predict(
  object = idw_temp_2,
  newdata = grid_east_germany
)
## [inverse distance weighted interpolation]
# masking
grid_east_germany_temp_2 <- mask(
  raster(grid_east_germany_temp_2),
  east_germany_states_sp
)

# plotting
par(mfrow = c(1, 2))

main_2 <- expression(paste("IDW Temperatur (°C), ", beta, " = 1"))
plot(grid_east_germany_temp_2, main = main_2, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")

main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")



Interpolation of rainfall data using IDW

In the previous section we learned all necessary ingredients to apply the IDW approach. Let us advance and compute the IDW interpolation for the mean annual rainfall in East Germany, using the dwd_east_sp data set.

Exercise: Calculate the IDW interpolation for the mean annual rainfall data in East Germany.

## Your code here...
Show code
# set parameters
neighbors <- length(dwd_east_sp)
beta <- 2

# build model
idw_rain <- gstat(
  formula = Rainfall ~ 1, # intercept only model
  data = dwd_east_sp,
  nmax = neighbors,
  set = list(idp = beta)
)

# interpolation
grid_east_germany_rain <- predict(
  object = idw_rain,
  newdata = grid_east_germany,
  debug.level = 0
)
# masking
grid_east_germany_rain <- mask(raster(grid_east_germany_rain), east_germany_states_sp)

# plotting
library(RColorBrewer)
main <- expression(paste("IDW Rainfall (mm)"))
plot(grid_east_germany_rain, main = main, col = brewer.pal(n = 7, name = "YlGnBu"))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "red")



Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follow: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.