First, we load the required data sets and packages:
load(url("https://userpage.fu-berlin.de/soga/data/r-data/East_Germany.RData"))
library(raster)
library(gstat)
In this section we apply the inverse distance weighting (IDW) approach
to interpolate weather data on the basis of measurements taken at 71 DWD
weather stations in East Germany. The data set we use
(dwd_east_sp
) contains the mean annual rainfall and the
mean annual temperature for the period from 1981–2010.
plot(dwd_east_sp)
## Please note that rgdal will be retired during October 2023,
## plan transition to sf/stars/terra functions using GDAL and PROJ
## at your earliest convenience.
## See https://r-spatial.org/r/2023/05/15/evolution4.html and https://github.com/r-spatial/evolution
## rgdal: version: 1.6-7, (SVN revision 1203)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.5.2, released 2022/09/02
## Path to GDAL shared files: C:/Users/johan/AppData/Local/R/cache/R/renv/cache/v5/R-4.2/x86_64-w64-mingw32/rgdal/1.6-7/10b777236c9e7855bc9dea8e347e30b7/rgdal/gdal
## GDAL binary built with GEOS: TRUE
## Loaded PROJ runtime: Rel. 8.2.1, January 1st, 2022, [PJ_VERSION: 821]
## Path to PROJ shared files: C:/Users/johan/AppData/Local/R/cache/R/renv/cache/v5/R-4.2/x86_64-w64-mingw32/rgdal/1.6-7/10b777236c9e7855bc9dea8e347e30b7/rgdal/proj
## PROJ CDN enabled: FALSE
## Linking to sp version:1.6-1
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
plot(east_germany_states_sp, add = TRUE, border = "red")
In most real life use cases we do not interpolate one particular point in space, but we compute the values of our predictor variable on a regular grid. In our case we build a regular grid with a cell size of 10 km covering the area of East Germany.
In order to set the geographical extend of our grid we take the
bounding box of the east.germany.sp
spatial data set.
extent_east_germany <- extent(east_germany_states_sp)
extent_east_germany
## class : Extent
## xmin : 137852.8
## xmax : 502916.9
## ymin : 5561109
## ymax : 6060829
Based on the extent of the boundary box we use the
expand.grid()
function to generate a data frame with
x
and y
coordinates.
# set the grid cell size in meter
xh <- 10000
grid_east_germany <- expand.grid(
x = seq(
from = round(extent_east_germany@xmin),
to = round(extent_east_germany@xmax),
by = xh
),
y = seq(
from = round(extent_east_germany@ymin),
to = round(extent_east_germany@ymax),
by = xh
)
)
head(grid_east_germany, 10)
## x y
## 1 137853 5561109
## 2 147853 5561109
## 3 157853 5561109
## 4 167853 5561109
## 5 177853 5561109
## 6 187853 5561109
## 7 197853 5561109
## 8 207853 5561109
## 9 217853 5561109
## 10 227853 5561109
tail(grid_east_germany, 10)
## x y
## 1841 407853 6051109
## 1842 417853 6051109
## 1843 427853 6051109
## 1844 437853 6051109
## 1845 447853 6051109
## 1846 457853 6051109
## 1847 467853 6051109
## 1848 477853 6051109
## 1849 487853 6051109
## 1850 497853 6051109
For a better understating we plot the generated grid points.
plot(grid_east_germany, asp = 1)
As our grid is an object of class data.frame
,
class(grid_east_germany)
## [1] "data.frame"
we cast it into a SpatialPoints
object by applying the
coordinates()
function.
coordinates(grid_east_germany) <- ~ x + y
class(grid_east_germany)
## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"
In the next step we assign a spatial reference system to our grid. Therefore we
use the proj4string()
function. We may either provide a
Proj4 format string or simply provide the EPSG identifier.
The proj4string()
function returns the projection string of
a spatial object, but it also allows us to alter the current spatial
reference system. Hence, it is possible to extract the Proj4
format string from an object and assign it to another object.
Here, we extract the spatial reference system from the
dwd_east_sp
and assign it to our grid
grid_east_germany
by applying the
proj4string()
function on both objects.
proj4string(grid_east_germany) <- proj4string(dwd_east_sp)
head(grid_east_germany)
## class : SpatialPoints
## features : 1
## extent : 137853, 137853, 5561109, 5561109 (xmin, xmax, ymin, ymax)
## crs : +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs
Then we apply the gridded()
function to cast the
SpatialPoints
to a SpatialPixels
object.
gridded(grid_east_germany) <- TRUE
class(grid_east_germany)
## [1] "SpatialPixels"
## attr(,"package")
## [1] "sp"
Finally, we plot our regular grid of cell size 10 km, the administrative borders of the federal states within East Germany and the DWD weather stations.
plot(grid_east_germany,
main = paste("Weather Stations in East Germany\n and Interpolation Grid"),
col = "grey",
cex.main = 0.9
)
plot(east_germany_states_sp, add = TRUE, border = "red")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "blue")
In this section, we interpolate the temperature data given in
dwd_east_sp
using the IDW method. To apply this method, we
use the gstat
package. This package provides
extensive capability for univariate and multivariate spatial and
spatio-temporal geostatistical modelling, prediction and simulation. For
an overview of the gstat
package, see the official
introductory document here.
The main function in the gstat
package is the
equivocally named function gstat()
. The function returns an
object of class gstat
that holds all information to perform
spatial interpolation. The gstat()
function accepts
numerous parameters (type ?gstat
into your console for more
details). In order to implement the IDW approach with the
gstat()
function we have to specify a formula
(an object of the formula
class) that defines the dependent
and independent variables. In addition, we have to provide the
data
corresponding to the measured point data (e.g. a
SpatialPointsDataFrame
object), a list with the \(\beta\) parameter (set
) and
optionally, the number of nearest point observations (nmax
)
that should be used for the model. As an IDW model is an intercept-only
model (that is a model with no independent variables), the formula in R
is by convention specified by \(\sim
1\).
Consequently, we specify our IDW model as follows:
neighbors <- length(dwd_east_sp)
beta <- 2
idw_temp <- gstat(
formula = Temperature ~ 1, # intercept-only model
data = dwd_east_sp,
nmax = neighbors,
set = list(idp = beta)
)
Now that we have specified a gstat
object, we use the
generic predict()
function to interpolate our predictor
variable for each pixel in our SpatialPixelsDataFrame
object grid_east_germany
.
grid_east_germany_temp <- predict(
object = idw_temp,
newdata = grid_east_germany
)
## [inverse distance weighted interpolation]
The resulting SpatialPixelsDataFrame
object can be
plotted using the generic plot()
command. We add additional
layers by calling the plot()
command again, but now adding
the optional argument add = TRUE
to the function call.
library(colorRamps)
plot(grid_east_germany_temp,
col = matlab.like(700),
main = "IDW Temperatur (°C)"
)
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
In order to clip the SpatialPixelsDataFrame
object
grid_east_germany_temp
, we first create a
RasterLayer
object, using the raster()
function, and then we apply the mask()
function.
# masking
grid_east_germany_temp <- mask(
raster(grid_east_germany_temp),
east_germany_states_sp
)
We plot the clipped interpolation raster.
# plotting
main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
Exercise: Repeat the interpolation of the temperature data with \(\beta = 1\) and compare the results. Which choice of \(\beta\) leads to more plausible results?
## Your code here...
# set parameters
neighbors <- length(dwd_east_sp)
beta_2 <- 1
# build model
idw_temp_2 <- gstat(
formula = Temperature ~ 1, # intercept-only model
data = dwd_east_sp,
nmax = neighbors,
set = list(idp = beta_2)
)
# interpolation
grid_east_germany_temp_2 <- predict(
object = idw_temp_2,
newdata = grid_east_germany
)
## [inverse distance weighted interpolation]
# masking
grid_east_germany_temp_2 <- mask(
raster(grid_east_germany_temp_2),
east_germany_states_sp
)
# plotting
par(mfrow = c(1, 2))
main_2 <- expression(paste("IDW Temperatur (°C), ", beta, " = 1"))
plot(grid_east_germany_temp_2, main = main_2, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
In the previous section we learned all necessary ingredients to apply
the IDW approach. Let us advance and compute the IDW interpolation for
the mean annual rainfall in East Germany, using the
dwd_east_sp
data set.
Exercise: Calculate the IDW interpolation for the mean annual rainfall data in East Germany.
## Your code here...
# set parameters
neighbors <- length(dwd_east_sp)
beta <- 2
# build model
idw_rain <- gstat(
formula = Rainfall ~ 1, # intercept only model
data = dwd_east_sp,
nmax = neighbors,
set = list(idp = beta)
)
# interpolation
grid_east_germany_rain <- predict(
object = idw_rain,
newdata = grid_east_germany,
debug.level = 0
)
# masking
grid_east_germany_rain <- mask(raster(grid_east_germany_rain), east_germany_states_sp)
# plotting
library(RColorBrewer)
main <- expression(paste("IDW Rainfall (mm)"))
plot(grid_east_germany_rain, main = main, col = brewer.pal(n = 7, name = "YlGnBu"))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "red")
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follow: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.