First, we load the required data sets and packages:
load(url("https://userpage.fu-berlin.de/soga/data/r-data/East_Germany.RData"))
library(raster)
library(gstat)
In this section, we apply the inverse distance weighting (IDW) approach
to interpolate weather data based on measurements from 71 DWD weather
stations located in East Germany. The data set we use
(dwd_east_sp
) icludes the mean annual rainfall and the mean
annual temperature over the period 1981–2010.
plot(dwd_east_sp)
plot(east_germany_states_sp, add = TRUE, border = "red")
In most real-world applications, we do not interpolate one particular point in space, but we compute the values of our predictor variable on a regular grid. In our example, we build a regular grid with a cell size of 10 km covering the area of East Germany.
In order to set the geographical extend of our grid, we take the
bounding box of the east.germany.sp
spatial data set.
extent_east_germany <- extent(east_germany_states_sp)
extent_east_germany
## class : Extent
## xmin : 137852.8
## xmax : 502916.9
## ymin : 5561109
## ymax : 6060829
Based on the extent of the boundary box, we use the
expand.grid()
function to generate a data frame with
x
and y
coordinates.
# set the grid cell size in meter
xh <- 10000
grid_east_germany <- expand.grid(
x = seq(
from = round(extent_east_germany@xmin),
to = round(extent_east_germany@xmax),
by = xh
),
y = seq(
from = round(extent_east_germany@ymin),
to = round(extent_east_germany@ymax),
by = xh
)
)
head(grid_east_germany, 10)
## x y
## 1 137853 5561109
## 2 147853 5561109
## 3 157853 5561109
## 4 167853 5561109
## 5 177853 5561109
## 6 187853 5561109
## 7 197853 5561109
## 8 207853 5561109
## 9 217853 5561109
## 10 227853 5561109
tail(grid_east_germany, 10)
## x y
## 1841 407853 6051109
## 1842 417853 6051109
## 1843 427853 6051109
## 1844 437853 6051109
## 1845 447853 6051109
## 1846 457853 6051109
## 1847 467853 6051109
## 1848 477853 6051109
## 1849 487853 6051109
## 1850 497853 6051109
For better understating, we plot the generated grid points.
plot(grid_east_germany, asp = 1)
Since our grid is an object of the class data.frame
,
class(grid_east_germany)
## [1] "data.frame"
we convert it into a SpatialPoints
object using the
coordinates()
function.
coordinates(grid_east_germany) <- ~ x + y
class(grid_east_germany)
## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"
In the next step, we assign a spatial reference system to our grid. For this, we
use the proj4string()
function. We may either provide a
Proj4 format string or simply provide the EPSG identifier.
While proj4string()
can return the projection string of a
spatial object, it also allows us to modify the current spatial
reference system. This makes it is possible to extract the
Proj4 format string from an object and assign it to another
object.
Here, we extract the spatial reference system from the
dwd_east_sp
and assign it to our grid
grid_east_germany
by applying the
proj4string()
function on both objects.
proj4string(grid_east_germany) <- proj4string(dwd_east_sp)
head(grid_east_germany)
## class : SpatialPoints
## features : 1
## extent : 137853, 137853, 5561109, 5561109 (xmin, xmax, ymin, ymax)
## crs : +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs
Then, we apply the gridded()
function to convert the
SpatialPoints
to a SpatialPixels
object.
gridded(grid_east_germany) <- TRUE
class(grid_east_germany)
## [1] "SpatialPixels"
## attr(,"package")
## [1] "sp"
Finally, we plot the regular 10 km grid together with the administrative borders of East Germany’s federal states and the locations of the DWD weather stations.
plot(grid_east_germany,
main = paste("Weather Stations in East Germany\n and Interpolation Grid"),
col = "grey",
cex.main = 0.9
)
plot(east_germany_states_sp, add = TRUE, border = "red")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "blue")
In this section, we interpolate the temperature data given in
dwd_east_sp
using the IDW method. To apply this method, we
use the gstat
package. This package provides
extensive capability for univariate and multivariate spatial and
spatio-temporal geostatistical modelling, prediction and simulation. For
an overview of the gstat
package, see the official
introductory document here.
The main function in the gstat
package is the
equivocally named function gstat()
. The function returns an
object of class gstat
that holds all information to perform
spatial interpolation. The gstat()
function accepts
numerous parameters (type ?gstat
into your console for more
details). In order to implement the IDW approach with the
gstat()
function we have to specify a formula
(an object of the formula
class) that defines the dependent
and independent variables. In addition, we have to provide the
data
corresponding to the measured point data (e.g. a
SpatialPointsDataFrame
object), a list with the \(\beta\) parameter (set
) and
optionally, the number of nearest point observations (nmax
)
that should be used for the model. Since IDW is an intercept-only model
(a model without independent variables), the formula in R is
onventionally written as \(\sim
1\).
Consequently, we specify our IDW model as follows:
neighbors <- length(dwd_east_sp)
beta <- 2
idw_temp <- gstat(
formula = Temperature ~ 1, # intercept-only model
data = dwd_east_sp,
nmax = neighbors,
set = list(idp = beta)
)
Having specified the gstat
object, we use the generic
predict()
function to interpolate our predictor variable
for each pixel in our SpatialPixelsDataFrame
object
grid_east_germany
.
grid_east_germany_temp <- predict(
object = idw_temp,
newdata = grid_east_germany
)
## [inverse distance weighted interpolation]
The resulting SpatialPixelsDataFrame
object can be
plotted using the generic plot()
command. We add additional
layers by calling the plot()
command again, but now adding
the optional argument add = TRUE
to the function call.
library(colorRamps)
plot(grid_east_germany_temp,
col = matlab.like(700),
main = "IDW Temperatur (°C)"
)
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
In order to clip the SpatialPixelsDataFrame
object
grid_east_germany_temp
, we first create a
RasterLayer
object using the raster()
function, and then we apply the mask()
function.
# masking
grid_east_germany_temp <- mask(
raster(grid_east_germany_temp),
east_germany_states_sp
)
We plot the clipped interpolation raster.
# plotting
main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
Exercise: Repeat the interpolation of the temperature data with \(\beta = 1\) and compare the results. Which choice of \(\beta\) leads to more plausible results?
## Your code here...
# set parameters
neighbors <- length(dwd_east_sp)
beta_2 <- 1
# build model
idw_temp_2 <- gstat(
formula = Temperature ~ 1, # intercept-only model
data = dwd_east_sp,
nmax = neighbors,
set = list(idp = beta_2)
)
# interpolation
grid_east_germany_temp_2 <- predict(
object = idw_temp_2,
newdata = grid_east_germany
)
## [inverse distance weighted interpolation]
# masking
grid_east_germany_temp_2 <- mask(
raster(grid_east_germany_temp_2),
east_germany_states_sp
)
# plotting
par(mfrow = c(1, 2))
main_2 <- expression(paste("IDW Temperatur (°C), ", beta, " = 1"))
plot(grid_east_germany_temp_2, main = main_2, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
main <- expression(paste("IDW Temperatur (°C), ", beta, " = 2"))
plot(grid_east_germany_temp, main = main, col = matlab.like(700))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "black")
In the previous section we learned all necessary ingredients to apply
the IDW approach. Let us advance and compute the IDW interpolation for
the mean annual rainfall in East Germany, using the
dwd_east_sp
data set.
Exercise: Calculate the IDW interpolation for the mean annual rainfall in East Germany.
## Your code here...
# set parameters
neighbors <- length(dwd_east_sp)
beta <- 2
# build model
idw_rain <- gstat(
formula = Rainfall ~ 1, # intercept only model
data = dwd_east_sp,
nmax = neighbors,
set = list(idp = beta)
)
# interpolation
grid_east_germany_rain <- predict(
object = idw_rain,
newdata = grid_east_germany,
debug.level = 0
)
# masking
grid_east_germany_rain <- mask(raster(grid_east_germany_rain), east_germany_states_sp)
# plotting
library(RColorBrewer)
main <- expression(paste("IDW Rainfall (mm)"))
plot(grid_east_germany_rain, main = main, col = brewer.pal(n = 7, name = "YlGnBu"))
plot(east_germany_states_sp, add = TRUE, border = "white")
plot(dwd_east_sp, add = TRUE, pch = 19, cex = 0.5, col = "red")
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follow: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.