In the subsequent sections we work with weather data provided by the Deutscher Wetterdienst (DWD), the German Weather Service. We provide a preprocessed data set of DWD weather stations located across Germany. The data was downloaded from the DWD data portal on October 05, 2022. You may find a detailed description of the data set here. Please note that, for the purpose of this tutorial, the data set was preprocessed and its columns were renamed.
You may download the DWD.csv file here.
We import the data set and assign a proper name to it.
dwd <- read.csv(
  "https://userpage.fu-berlin.de/soga/data/raw-data/DWD.csv",
  encoding = "latin1"
)
The dwd data set consists of 599 rows, each representing a particular weather station in Germany, and 22 columns, each corresponding to a variable or feature of that weather station. These self-explanatory variables are: ID, DWD_ID, STATION_NAME, FEDERAL_STATE, LAT, LON, ALTITUDE, PERIOD, RECORD_LENGTH, MEAN_ANNUAL_AIR_TEMP, MEAN_MONTHLY_MAX_TEMP, MEAN_MONTHLY_MIN_TEMP, MEAN_ANNUAL_WIND_SPEED, MEAN_CLOUD_COVER, MEAN_ANNUAL_SUNSHINE, MEAN_ANNUAL_RAINFALL, MAX_MONTHLY_WIND_SPEED, MAX_AIR_TEMP, MAX_WIND_SPEED, MAX_RAINFALL, MIN_AIR_TEMP, MEAN_RANGE_AIR_TEMP.
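To check such dimensions yourself, dim(), names() and str() can be applied to an imported data frame. The following is a minimal sketch on a small made-up CSV snippet; the values and the object names csv_text and toy are purely illustrative and are not taken from the DWD file, so the example runs without a network connection.

```r
# Toy CSV that reuses a few of the DWD column names;
# the values are invented for illustration only.
csv_text <- "ID,STATION_NAME,LAT,LON,ALTITUDE
1,A,52.5,13.4,40
2,B,48.1,11.6,520
3,C,53.6,10.0,15"

# read.csv() accepts the CSV content directly via the text argument
toy <- read.csv(text = csv_text)

dim(toy)    # number of rows and columns
names(toy)  # column names
str(toy)    # column types at a glance
```

Applied to dwd, dim(dwd) should report the 599 rows and 22 columns stated above.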
For the purpose of the tutorial we are only interested in the variables MEAN_ANNUAL_RAINFALL, MEAN_ANNUAL_AIR_TEMP and ALTITUDE, together with the station coordinates LAT and LON. Hence, we subset our data set to these variables. Further, we make sure that we exclude all rows with missing values.
dwd <- dwd[, c("LAT", "LON", "MEAN_ANNUAL_AIR_TEMP", "MEAN_ANNUAL_RAINFALL", "ALTITUDE")]
dwd <- dwd[complete.cases(dwd), ]
nrow(dwd)
## [1] 585
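As an aside, the effect of complete.cases() is easiest to see on a small example. The sketch below uses an invented data frame (the object names toy, keep and toy_clean and all values are assumptions made for illustration) and also counts the missing values per column, which can help to see which variable causes rows to be dropped.

```r
# Toy data frame with missing values (NA); values are invented.
toy <- data.frame(
  LAT      = c(52.5, 48.1, 53.6, 50.9),
  LON      = c(13.4, 11.6, 10.0, 6.9),
  ALTITUDE = c(40, NA, 15, 55),
  RAINFALL = c(590, 950, NA, 800)
)

# complete.cases() returns TRUE for rows that contain no NA
keep <- complete.cases(toy)
keep

# Keep only the complete rows
toy_clean <- toy[keep, ]
nrow(toy_clean)

# Count the missing values per column before dropping rows
colSums(is.na(toy))
```

In this toy case rows 2 and 3 each contain one NA, so two of the four rows remain.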
After cleaning up, 585 observations are left in the dwd data set. In the next step we create an sf (simple feature) object from the data set. Note that we pass the additional argument crs = 4326 to the function call, as the coordinates in the data set are given as geographic coordinates in decimal degrees. Thereafter, we transform the sf object into the ETRS89/LAEA coordinate reference system (European Terrestrial Reference System 1989/Lambert Azimuthal Equal-Area projection) by providing the EPSG identifier 3035. Finally, we convert the sf object to an sp object for later usage.
library(sf)
dwd_sf <- st_as_sf(dwd, coords = c("LON", "LAT"), crs = 4326)
# transform to ETRS89/LAEA Europe
dwd_sf <- st_transform(dwd_sf, 3035)
dwd_sp <- as(dwd_sf, "Spatial")
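A quick way to convince yourself that the transformation did what was intended is to inspect the CRS and the projected coordinates. The following is a minimal sketch on two invented points (the names pts, pts_sf and pts_laea are illustrative). The first point is placed at 10 E, 52 N, which is the projection origin of ETRS89/LAEA Europe; by the definition of EPSG:3035 it should map to approximately the false easting and northing of 4321000 m and 3210000 m.

```r
library(sf)

# Two illustrative points in geographic coordinates (lon/lat, EPSG:4326).
# The first one sits at the projection origin of ETRS89/LAEA Europe.
pts <- data.frame(
  NAME = c("origin", "other"),
  LON  = c(10, 13.4),
  LAT  = c(52, 52.5)
)

# Build the sf object and project it, mirroring the two steps above
pts_sf <- st_as_sf(pts, coords = c("LON", "LAT"), crs = 4326)
pts_laea <- st_transform(pts_sf, 3035)

st_crs(pts_laea)$epsg      # EPSG code of the transformed object
st_coordinates(pts_laea)   # projected coordinates in metres
```

The same two calls, st_crs() and st_coordinates(), can be applied to dwd_sf to verify that the station coordinates are now given in metres rather than decimal degrees.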
Before we continue we should remind ourselves that the data set we are working with has a spatial component. Basically, our observations are point measurements of rainfall spread across Germany. Let us plot a simple map to visualize the spatial distribution of our observations. For this we rely on the raster package and on the ggplot2 package (loaded here via tidyverse), as well as on the mapproj package, which coord_map() requires.
library(raster)
library(tidyverse)
library(mapproj)
# Retrieve the federal states with the getData() function from the raster package
G1 <- getData(country = "Germany", level = 1)
# plot the map
ggplot() +
  geom_polygon(
    data = G1,
    aes(x = long, y = lat, group = group),
    colour = "grey10", fill = "#fff7bc"
  ) +
  geom_point(
    data = dwd,
    aes(x = LON, y = LAT),
    alpha = .5,
    size = 1
  ) +
  theme_bw() +
  xlab("Longitude") +
  ylab("Latitude") +
  coord_map()
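As a variant of the map above, and not the approach used in this tutorial, an sf object can also be plotted directly with geom_sf() from ggplot2, which takes the geometry and the CRS from the object itself. The sketch below uses four invented station locations (pts, pts_sf and p are illustrative names) and omits the federal state polygons so that it runs without downloading any data.

```r
library(sf)
library(ggplot2)

# Illustrative station locations (invented values), lon/lat in EPSG:4326
pts <- data.frame(
  LON = c(13.4, 11.6, 10.0, 6.9),
  LAT = c(52.5, 48.1, 53.6, 50.9)
)

# Build the sf object and project it to ETRS89/LAEA Europe
pts_sf <- st_transform(
  st_as_sf(pts, coords = c("LON", "LAT"), crs = 4326),
  3035
)

# geom_sf() reads the geometry column and the CRS from the sf object
p <- ggplot(pts_sf) +
  geom_sf(alpha = .5, size = 1) +
  theme_bw()
p
```

With the tutorial data, ggplot(dwd_sf) + geom_sf() would follow the same pattern.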
For later usage we store the spatial object dwd_sp on disk.
save(file = "dwd_sp.RData", dwd_sp)
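To read such a file back in a later session, load() restores the saved object under its original name. The following is a minimal round-trip sketch with an invented data frame and a temporary file; the names toy, path, env and loaded_names are assumptions made for illustration.

```r
# Illustrative object and a temporary file path (both hypothetical)
toy <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
path <- file.path(tempdir(), "toy.RData")

# save() writes the object together with its name
save(toy, file = path)

# load() restores it under the same name; loading into a fresh
# environment avoids overwriting objects in the workspace
env <- new.env()
loaded_names <- load(path, envir = env)

loaded_names             # names of the restored objects
all.equal(env$toy, toy)  # check the round trip
```

For the file written above, load("dwd_sp.RData") would accordingly make dwd_sp available again in the workspace.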
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.