In the following sections we work with weather data provided by the Deutscher Wetterdienst (DWD), the German Weather Service. We provide a preprocessed data set of DWD weather stations located across Germany. The data was downloaded from the DWD data portal on October 5, 2022. You may find a detailed description of the data set here. Please note that, for the purpose of this tutorial, the data set was preprocessed and its columns were renamed.

You may download the DWD.csv file here. We import the data set and assign it to the variable dwd.

dwd <- read.csv("https://userpage.fu-berlin.de/soga/data/raw-data/DWD.csv",
  encoding = "latin1"
)

The dwd data set consists of 599 rows, each representing a particular weather station in Germany, and 22 columns, each corresponding to a variable or feature of that weather station. These self-explanatory variables are: ID, DWD_ID, STATION_NAME, FEDERAL_STATE, LAT, LON, ALTITUDE, PERIOD, RECORD_LENGTH, MEAN_ANNUAL_AIR_TEMP, MEAN_MONTHLY_MAX_TEMP, MEAN_MONTHLY_MIN_TEMP, MEAN_ANNUAL_WIND_SPEED, MEAN_CLOUD_COVER, MEAN_ANNUAL_SUNSHINE, MEAN_ANNUAL_RAINFALL, MAX_MONTHLY_WIND_SPEED, MAX_AIR_TEMP, MAX_WIND_SPEED, MAX_RAINFALL, MIN_AIR_TEMP, MEAN_RANGE_AIR_TEMP.

For the purpose of the tutorial we are only interested in the variables MEAN_ANNUAL_RAINFALL, MEAN_ANNUAL_AIR_TEMP and ALTITUDE, together with the station coordinates LAT and LON. Hence, we subset our data set to these five columns. Further, we exclude all rows that contain missing values.

dwd <- dwd[, c("LAT", "LON", "MEAN_ANNUAL_AIR_TEMP", "MEAN_ANNUAL_RAINFALL", "ALTITUDE")]
dwd <- dwd[complete.cases(dwd), ]
nrow(dwd)
## [1] 585
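
To illustrate what complete.cases() does in the step above, here is a minimal sketch on a small made-up data frame. The station values are hypothetical and not taken from the DWD data; only the column names follow the tutorial.

```r
# Toy data frame with hypothetical values (not DWD data)
toy <- data.frame(
  LAT = c(52.5, 48.1, 53.6, 50.9),
  LON = c(13.4, 11.6, 10.0, 6.9),
  MEAN_ANNUAL_AIR_TEMP = c(9.5, NA, 9.0, 10.3),
  MEAN_ANNUAL_RAINFALL = c(590, 950, NA, 800),
  ALTITUDE = c(40, 520, 11, 55)
)

# complete.cases() returns TRUE for rows without any missing value
keep <- complete.cases(toy)
print(keep)

# Keep only the complete rows
toy_clean <- toy[keep, ]
print(nrow(toy_clean))

# Number of rows dropped because of missing values
print(sum(!keep))
```

Applied to the tutorial data, the same logic removes 599 - 585 = 14 stations that have a missing value in at least one of the five selected columns.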

After cleaning up, 585 observations are left in our data set. In the next step we create an sf (simple features) object from the data set. Note that we pass the additional argument crs = 4326 to the function call, as the coordinates in the data set are given as geographic coordinates in decimal degrees. We then transform the sf object into the ETRS89/LAEA coordinate reference system (European Terrestrial Reference System 1989/Lambert Azimuthal Equal-Area projection) by providing the EPSG identifier 3035. Finally, we convert the sf object to an sp object for later usage.

library(sf)
dwd_sf <- st_as_sf(dwd, coords = c("LON", "LAT"), crs = 4326)
# transform to ETRS89/LAEA Europe
dwd_sf <- st_transform(dwd_sf, 3035)
dwd_sp <- as(dwd_sf, "Spatial")
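
As a quick sanity check of the re-projection step, the following sketch builds an sf object from two made-up points and inspects the coordinate reference system before and after st_transform(). The point names and coordinates are assumptions chosen for illustration (roughly Berlin and Munich); they are not part of the DWD data set.

```r
library(sf)

# Two hypothetical points in decimal degrees (roughly Berlin and Munich)
pts <- data.frame(
  NAME = c("A", "B"),
  LON = c(13.40, 11.58),
  LAT = c(52.52, 48.14)
)

# Build an sf object; coords are given as x (longitude), then y (latitude)
pts_sf <- st_as_sf(pts, coords = c("LON", "LAT"), crs = 4326)
print(st_is_longlat(pts_sf)) # is the object in geographic coordinates?

# Project to ETRS89/LAEA Europe (EPSG:3035); coordinates are now in metres
pts_laea <- st_transform(pts_sf, 3035)
print(st_crs(pts_laea)$epsg)
print(st_is_longlat(pts_laea))

# Projected coordinates as a plain matrix with columns X and Y
xy <- st_coordinates(pts_laea)
print(xy)
```

The same checks, st_crs() and st_is_longlat(), can be run on dwd_sf to confirm that the transformation was applied before converting to sp.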

Before we continue we should remind ourselves that the data set we are working with has a spatial component. Basically, our observations are point measurements spread across Germany. Let us plot a simple map to visualize the spatial distribution of our observations. For this we rely on the raster package and on the ggplot2 package (loaded as part of the tidyverse); the coord_map() function additionally requires the mapproj package.

library(raster)
library(tidyverse)
library(mapproj)
# Retrieve the federal states with the getData() function from the raster package
# Note: recent raster versions deprecate getData() in favour of the geodata package
G1 <- getData(country = "Germany", level = 1)

# plot the map
ggplot() +
  geom_polygon(
    data = G1,
    aes(x = long, y = lat, group = group),
    colour = "grey10", fill = "#fff7bc"
  ) +
  geom_point(
    data = dwd,
    aes(x = LON, y = LAT),
    alpha = .5,
    size = 1
  ) +
  theme_bw() +
  xlab("Longitude") +
  ylab("Latitude") +
  coord_map()

For later usage we store the spatial object dwd_sp on disk.

save(dwd_sp, file = "dwd_sp.RData")
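
For reference, a file written with save() can be read back with load(), which restores the objects under their original names. The sketch below shows the round trip with a small stand-in object and a temporary file; toy_obj and the file path are hypothetical, and in the tutorial the object would be dwd_sp and the file dwd_sp.RData.

```r
# Hypothetical stand-in object; in the tutorial this would be dwd_sp
toy_obj <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# Write to a temporary file so that no file in the working directory is touched
path <- tempfile(fileext = ".RData")
save(toy_obj, file = path)

# load() restores objects under their original names; we load into a fresh
# environment to avoid overwriting anything in the workspace
env <- new.env()
restored_names <- load(path, envir = env)
print(restored_names)

# Compare the restored object with the one we saved
print(identical(env$toy_obj, toy_obj))
```

Because load() returns the names of the restored objects, it is easy to check what a given .RData file contains before using it.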

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.