R provides a rich environment for cartographic plotting.


Maps with ggplot2

The powerful ggplot2 library is capable of producing quite nice looking maps. In order to showcase these capacities we will visualize the location of weather stations maintained by Deutscher Wetterdienst (DWD) (German Weather Service). The data was downloaded from the DWD data portal on April 21, 2017.

First, before plotting any data we have to prepare our data set. Hence, we start by importing the data set and assign it a proper name. In this example we use the read.csv2() function for loading the data.

dwd <- read.csv2("https://userpage.fu-berlin.de/soga/data/raw-data/DWD.csv",
  stringsAsFactors = FALSE, sep = ","
)

The dwd data set consists of 599 rows, each of them representing a particular weather station in Germany, and 22 columns, each of them corresponding to a variable/feature related to that particular weather station. For the purpose of this tutorial we are only interested in the variables LAT and LON, corresponding to the geographic coordinates of each particular weather station in the data set, as well as the record length in years (RECORD.LENGTH). Hence, we subset our data set based on these variables. Further, we make sure that we exclude all missing values.

## subset data
dwd <- dwd[, c("LAT", "LON", "RECORD_LENGTH")]
## drop missing values
dwd <- dwd[complete.cases(dwd), ]

dwd$LAT <- as.numeric(dwd$LAT)
dwd$LON <- as.numeric(dwd$LON)
nrow(dwd)
## [1] 599
head(dwd)
##       LAT     LON RECORD_LENGTH
## 1 47.8413  8.8493            55
## 2 50.7827  6.0941           160
## 3 52.9335  8.2370            45
## 4 48.2156  8.9784            30
## 5 48.6159 13.0506            64
## 6 52.4853  7.9126            55

For the purpose of visualization we categorize the variable RECORD.LENGTH into five groups:

To achieve this task we apply the mutate() function provided by the dplyr package. If you get an error message, type install.packages('dplyr') to install the package.

library(dplyr)
dwd <- dwd %>%
  mutate(RECORD_LENGTH_CATEGORY = cut(RECORD_LENGTH,
    breaks = c(-Inf, 10, 30, 60, 90, Inf),
    labels = c(
      "very short (<10)",
      "short (10-30)",
      "middle (30-60)",
      "long (60-90)",
      "very long (>90)"
    )
  ))

Now we are ready to plot a map to visualize the spatial distribution of the DWD weather stations. For this we rely on the raster package and on the ggplot2 package.

library(raster)
library(ggplot2)

If you get an error message saying that one or both of the packages are not found, type install.packages(c('raster', 'ggplot2')) to install the packages.

Basically, we only use the getData() function from the raster package. This function provides us with geographic data for Germany. Type help(getData) into your R console for further details.

# Retrieve Federal States by the the getData() function from the raster package
germany <- getData(country = "Germany", level = 1)
## Warning in getData(country = "Germany", level = 1): getData will be removed in a future version of raster
## . Please use the geodata package instead
## Please note that rgdal will be retired during October 2023,
## plan transition to sf/stars/terra functions using GDAL and PROJ
## at your earliest convenience.
## See https://r-spatial.org/r/2023/05/15/evolution4.html and https://github.com/r-spatial/evolution
## rgdal: version: 1.6-7, (SVN revision 1203)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.6.2, released 2023/01/02
## Path to GDAL shared files: C:/Users/mceck/AppData/Local/R/cache/R/renv/cache/v5/R-4.3/x86_64-w64-mingw32/rgdal/1.6-7/10b777236c9e7855bc9dea8e347e30b7/rgdal/gdal
##  GDAL does not use iconv for recoding strings.
## GDAL binary built with GEOS: TRUE 
## Loaded PROJ runtime: Rel. 9.2.0, March 1st, 2023, [PJ_VERSION: 920]
## Path to PROJ shared files: C:/Users/mceck/AppData/Local/R/cache/R/renv/cache/v5/R-4.3/x86_64-w64-mingw32/rgdal/1.6-7/10b777236c9e7855bc9dea8e347e30b7/rgdal/proj
## PROJ CDN enabled: FALSE
## Linking to sp version:1.6-1
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.

In the next step we build a ggplot2() graph, layer by layer.

library(mapproj)

ggplot() +
  geom_polygon(
    data = germany,
    aes(x = long, y = lat, group = group),
    colour = "grey10", fill = "#fff7bc"
  ) +
  geom_point(
    data = dwd,
    aes(x = LON, y = LAT, col = RECORD_LENGTH_CATEGORY),
    alpha = .5,
    size = 1.5
  ) + 
  coord_map() +
  theme_bw() +
  xlab("Longitude") +
  ylab("Latitude") +
  labs(color = "Record length") +
  ggtitle("DWD Weather Stations")


ggmap package

Source: Kahle and Wickham (2013) ggmap : Spatial Visualization with ggplot2, The R Journal Vol. 5/1

The basic idea driving ggmap is to take a downloaded map image, plot it as a context layer using ggplot2 and then plot additional content layers of data, statistics or models on top of the map. In ggmap this process is broken down into two pieces:

  1. downloading the images and formatting them for plotting, done with get_map() or get_stamenmap(), and
  2. making the plot, done with ggmap().

Once we have installed the ggmap package using install.packages("ggmap") we can load it to use its functions

library(ggmap)

Get the map

One way to obtain a raster object for plotting with ggmap is the get_map() function. Since this function uses the Google API which requires registration with the Google Cloud Platform and obtaining an API key, we will not enlarge upon it here. Instead, we will use the function get_stamenmap() which accesses a tile server for Stamen Maps, based on data by OpenStreetMap.

First, we have to decide on the extent of our map. For the purpose of this tutorial we map the neighborhood of GeoCampus Lankwitz, Berlin, Germany. In order to retrieve the geographic coordinates of GeoCampus Lankwitz we apply the geocode_OSM() function from the tmaptools() package. All we have to do is to feed the function with a string of the postal address and it returns a pair of coordinates using a public API from the OpenStreetMap called Nominatim.

After installing the tmaptools() package using install.packages("tmaptools"), we retrieve the coordinates for the GeoCampus Lankwitz using the geocode_OSM() function. Note that we specifically save the output as a data frame, so that we can extract only the lat and lon coordinates easily.

library(tmaptools)

geocampus <- geocode_OSM(q = "Malteserstrasse 74-100, Berlin", as.data.frame = TRUE)
geocampus_latlon <- geocampus[, c("lat", "lon")]
geocampus_latlon
##       lat      lon
## 1 52.4262 13.35803

The function get_stamenmap() requires a bounding box in the format c(lowerleftlon, lowerleftlat, upperrightlon, upperrightlat) as an input. Thus, we need to create such a bounding box around the geographic coordinates of GeoCampus Lankwitz.

berlin_bb <- c(
  left = geocampus_latlon$lon - 0.05,
  bottom = geocampus_latlon$lat - 0.02,
  right = geocampus_latlon$lon + 0.05,
  top = geocampus_latlon$lat + 0.04
)

Then we download the actual map as a raster object by calling the get_stamenmap() function. The zoom argument controls the level of detail in the map. The larger the number, the greater the detail. You can also play around with the appearance of the map by for example specifying maptype = "watercolor".

Finally, we plot the map using the ggmap() function.

berlin <- get_stamenmap(bbox = berlin_bb, zoom = 14)

berlin_map <- ggmap(berlin, extent = "device")
berlin_map
Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.

Nice, there it is a map of the neighborhood of the GeoCampus Lankwitz.

Make the plot

The next steps aim to add content to the basemap above. We start by adding the actual location of GeoCampus Lankwitz, mark it with a red dot and annotate it.

p <- berlin_map +
  ## add point
  geom_point(aes(x = lon, y = lat),
    data = geocampus_latlon,
    col = "red",
    size = 2
  ) +
  ## add text
  geom_text(
    x = geocampus_latlon$lon + 0.005, # add a small number
    y = geocampus_latlon$lat + 0.0015, # add a small number
    label = "GeoCampus",
    size = 2
  )
p
Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.

Looks good!

Now we add even more geographic content. We read in an ESRI shapefile with point data information. The shapefile osm_pois_p.shp stores points of interest in Berlin provided by OpenStreetMap (OSM). The data was downloaded from GEOFABRIK on June 25, 2017, and contains OpenStreetMap data as of June 22, 2017 (see here).

First we download the zipped shapefile to a temporary folder and extract it afterwards.

# build a temporary folder on disk
temp <- tempfile()
download_url <- "https://userpage.fu-berlin.de/soga/data/raw-data/spatial/"
zipfile <- "osm_pois_p.zip"

## download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
## unzip the file(s)
unzip(temp)
## close file connection
unlink(temp)

After downloading and unzipping of the file, we open it by calling the readOGR() function from the rgdal package.

library(rgdal)
## read in the data
pois <- readOGR(
  dsn = ".", "osm_pois_p",
  use_iconv = TRUE, # enable encoding
  encoding = "UTF-8"
)

Finally, we use the fortify() function from the ggfortify package to transform the shapefile into a data.frame object.

library(ggfortify)
pois_df <- fortify(pois)
class(pois_df)
## [1] "data.frame"

The data set pois_df is quite large. It has 49091 rows (observations) and 6 feature columns.

nrow(pois_df)
## [1] 49091
ncol(pois_df)
## [1] 6
head(pois_df)
##     osm_id code              fclass       name     long      lat
## 1 16541597 2907 camera_surveillance       Aral 13.34544 52.54644
## 2 21677342 2701        tourist_info       <NA> 13.37086 52.51940
## 3 26735749 2301          restaurant       Aida 13.32283 52.50691
## 4 26735753 2006           telephone       <NA> 13.32212 52.50647
## 5 26735759 2301          restaurant Madame Ngo 13.31809 52.50621
## 6 26735763 2301          restaurant     Sakana 13.32078 52.50734

For now we are only interested in the variable fclass, which refers to the category of the points in space.

unique(pois_df$fclass)
##   [1] "camera_surveillance" "tourist_info"        "restaurant"         
##   [4] "telephone"           "biergarten"          "recycling_glass"    
##   [7] "kiosk"               "memorial"            "fire_station"       
##  [10] "bank"                "post_office"         "toilet"             
##  [13] "cinema"              "library"             "police"             
##  [16] "hospital"            "post_box"            "chemist"            
##  [19] "hotel"               "motel"               "arts_centre"        
##  [22] "cafe"                "fast_food"           "optician"           
##  [25] "convenience"         "car_sharing"         "shelter"            
##  [28] "viewpoint"           "attraction"          "pub"                
##  [31] "supermarket"         "bakery"              "sports_shop"        
##  [34] "pharmacy"            "atm"                 "playground"         
##  [37] "theatre"             "school"              "car_rental"         
##  [40] "gift_shop"           "museum"              "guesthouse"         
##  [43] "beverages"           "bar"                 "travel_agent"       
##  [46] "embassy"             "butcher"             "pitch"              
##  [49] "stationery"          "outdoor_shop"        "kindergarten"       
##  [52] "fountain"            "university"          "monument"           
##  [55] "beauty_shop"         "sports_centre"       "hostel"             
##  [58] "tower"               "bench"               "waste_basket"       
##  [61] "recycling_clothes"   "artwork"             "nightclub"          
##  [64] "windmill"            "drinking_water"      "computer_shop"      
##  [67] "furniture_shop"      "recycling"           "community_centre"   
##  [70] "bicycle_shop"        "car_dealership"      "vending_any"        
##  [73] "public_building"     "bookshop"            "florist"            
##  [76] "doityourself"        "laundry"             "park"               
##  [79] "comms_tower"         "hairdresser"         "clothes"            
##  [82] "mobile_phone_shop"   "bicycle_rental"      "video_shop"         
##  [85] "picnic_site"         "jeweller"            "college"            
##  [88] "veterinary"          "dentist"             "water_tower"        
##  [91] "hunting_stand"       "vending_machine"     "swimming_pool"      
##  [94] "track"               "newsagent"           "shoe_shop"          
##  [97] "mall"                "doctors"             "vending_cigarette"  
## [100] "courthouse"          "greengrocer"         "town_hall"          
## [103] "toy_shop"            "nursing_home"        "recycling_paper"    
## [106] "garden_centre"       "department_store"    "vending_parking"    
## [109] "theme_park"          "car_wash"            "graveyard"          
## [112] "water_well"          "water_works"         "chalet"             
## [115] "ruins"               "prison"              "zoo"                
## [118] "battlefield"         "stadium"             "archaeological"     
## [121] "camp_site"           "observation_tower"   "wayside_shrine"     
## [124] "alpine_hut"          "castle"              "food_court"         
## [127] "dog_park"            "water_mill"          "caravan_site"       
## [130] "golf_course"

In order to shrink the volume of the data set we restrict the entries to be either of type biergarten (German for open-air pub) or fast_food.

selection <- pois_df[pois_df$fclass %in% c("biergarten", "fast_food"), ]

Now let us plot our selection, by adding a geom_point() to our graph p.

p <- p +
  geom_point(aes(x = long, y = lat, colour = fclass),
    data = selection
  )
p
Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.

Interestingly, there are quite a few locations in the near distance, however, very close to GeoCampus Lankwitz there are very few.

Okay, then where to? Should we head North, South, East or West?

Well, let us make a density map which shows where the entries for our selection (biergarten and fast_food) are somehow clustered. ggplot2 offers the stat_density2d() function, which performs a 2D kernel density estimation and displays the results with contours.

p +
  stat_density2d(aes(x = long, y = lat),
    bins = 2.6, alpha = 0.3,
    data = selection, geom = "polygon"
  )
Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.

Well, I guess we should probably head north-west!


mapview package

The mapview package provides functions to very quickly and conveniently create interactive visualizations of spatial data. It allows quick interactive plotting to examine and visually investigate both aspects of spatial data, the geometries and their attributes. Visit the project’s website for more information and a quick tour through the functionality of this awesome package.

Our goal is to localize GeoCampus Lankwitz, Berlin, Germany and plot the location onto an interactive map.

First, we have to determine the geographical location of GeoCampus Lankwitz. Therefore, we make use of the geocode_OSM() function in the tmaptools package. This function allows us to geocode a location (find its latitude and longitude) using OpenStreetMap Nominatim. If you have not already, install the tmaptools package using install.packages("tmaptools").

Let us retrieve the coordinates for the GeoCampus Lankwitz using the geocode_OSM() function as a data frame. We then extract only the lat and lon coordinates from the result.

library(tmaptools)

geocampus <- geocode_OSM(q = "Malteserstrasse 74-100, Berlin", as.data.frame = TRUE)
geocampus_latlon <- geocampus[, c("lat", "lon")]
geocampus_latlon
##       lat      lon
## 1 52.4262 13.35803

Nice!

In the next step we transform the geocampus date frame object into a simple feature object. Therefore we apply the st_as_sf() function from the sf package. In addition we provide that object with a spatial coordinate reference system. In our example we use the EPSG identifier \(4326\), which corresponds to the World Geodetic System (WGS84).

library(sf)
geocampus_sf <- st_as_sf(
  x = geocampus_latlon, coords = c("lon", "lat"),
  crs = 4326
)
geocampus_sf
## Simple feature collection with 1 feature and 0 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 13.35803 ymin: 52.4262 xmax: 13.35803 ymax: 52.4262
## Geodetic CRS:  WGS 84
##                   geometry
## 1 POINT (13.35803 52.4262)

Once we have a spatial feature it is very easy to plot it using the mapview() function from the same-titled mapview package. Note that the mapview() function has many additional arguments (type help(mapview) into the R console), allowing for customization. Additionally, we can use the leafpop package to embed tables, images or graphs in pop-ups of our interactive map (Feel free to scroll over the blue marker in the map and click on it!).

library(mapview)
img <- "http://www.osa.fu-berlin.de/geographie_bsc/_medien/bilder_geocampus/bild_010/010_546.jpeg"

mapview(geocampus_sf,
  map.types = "Esri.WorldImagery",
  popup = leafpop:::popupIframe(img),
  label = "GeoCampus Lankwitz"
)

leaflet package

Another great package for producing interactive maps in R is the leaflet package. For more information and a thorough introduction we recommend to review its website Leaflet for R.

We will reproduce the same example as we did above using the mapview package.

In order to get started we install the leaflet package by typing install.packages("leaflet"). Again, we retrieve the coordinates for the GeoCampus Lankwitz using the geocode_OSM() function from the tmaptools package. In this case the extracted lat and lon coordinates are already enough to plot the map and we do not need to convert them to a spatial feature.

library(tmaptools)

geocampus <- geocode_OSM(q = "Malteserstrasse 74-100, Berlin", as.data.frame = TRUE)
geocampus_latlon <- geocampus[, c("lat", "lon")]
geocampus_latlon
##       lat      lon
## 1 52.4262 13.35803

We then create a map widget using the leaflet() function. Note that leaflet maps are commonly produced using the pipe operator %>% of the magrittr package. This makes our code much more readable and allows us to gradually add elements to our map.

library(leaflet)

img <- "http://www.osa.fu-berlin.de/geographie_bsc/_medien/bilder_geocampus/bild_010/010_546.jpeg"

geocampus_latlon %>%
  leaflet() %>%
  addProviderTiles("Esri.WorldImagery") %>%
  addMarkers(~lon, ~lat, label = "GeoCampus Lankwitz", popup = paste0("<img src = ", img, ">"))

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follow: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.