R includes several functions that can be used for reading, visualizing, and analyzing spatial data. However, only the contribution of the open-source software community makes R such an extremely powerful analysis and visualization tool for spatial data. CRAN Task View - Analysis of Spatial Data lists and describes many of those contributed packages. In the subsequent sections, we deal primarily with the following packages:
terra for input/output operations,
coordinate transformations, as well as advanced handling and
visualization of raster and vector datasf for handling spatial vector
datarastervis for advanced handling and
visualization of raster dataspatstat for point pattern
analysisgstat for geostatisticsmapview for interactive web
mappingPlease be aware that the methods we apply in the subsequent sections may also be implemented in other packages. Depending on your particular application it might be useful to look out for other packages.
terra packageThe terra package provides bindings to the
Geospatial Data Abstraction Library (GDAL) and access to projection and transformation
operations from the PROJ
library. This software is the basis for spatial software applications
and geodata analysis within R and is the successor of the
raster package. Its bindings allow fast I/O (input/output)
operations for GDAL-supported raster formats, OGR-supported vector formats and numerous utility
functions to convert geographic longitude and latitude coordinates into
Cartesian coordinates (and vice versa). Furthermore, the
terra package simplifies the access to large raster data
sets and extends the analytic tools available for both raster and vector
data. Additionally, the rasterVis package provides enhanced
visualization opportunities.
SpatVector
classVector data is handled within the terra package over the
SpatVector class representation. Since we need to deal with
vector-based spatial data for point pattern analysis, we want to
introduce some basic programming concepts using the terra
package.
Firstly, let us create a SpatVector class object by
ourselves. Therefore, we need point coordinates and to decide on the
coordinate reference system for our data. For the first task we use the
geocode_OSM() [function]((https://rdrr.io/cran/tmaptools/man/geocode_OSM.html) in
the tmaptools() package. This function allows us to
geocode a location (i.e., to find its latitude and longitude)
using the OpenStreetMap-API.
Exercise: Retrieve the coordinates for the GeoCampus Lankwitz using the
geocode_OSM(...,projection = "+init=epsg:4326",...)function for the OSM API. Hint: You will get a list with three entries. Extract the xy-coordinates and transform them into a data frame with “long” and “lat” column headers.
# your code here
geocampus <- NULL
library(tmaptools)
geocampus <- geocode_OSM("Malteserstrasse 74-100, Berlin,Germany",
projection = "+init=epsg:4326",
return.first.only = TRUE,
details = FALSE,
as.data.frame = NA,
as.sf = FALSE,
server = "http://nominatim.openstreetmap.org"
)
geocampus <- as.data.frame(t(as.matrix(geocampus$coords)))
colnames(geocampus) <- c("long", "lat")
geocampus
## long lat
## 1 13.35803 52.42625
Nice!
In order to convert the string-based coordinates as a
data.frame representation to a SpatVector
object, we have to add a coordinate reference system (CRS). In general,
a CRS can be defined in several ways, for example as Well-known text (WKT) format, JSON
format, GML format or in the Proj4 format, among many others.
In R, we use the Proj4 format for spatial objects. This
format is a generic, string-based description of a CRS, understood by
the proj.4 library. It defines projection types and defines
parameter values for particular projections. For instance, the
Proj4 format string for the European Terrestrial Reference System 1989 (ETRS89)
is \[\mathtt {+proj=longlat +ellps=GRS80
+towgs84=0,0,0,0,0,0,0 +no\_defs }.\]
With respect to the enormous number of existing CRSs, it is impossible to remember all of those specifications. For historical reasons the International Association of Oil & Gas Producers (IOGP), formerly known as European Petroleum Survey Group (EPSG), built a collection of definitions for global, regional, national and local coordinate reference systems and coordinate transformations, the EPSG Geodetic Parameter Dataset. Each coordinate reference system gets a unique integer identifier within this collection, commonly denoted as EPSG. For instance, the EPSG identifier for the ETRS89 is simply \(\mathtt 4258\) (for other commonly used representations of ETRS89 go here or here).
To conclude, we feed the Spatial objects with the
Proj4 format string (proj4string) or simply
provide the EPSG identifier. For our example, we use the EPSG identifier
\(4326\), which
corresponds to the latest revision of the World Geodetic System (WGS84).
The function crs() of the terra package
transforms the character string into a coordinate reference
system (crs) object.
library(terra)
latlong <- crs("+init=epsg:4326")
geocampus_vect <- vect(geocampus,
geom = c("long", "lat"),
crs = latlong)
geocampus_vect
## class : SpatVector
## geometry : points
## dimensions : 1, 0 (geometries, attributes)
## extent : 13.35803, 13.35803, 52.42625, 52.42625 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84
Geographic coordinates are difficult to understand without a
geographical context. Thus, we plot the location of the GeoCampus
Lankwitz using the mapview package.
library(mapview)
library(sf)
img <- "http://www.osa.fu-berlin.de/geographie_bsc/_medien/bilder_geocampus/bild_010/010_546.jpeg"
mapview(sf::st_as_sf(geocampus_vect),
map.types = "Esri.WorldImagery",
popup = leafpop::popupImage(img, src = "remote")
)
Well, the geocoding did a quite reasonable job!
SpatVector
objectsFurthermore, the SpatVector class allows us to store
additional information, similar to the concept of an attribute
table in a GIS.
The additional information is kept in the spatial data representation
as data.frame. To showcase the data handling, we read in an
ESRI shapefile with point information.
The shapefile gis_osm_pois_free_1.shp stores points
of interests in Berlin provided by OpenStreetMap (OSM). The data was
downloaded from GEOFABRIK on August 27, 2022, and contains
OpenStreetMap data as of August 26, 2022 (see here).
# build a temporary folder on disk
temp <- tempfile()
download_url <- "https://userpage.fu-berlin.de/soga/data/raw-data/spatial/"
zipfile <- "berlin-latest-free.shp.zip"
## download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
## unzip the file(s)
unzip(temp)
## close file connection
unlink(temp)
library(terra)
pois <- vect("gis_osm_pois_free_1.shp")
Several methods are available to access the associated data within
the spatial SpatVector object. Firstly, over the
ext() function the object’s spatial extent is
retrieved.
ext(pois)
## SpatExtent : 13.0840658, 13.7602896, 52.3379366, 52.6726975 (xmin, xmax, ymin, ymax)
The str() function shows the structure of the
Spatial class object.
str(pois)
## S4 class 'SpatVector' [package "terra"]
The summary() method shows the bounding box, whether the
object is projected, and the number of rows of coordinates.
summary(pois)
## osm_id code fclass name
## Length:83123 Min. :2001 Length:83123 Length:83123
## Class :character 1st Qu.:2303 Class :character Class :character
## Mode :character Median :2701 Mode :character Mode :character
## Mean :2604
## 3rd Qu.:2902
## Max. :2964
We can assess the object’s attribute data via data.frame
syntax, for example over the $ sign.
Exercise: List the first 15 entries of the attribute table stored in the
dataslot using theheadfunction.
# your code here
head(pois, 15)
## osm_id code fclass name
## 1 16541597 2907 camera_surveillance Aral
## 2 26735749 2301 restaurant Aida
## 3 26735753 2006 telephone <NA>
## 4 26735759 2301 restaurant Madame Ngo
## 5 26735763 2301 restaurant Thanh Long
## 6 26754448 2701 tourist_info <NA>
## 7 26865440 2307 biergarten Spinnerbrücke
## 8 26867409 2031 recycling_glass <NA>
## 9 26972366 2724 memorial Konrad Zuse
## 10 27001739 2907 camera_surveillance <NA>
## 11 27318009 2307 biergarten Loretta
## 12 27378775 2002 fire_station Freiwillige Feuerwehr Bohnsdorf
## 13 27431777 2724 memorial Luftbrückendenkmal
## 14 27431777 2725 artwork Luftbrückendenkmal
## 15 28968292 2601 bank Berliner Volksbank
The str() function reveals that we are actually dealing
with a data.frame object.
str(pois)
## S4 class 'SpatVector' [package "terra"]
Consequently we can apply methods on that data.frame
object.
Exercise: Count the number of points of interest corresponding to surveillance cameras.
# your code here
sum(pois$fclass == "camera_surveillance")
## [1] 2755
Exercise: Subset the
SpatVectorobject to only include points which are surveillance cameras.
# your code here
surveillance_spdf <- NULL
summary(surveillance_spdf)
surveillance_spdf <- pois[pois$fclass == "camera_surveillance", ]
summary(surveillance_spdf)
## osm_id code fclass name
## Length:2755 Min. :2907 Length:2755 Length:2755
## Class :character 1st Qu.:2907 Class :character Class :character
## Mode :character Median :2907 Mode :character Mode :character
## Mean :2907
## 3rd Qu.:2907
## Max. :2907
The generic plot() function, internally linked to the
terra package association when a SpatVector
object is given as argument, provides a visualization of the data.
plot(surveillance_spdf)
Let us contextualize the data by plotting it onto an interactive web
map. For this purpose, a typecast to a sf object is
required.
mapview(sf::st_as_sf(surveillance_spdf))
terra and
rasterVISTo showcase some of the features provided by the terra
and rasterVis packages in terms of handling, manipulation
and visualization of raster data, we download the SRTM digital elevation model (DEM) for Germany. The data
was retrieved from OPENDEM on November 07, 2022. Please note the zipped
file is about 90 MB in size.
## build a temporary folder on disk
temp <- tempfile()
download_url <- "https://userpage.fu-berlin.de/soga/data/raw-data/spatial/"
zipfile <- "srtm_germany_dtm.zip"
## download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
## unzip the file
unzip(temp)
unlink(temp)
To load the raster data set we apply the rast() function
provided by the terra package.
# read the data
srtm <- rast("srtm_germany_dtm.tif")
Printing the SpatRaster object summarizes the raster
data set.
srtm
## class : SpatRaster
## dimensions : 10801, 13201, 1 (nrow, ncol, nlyr)
## resolution : 0.0008333333, 0.0008333333 (x, y)
## extent : 4.999583, 16.00042, 46.99958, 56.00042 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326)
## source : srtm_germany_dtm.tif
## name : srtm_germany_dtm
When working with large data sets, checking the file size in advance is recommended. The processing time of large data may be quite long.
fs <- file.size("srtm_germany_dtm.tif")
print(paste("The file size is", as.numeric(fs) / 1e6, "MB"))
## [1] "The file size is 285.254776 MB"
Owing to the large file size, we downscaled the raster data set using
the aggregate() function. The aggregate()
function performs aggregation by grouping adjacent pixels into new ones,
given an input raster and the aggregation factor fact. For
example, using fact = 5 means that each set of 5 \(\times\) 5 cells in the original raster
becomes a single cell in the new raster. In addition, the
fun argument specifies which function will be used to
calculate the new values based on existing ones (the default is
mean).
Exercise: Apply the
aggregate()function on our SRTM raster, aggregating 5 \(\times\) 5 sets of cells and plot it.
# your code here
srtm_small <- NULL
plot(srtm_small)
srtm_small <- aggregate(srtm, 5, fun = mean)
##
|---------|---------|---------|---------|
=========================================
plot(srtm_small, col = viridisLite::plasma(3500))
We want to clean up the data set and exclude the area that covers the ocean.
Exercise: Set all values in the SRTM data set that are equal to 0 to
NA.
# your code here
srtm_small[srtm_small == 0] <- NA
Now we plot the aggregated and cleaned up data set.
plot(srtm_small,
main = "DEM-SRTM Germany",
col = viridisLite::plasma(3500)
)
Well, still there could be done some further adaptations, but we keep it for now.
Now, we want to subset the raster data set to reflect our area of
interest (AOI). Therefore, we apply the crop() function and
feed it with a SpatVect object, in our case the district
border of the city of Berlin (berlin_vect).
## build a temporary folder on disk
temp <- tempfile()
download_url <- "https://userpage.fu-berlin.de/soga/data/raw-data/spatial/"
zipfile <- "Berlin_vect.zip"
## download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
## unzip the file
unzip(temp)
unlink(temp)
berlin_vect <- vect("Berlin_vect.shp")
srtm_berlin <- crop(srtm, berlin_vect)
We may plot the resulting subset by the generic plot()
function. In order to add additional layers we simply call the
plot() function once again, this time however, we add the
argument add = TRUE to the function call. To showcase this
functionality we plot our AOI (the area of Berlin) and add the a
spatialPoints object, representing the location of the
GeoCampus Lankwitz (geocampus_vect).
plot(srtm_berlin,
main = "DEM-SRTM Berlin and GeoCampus Lankwitz",
col = viridisLite::viridis(100)
)
plot(geocampus_vect, col = "#FF3300", add = TRUE, pch = 16)
Another useful function is the mask() function, which
creates a new Raster object with the same values as the
input Raster object, except for the masked cells, which are
set to NA by default. Therefore, we need to transform the
sp object to the equivalent terra package
representation by using the vect() function.
Exercise: Use the
mask()function to mask out the area outside the district of Berlin. Use the objectssrtm_berlinandberlin_sp_wgs84. Plot the result.
# your code here
srtm_berlin_masked <- NULL
plot(srtm_berlin_masked,
main = "DEM-SRTM Berlin and GeoCampus Lankwitz",
cex.main = 0.7,
col = viridisLite::viridis(100)
)
plot(geocampus_vect, col = "#FF3300", add = TRUE, pch = 16)
srtm_berlin_masked <- mask(srtm_berlin, berlin_vect)
plot(srtm_berlin_masked,
main = "DEM-SRTM Berlin and GeoCampus Lankwitz",
cex.main = 0.7,
col = viridisLite::viridis(10)
)
plot(geocampus_vect, col = "#FF3300", add = TRUE, pch = 16)
The rasterVis package defines additional methods for
spatial raster data access, manipulation and visualization (for more
information on the rasterVis package, go here).
Here, we apply the levelplot() function which is
optimized for displaying raster data and provides some very nice
additional graphic features. Type ??rasterVis::levelplot in
your console for further information.
Exercise: Tweak the arguments of the
levelplot()function to change the visualization.
library(rasterVis)
# your code here
library(rasterVis)
m <- 20
srtm_berlin_reproj <- terra::project(srtm_berlin_masked, "epsg:25832")
levelplot(srtm_berlin_reproj,
margin = FALSE,
contour = TRUE,
at = seq(0, minmax(srtm_berlin)[2], m),
main = paste("DEM-SRTM, contour lines", m, "m"),
par.settings = rasterTheme()
)
Another useful function in the terra package is the
terrain() function. This function computes the slope,
aspect and other terrain characteristics from a raster with elevation
data.
slope <- terrain(srtm_berlin,
v = "slope", neighbors=8,
unit = "degrees")
levelplot(slope,
margin = FALSE,
par.settings = BuRdTheme(),
main = "Slope"
)
The levelpot() function returns a trellis
object, which relates to the lattice graphics package. In order to
overlay different trellis objects we make use of the
layer() function of the latticeExtra package.
For example, let us add the administrative border of Berlin. We overlay
trellis objects with the + sign.
levelplot(slope,
margin = FALSE,
par.settings = BuRdTheme(),
main = "Slope",
panel=function(...){
panel.levelplot(...)
panel.polygon(geom(terra::project(berlin_vect,
crs(slope)))[, c(3,4)])
})
sf packageThe sf package encodes spatial vector data
according to the Simple Features standard described by the Open Geospatial Consortium (OGC) and International Organization for Standardization
(ISO). The sf package provides bindings to GDAL for reading and
writing data, to GEOS for geometrical operations, and to proj.4 for projection
conversions and datum transformations.
The sf package represents simple features as native R
objects, using simple data structures (S3 classes,
lists, matrices, vectors). All functions and methods in sf
that operate on spatial data are prefixed by st_, which
refers to spatial and temporal. There are three
classes used to represent simple features:
sf, a table (data.frame) of feature
attributes and feature geometries which containssfc, the list-column with the geometries for each
feature (record) which is composed ofsfg, the feature geometry of an individual simple
feature.sf is the R package for spatial analysis that gradually
replaced the well known sp package, which retired together
with the rgdal package in October 2023. For more details on
the sf package refer to the documentation on CRAN, and visit the vignettes (here, here, and here).
sf: objects with simple featuresAn sf object is a data set consisting of sets of
features with attributes.
To showcase the functionality of the sf package we
download a polygon data set (ESRI shapefile) and make it available
within the R workspace over st_read() function.
The shapefile stores the geometry of districts in Germany and was downloaded from the Federal Agency for Cartography and Geodesy on October 30, 2022. The data is found here, and the file description is found here (GeoBasis-DE/BKG 2022).
We also want to add statistical information about the area, the population size and the population density of each district. This data was obtained from the The Federal Statistical Office on October 30, 2022 (here).
The procedure of how to download the data and merge it into a single shapefile is described in the code below. The completed file can be downloaded here. If you want to use the completed file directly, download and read it as described for the shapefile in the following code.
# download of the shapefile
temp <- tempfile()
download_url <- "https://daten.gdz.bkg.bund.de/produkte/vg/vg250_ebenen_0101/aktuell/"
zipfile <- "vg250_01-01.utm32s.shape.ebenen.zip"
# download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
# unzip the file(s)
unzip(temp)
# close file connection
unlink(temp)
# download of the population data
download.file(url = "https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/Administrativ/Archiv/Standardtabellen/04_KreiseVorjahr.xlsx?__blob=publicationFile", destfile = "./germany_dist_population.xlsx", mode = "wb")
# read the shapefile for German districts
library(sf)
district_geometry <- st_read("./vg250_01-01.utm32s.shape.ebenen/vg250_ebenen_0101/VG250_KRS.shp", quiet = TRUE)
colnames(district_geometry)[9] <- "district"
# read population and area statistics for German districts
library(xlsx)
district_stats <- read.xlsx2("germany_dist_population.xlsx", 2, header = TRUE)
colnames(district_stats)[c(4, 5, 6, 7, 8, 9)] <- c("NUTS", "area", "pop", "pop_m", "pop_f", "pop_dens")
district_stats[, c(5, 6, 7, 8, 9)] <- as.data.frame(apply(district_stats[, c(5, 6, 7, 8, 9)], 2, as.numeric))
# merge both data frames
total <- merge(district_geometry, district_stats, by = "NUTS")
# select relevant data
germany_districts <- total[, c("district", "area", "pop", "pop_m", "pop_f", "pop_dens", "geometry")]
Now, let us take a closer look at our sf object which
contains the data about the German districts. The class()
function shows that the sf object class extends the
data.frame object class.
class(germany_districts)
## [1] "sf" "data.frame"
colnames(germany_districts)
## [1] "district" "area" "pop" "pop_m" "pop_f" "pop_dens" "geometry"
The column with name geometry holds the geometries, and
the other columns refer to the additional statistical information
available for each record.
The str() function provides more information about the
number of records (430), data types and the geometry attributes.
str(germany_districts)
## Classes 'sf' and 'data.frame': 430 obs. of 7 variables:
## $ district: chr "Stuttgart" "Böblingen" "Esslingen" "Göppingen" ...
## $ area : num 207 618 641 642 687 ...
## $ pop : num 626275 393195 533388 259046 544679 ...
## $ pop_m : num 312886 195513 266348 128916 270057 ...
## $ pop_f : num 313389 197682 267040 130130 274622 ...
## $ pop_dens: num 3021 636 832 403 793 ...
## $ geometry:sfc_MULTIPOLYGON of length 430; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:532, 1:2] 516514 516502 516484 516871 516824 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA
## ..- attr(*, "names")= chr [1:6] "district" "area" "pop" "pop_m" ...
Exercise: Access the first 20 rows of the column in the
data.frameobject that contains the names of the districts of Germany.
# your code here
head(germany_districts$district, 20)
## [1] "Stuttgart" "Böblingen" "Esslingen"
## [4] "Göppingen" "Ludwigsburg" "Rems-Murr-Kreis"
## [7] "Heilbronn" "Heilbronn" "Hohenlohekreis"
## [10] "Schwäbisch Hall" "Main-Tauber-Kreis" "Heidenheim"
## [13] "Ostalbkreis" "Baden-Baden" "Karlsruhe"
## [16] "Karlsruhe" "Rastatt" "Heidelberg"
## [19] "Mannheim" "Neckar-Odenwald-Kreis"
The sf object class offers a vast range of functions and
spatial data operations to be performed.
methods(class = "sf")
## [1] $<- [
## [3] [[<- aggregate
## [5] as.data.frame cbind
## [7] coerce crs
## [9] dbDataType dbWriteTable
## [11] distance duplicated
## [13] ext extract
## [15] filter identify
## [17] initialize lines
## [19] mapView mask
## [21] merge plot
## [23] points polys
## [25] print rasterize
## [27] rbind show
## [29] slotsFromS3 st_agr
## [31] st_agr<- st_area
## [33] st_as_s2 st_as_sf
## [35] st_as_sfc st_bbox
## [37] st_boundary st_break_antimeridian
## [39] st_buffer st_cast
## [41] st_centroid st_collection_extract
## [43] st_concave_hull st_convex_hull
## [45] st_coordinates st_crop
## [47] st_crs st_crs<-
## [49] st_difference st_drop_geometry
## [51] st_filter st_geometry
## [53] st_geometry<- st_inscribed_circle
## [55] st_interpolate_aw st_intersection
## [57] st_intersects st_is
## [59] st_is_valid st_join
## [61] st_line_merge st_m_range
## [63] st_make_valid st_minimum_rotated_rectangle
## [65] st_nearest_points st_node
## [67] st_normalize st_point_on_surface
## [69] st_polygonize st_precision
## [71] st_reverse st_sample
## [73] st_segmentize st_set_precision
## [75] st_shift_longitude st_simplify
## [77] st_snap st_sym_difference
## [79] st_transform st_triangulate
## [81] st_triangulate_constrained st_union
## [83] st_voronoi st_wrap_dateline
## [85] st_write st_z_range
## [87] st_zm svc
## [89] transform vect
## see '?methods' for accessing help and source code
Let us try the plot() function on the sf
object germany_districts:
plot(germany_districts)
By default the plot() function plots up to 9 features.
We can change that behavior by setting the max.plot
argument.
plot(germany_districts, max.plot = 2)
Or we explicitly state which column we want to plot:
plot(germany_districts["district"])
Owing to the simplicity of the sf object structure we
can easily subset and manipulate our data.
Exercise: Subset the data set to only include the geometry of the city of Berlin and plot it thereafter.
# your code here
berlin_sf <- germany_districts["district"][germany_districts$district == "Berlin", ]
berlin_sf
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 777974.1 ymin: 5808837 xmax: 823510.5 ymax: 5845580
## Projected CRS: ETRS89 / UTM zone 32N
## district geometry
## 142 Berlin MULTIPOLYGON (((802831.7 58...
plot(berlin_sf, main = "Berlin", col = NA)
dplyr package for data
manipulationExercise: Plot the districts of Germany such that only those districts are filled with colour where the difference between male (
pop_m) and female (pop_f) population is larger than 3 % of the overall population.
# replace the NULL values within the following code accordingly
p <- germany_districts %>%
# calculate the absolute difference between male and female population
dplyr::mutate(NULL) %>%
# calculate the percentage with respect to the total population
dplyr::mutate(relative_difference = NULL) %>%
# select desired column
dplyr::select(NULL) %>%
# filter districts according to threshold of 3%
dplyr::filter(NULL)
# plot
library(viridis)
p %>%
ggplot2::ggplot() +
ggplot2::geom_sf(data = germany_districts, color = "grey") +
ggplot2::geom_sf(ggplot2::aes(fill = relative_difference)) +
scale_fill_viridis("relative_difference", direction = -1, option = "plasma") +
ggplot2::theme_bw()
p <- germany_districts %>%
# calculate the absolute difference between male and female population
dplyr::mutate(abs_diff = abs(pop_m - pop_f)) %>%
# calculate the percentage with respect to the total population
dplyr::mutate(relative_difference = abs_diff / pop * 100) %>%
# select desired column
dplyr::select(relative_difference) %>%
# filter districts according to threshold of 3%
dplyr::filter(relative_difference > 3)
# plot
library(viridis)
p %>%
ggplot2::ggplot() +
ggplot2::geom_sf(data = germany_districts, color = "grey") +
ggplot2::geom_sf(ggplot2::aes(fill = relative_difference)) +
scale_fill_viridis("Relative Difference in %", direction = -1, option = "plasma") +
ggplot2::theme_bw()
sfc: simple feature geometry
list-columnThe geometry column in an sf object stores the
geometries. It may be accessed by $geometry, but the more
general way is to use the st_geometry() function.
germany_districts_geom <- st_geometry(germany_districts)
class(germany_districts_geom)
## [1] "sfc_MULTIPOLYGON" "sfc"
germany_districts_geom
## Geometry set for 430 features
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 280371.1 ymin: 5235856 xmax: 921292.4 ymax: 6101487
## Projected CRS: ETRS89 / UTM zone 32N
## First 5 geometries:
## MULTIPOLYGON (((516514.1 5412585, 516501.7 5412...
## MULTIPOLYGON (((495454.7 5412605, 495762.8 5412...
## MULTIPOLYGON (((529621.7 5403177, 529823.7 5402...
## MULTIPOLYGON (((543224.2 5403286, 543489.1 5403...
## MULTIPOLYGON (((509140.4 5434509, 509292.2 5434...
The sfc object class offers a vast range of functions
and operations for spatial data.
methods(class = "sfc")
## [1] [ [<-
## [3] as.data.frame c
## [5] coerce format
## [7] identify initialize
## [9] mapView Ops
## [11] print rep
## [13] show slotsFromS3
## [15] st_area st_as_binary
## [17] st_as_grob st_as_s2
## [19] st_as_sf st_as_text
## [21] st_bbox st_boundary
## [23] st_break_antimeridian st_buffer
## [25] st_cast st_centroid
## [27] st_collection_extract st_concave_hull
## [29] st_convex_hull st_coordinates
## [31] st_crop st_crs
## [33] st_crs<- st_difference
## [35] st_geometry st_inscribed_circle
## [37] st_intersection st_intersects
## [39] st_is st_is_valid
## [41] st_line_merge st_m_range
## [43] st_make_valid st_minimum_rotated_rectangle
## [45] st_nearest_points st_node
## [47] st_normalize st_point_on_surface
## [49] st_polygonize st_precision
## [51] st_reverse st_sample
## [53] st_segmentize st_set_precision
## [55] st_shift_longitude st_simplify
## [57] st_snap st_sym_difference
## [59] st_transform st_triangulate
## [61] st_triangulate_constrained st_union
## [63] st_voronoi st_wrap_dateline
## [65] st_write st_z_range
## [67] st_zm str
## [69] summary vec_cast.sfc
## [71] vec_ptype2.sfc vect
## see '?methods' for accessing help and source code
For instance, the function st_bbox() retrieves the
coordinate bounding box.
st_bbox(germany_districts)
## xmin ymin xmax ymax
## 280371.1 5235856.0 921292.4 6101486.9
sfg: simple feature geometrySimple feature geometry (sfg) objects carry the geometry
for a single feature, e.g. a point, linestring or polygon.
Simple feature geometries are implemented as R native data, using the following rules:
The coordinate reference system (CRS) specifies the location on Earth
of a particular pair of coordinates. The sfc object
(geometry list-columns) has two attributes for storing a CRS:
epsg and proj4string.
st_crs(germany_districts)
## Coordinate Reference System:
## User input: ETRS89 / UTM zone 32N
## wkt:
## PROJCRS["ETRS89 / UTM zone 32N",
## BASEGEOGCRS["ETRS89",
## ENSEMBLE["European Terrestrial Reference System 1989 ensemble",
## MEMBER["European Terrestrial Reference Frame 1989"],
## MEMBER["European Terrestrial Reference Frame 1990"],
## MEMBER["European Terrestrial Reference Frame 1991"],
## MEMBER["European Terrestrial Reference Frame 1992"],
## MEMBER["European Terrestrial Reference Frame 1993"],
## MEMBER["European Terrestrial Reference Frame 1994"],
## MEMBER["European Terrestrial Reference Frame 1996"],
## MEMBER["European Terrestrial Reference Frame 1997"],
## MEMBER["European Terrestrial Reference Frame 2000"],
## MEMBER["European Terrestrial Reference Frame 2005"],
## MEMBER["European Terrestrial Reference Frame 2014"],
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]],
## ENSEMBLEACCURACY[0.1]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4258]],
## CONVERSION["UTM zone 32N",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",9,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Engineering survey, topographic mapping."],
## AREA["Europe between 6°E and 12°E: Austria; Belgium; Denmark - onshore and offshore; Germany - onshore and offshore; Norway including - onshore and offshore; Spain - offshore."],
## BBOX[38.76,6,84.33,12]],
## ID["EPSG",25832]]
CRS transformation operations for an sf object are
carried out using the st_transform() function.
We want to project our sf object of Berlin
(berlin_sf) onto the European Terrestrial Reference System 1989 (ETRS89/UTM
zone 32N). Therefore, we also need provide the EPSG identifier for
ERTS89/UTM zone 32N, which is \(25832\)
(see here)
berlin_sf <- st_transform(berlin_sf, 25832)
st_crs(berlin_sf)
## Coordinate Reference System:
## User input: EPSG:25832
## wkt:
## PROJCRS["ETRS89 / UTM zone 32N",
## BASEGEOGCRS["ETRS89",
## ENSEMBLE["European Terrestrial Reference System 1989 ensemble",
## MEMBER["European Terrestrial Reference Frame 1989"],
## MEMBER["European Terrestrial Reference Frame 1990"],
## MEMBER["European Terrestrial Reference Frame 1991"],
## MEMBER["European Terrestrial Reference Frame 1992"],
## MEMBER["European Terrestrial Reference Frame 1993"],
## MEMBER["European Terrestrial Reference Frame 1994"],
## MEMBER["European Terrestrial Reference Frame 1996"],
## MEMBER["European Terrestrial Reference Frame 1997"],
## MEMBER["European Terrestrial Reference Frame 2000"],
## MEMBER["European Terrestrial Reference Frame 2005"],
## MEMBER["European Terrestrial Reference Frame 2014"],
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]],
## ENSEMBLEACCURACY[0.1]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4258]],
## CONVERSION["UTM zone 32N",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",9,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Engineering survey, topographic mapping."],
## AREA["Europe between 6°E and 12°E: Austria; Belgium; Denmark - onshore and offshore; Germany - onshore and offshore; Norway including - onshore and offshore; Spain - offshore."],
## BBOX[38.76,6,84.33,12]],
## ID["EPSG",25832]]
For later usage we store the geometry of the district Berlin on disk.
save(file = "berlin_district.RData", berlin_sf)
Now it’s your turn!
Exercise: Transform the data set
germany_districtsinto WGS 84/Pseudo-Mercator, the so called Google Web Mercator. The EPSG identifier is \(3857\).
# your code here
germany_districts <- st_transform(germany_districts, 3857)
st_crs(germany_districts)
## Coordinate Reference System:
## User input: EPSG:3857
## wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
## BASEGEOGCRS["WGS 84",
## ENSEMBLE["World Geodetic System 1984 ensemble",
## MEMBER["World Geodetic System 1984 (Transit)"],
## MEMBER["World Geodetic System 1984 (G730)"],
## MEMBER["World Geodetic System 1984 (G873)"],
## MEMBER["World Geodetic System 1984 (G1150)"],
## MEMBER["World Geodetic System 1984 (G1674)"],
## MEMBER["World Geodetic System 1984 (G1762)"],
## MEMBER["World Geodetic System 1984 (G2139)"],
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]],
## ENSEMBLEACCURACY[2.0]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]],
## CONVERSION["Popular Visualisation Pseudo-Mercator",
## METHOD["Popular Visualisation Pseudo Mercator",
## ID["EPSG",1024]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["False easting",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Web mapping and visualisation."],
## AREA["World between 85.06°S and 85.06°N."],
## BBOX[-85.06,-180,85.06,180]],
## ID["EPSG",3857]]
We can use the coord_sf() function and plot
sf objects with different projection.
For example, let us plot our data set in the WGS84/World Equidistant Cylindrical projection (EPSG 4087).
library(ggplot2)
library(viridis)
ggplot(germany_districts) +
geom_sf(aes(fill = pop)) +
scale_fill_viridis("pop") +
coord_sf(crs = st_crs(4087)) +
ggtitle("World Equidistant Cylindrical (EPSG 4087)") +
theme_bw()
Frequently used in Germany is the Gauß-Krüger projection(also known as Transverse Mercator projection).
Exercise: Plot the dataset
germany_districtsin DHDN / 3-degree Gauss-Kruger zone 3 transformation (EPSG 31467).
# your code here
ggplot(germany_districts) +
geom_sf(aes(fill = pop)) +
scale_fill_viridis("pop", direction = -1, option = "mako") +
coord_sf(crs = st_crs(31467)) +
ggtitle("DHDN / 3-degree Gauss-Kruger zone 3 (EPSG 31467)") +
theme_bw()
A nice look up page for different coordinate systems is found here and a fancy visualization of many prominent map projections is found here.
spatstat packageThe R package spatstat supports statistical analysis for
spatial point patterns. Its main focus is the analysis of spatial
patterns of points in two-dimensional space. The points may
carry auxiliary data (marks), and the spatial region
in which the points were recorded may have arbitrary shape.
The package supports
An introduction to spatstat is provided by the package
vignette Getting started with spatstat.
For further information visit the project on CRAN, the project homepage or dig
through the over 1500-paged reference manual.
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follow: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.