The sp
package provides classes and methods
for spatial data analysis and utility functions for plotting maps,
spatial selection, as well as methods for retrieving coordinates, for
subsetting, print, summary, etc. Many other packages have become
dependent on the data representation provided by the sp
package.
library(sp)
The foundation class of the sp
package is the
Spatial
class with two slots. The first is a bounding box, a
matrix of numerical coordinates. The second is a CRS
class
object defining the coordinate reference system.
By calling getClass()
we can inspect the complete
definition of a class, including its slot names and the types of their
contents:
getClass("Spatial")
## Class "Spatial" [package "sp"]
##
## Slots:
##
## Name: bbox proj4string
## Class: matrix CRS
##
## Known Subclasses:
## Class "SpatialPoints", directly
## Class "SpatialMultiPoints", directly
## Class "SpatialGrid", directly
## Class "SpatialLines", directly
## Class "SpatialPolygons", directly
## Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
## Class "SpatialPixels", by class "SpatialPoints", distance 2
## Class "SpatialMultiPointsDataFrame", by class "SpatialMultiPoints", distance 2
## Class "SpatialGridDataFrame", by class "SpatialGrid", distance 2
## Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
## Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
## Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2
SpatialPoints
classThe SpatialPoints
class is the first subclass of
Spatial
. It extends the Spatial
class by
adding a coords
slot for a matrix of point coordinates.
getClass("SpatialPoints")
## Class "SpatialPoints" [package "sp"]
##
## Slots:
##
## Name: coords bbox proj4string
## Class: matrix matrix CRS
##
## Extends: "Spatial"
##
## Known Subclasses:
## Class "SpatialPointsDataFrame", directly
## Class "SpatialPixels", directly
## Class "SpatialPixelsDataFrame", by class "SpatialPixels", distance 2
Let us create a SpatialPoints
class object by ourselves.
For this we need point coordinates and have to decide on the coordinate
reference system of our data. For the first task we make use of the
geocode_OSM()
function in the tmaptools()
package. This function
allows us to geocode a location (find its latitude and
longitude) using Nominatim.
Let us retrieve the coordinates for the GeoCampus Lankwitz using the
geocode_OSM()
function and the Nominatim API. Note that we
specifically save the output as a data frame, so that we can extract
only the lat and lon coordinates easily.
library(tmaptools)
geocampus <- geocode_OSM(q = "Malteserstrasse 74-100, Berlin", as.data.frame = TRUE)
geocampus_latlon <- geocampus[, c("lon", "lat")]
geocampus_latlon
## lon lat
## 1 13.35803 52.42625
Nice! In the next step we have to decide on a coordinate reference system (CRS). In general the CRS may be defined in several ways, for example as Well-known text (WKT) format, or JSON format, or GML format, or in the PROJ format, among many others.
The PROJ, formerly Proj4, format is a generic, string-based description of a CRS. It defines projection types and parameter values for particular projections. For instance the PROJ format string for the European Terrestrial Reference System 1989 (ETRS89) is \(\text {+proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no\_defs}\).
With respect to the enormous amount of existing CRS the International Association of Oil & Gas Producers (IOGP), formerly known as European Petroleum Survey Group (EPSG), built a collection of definitions for global, regional, national and local coordinate reference systems and coordinate transformations, the EPSG Geodetic Parameter Dataset. Within this collection each particular coordinate reference systems gets an unique integer identifier, commonly denoted as EPSG. For instance, the EPSG identifier for the ETRS89 is simply \(\mathtt 4258\) (for other commonly used representations of ETRS89 go here or here).
In R we may either provide Spatial
objects the PROJ
format string (proj4string
) or simply the EPSG identifier.
In our example we use the EPSG identifier \(4326\), which corresponds to the latest
revision of the World Geodetic System (WGS 84).
The CRS()
function of the sp
package
transforms the character string into a coordinate reference system
(CRS
) object.
latlong <- CRS("+init=epsg:4326")
geocampus_sp <- SpatialPoints(geocampus_latlon,
proj4string = latlong
)
geocampus_sp
## SpatialPoints:
## lon lat
## [1,] 13.35803 52.42625
## Coordinate Reference System (CRS) arguments: +proj=longlat +datum=WGS84
## +no_defs
Without a geographical context geographical coordinates are difficult
to understand, thus we plot the location of the GeoCampus Lankwitz by
making use of the mapview
package.
library(mapview)
mapview(geocampus_sp,
map.types = "Esri.WorldImagery",
label = "GeoCampus Lankwitz"
)
Well, the geocoding did quite a reasonable job!
We can save the geocampus_sp
object for later usage.
save(geocampus_sp, file = "berlin_geocampus.RData")
SpatialPointsDataFrame
classAnother Spatial
class, the
SpatialPointsDataFrame
, extends the
SpatialPoints
class. The
SpatialPointsDataFrame
class acts like a container that
allows us to store additional information, similar to the concept of an
attribute table in a GIS.
getClass("SpatialPointsDataFrame")
## Class "SpatialPointsDataFrame" [package "sp"]
##
## Slots:
##
## Name: data coords.nrs coords bbox proj4string
## Class: data.frame numeric matrix matrix CRS
##
## Extends:
## Class "SpatialPoints", directly
## Class "Spatial", by class "SpatialPoints", distance 2
## Class "SpatialVector", by class "SpatialPoints", distance 2
##
## Known Subclasses:
## Class "SpatialPixelsDataFrame", directly, with explicit coerce
The data
slot is where the additional information is
kept in form of a data.frame
object.
Let us build a SpatialPointsDataFrame
object. This time,
however, we do not build it from scratch, but we read in an ESRI
shapefile with point information.
The shapefile osm_pois_p.shp
stores points of
interest in Berlin provided by OpenStreetMap (OSM). The data was
downloaded from GEOFABRIK on June 25, 2017, and contains
OpenStreetMap data as of June 22, 2017 (see here).
# build a temporary folder on disk
temp <- tempfile()
download_url <- "https://userpage.fu-berlin.de/soga/data/raw-data/spatial/"
zipfile <- "osm_pois_p.zip"
## download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
## unzip the file(s)
unzip(temp)
## close file connection
unlink(temp)
library(terra)
## read in the data
pois <- vect("osm_pois_p.shp")
Several methods are available to access the values of the slots of a
SpatVector
class object.
The ext()
method returns the spatial extent of the
object.
ext(pois)
## SpatExtent : 13.0840658, 13.7579854, 52.3379356, 52.6695062 (xmin, xmax, ymin, ymax)
To retrieve the projection string of the vector object we use the
crs()
function with the option
proj = TRUE
.
crs(pois, proj = T)
## [1] "+proj=longlat +datum=WGS84 +no_defs"
The str()
functions shows the structure of the
SpatVector
class object.
str(pois)
## S4 class 'SpatVector' [package "terra"]
The summary()
method shows statistics of the vector
object’s attributes.
summary(pois)
## osm_id code fclass name
## Length:49091 Min. :2001 Length:49091 Length:49091
## Class :character 1st Qu.:2301 Class :character Class :character
## Mode :character Median :2512 Mode :character Mode :character
## Mean :2490
## 3rd Qu.:2901
## Max. :2964
We can access additional information in the data
by
using the head()
function.
head(pois, 15)
## osm_id code fclass name
## 1 16541597 2907 camera_surveillance Aral
## 2 21677342 2701 tourist_info <NA>
## 3 26735749 2301 restaurant Aida
## 4 26735753 2006 telephone <NA>
## 5 26735759 2301 restaurant Madame Ngo
## 6 26735763 2301 restaurant Sakana
## 7 26754448 2701 tourist_info <NA>
## 8 26865440 2307 biergarten Spinnerbrücke
## 9 26867409 2031 recycling_glass <NA>
## 10 26882423 2503 kiosk Aral Tankstelle
## 11 26972366 2724 memorial Konrad Zuse
## 12 27318009 2307 biergarten Loretta
## 13 27378775 2002 fire_station Freiwillige Feuerwehr Bohnsdorf
## 14 27431777 2724 memorial Luftbrückendenkmal
## 15 28968292 2601 bank Berliner Volksbank eG
The terra
package allows us to work on vector data as if
they were a data.frame
. Consequently, we can apply methods
on that data.frame
object. Let us count the number of
points of interest corresponding to surveillance cameras.
sum(pois$fclass == "camera_surveillance")
## [1] 1320
We can easily subset our SpatVector
object to include
only points which are referenced as surveillance cameras.
surveillance_spdf <- pois[pois$fclass == "camera_surveillance", ]
summary(surveillance_spdf)
## osm_id code fclass name
## Length:1320 Min. :2907 Length:1320 Length:1320
## Class :character 1st Qu.:2907 Class :character Class :character
## Mode :character Median :2907 Mode :character Mode :character
## Mean :2907
## 3rd Qu.:2907
## Max. :2907
The generic plot()
function gives us a visualization of
the data.
plot(surveillance_spdf, main = "Locations of camera surveillance in Berlin")
Let us contextualize the data by plotting it onto an interactive web-map.
mapview(surveillance_spdf)
In addition to the classes discussed so far, the sp
package provides many more classes:
SpatialPoints
,
SpatialMultiPoints
, SpatialPointsDataFrame
and
SpatialMultiPointsDataFrame
objects.SpatialLines
and
SpatialLinesDataFrame
objects.SpatialPolygons
and
SpatialPolygonsDataFrame
objects.SpatialPixels
, SpatialPixelsDataFrame
,
SpatialGrid
and SpatialGridDataFrame
objects.For more details on these classes and on the sp
package
in general, refer to the documentation on CRAN, the
vignettes (here, here and here) or the book of Bivand et
al. (2013).
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follow: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.