The sp package provides classes and methods for spatial data analysis and utility functions for plotting maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc. Many other packages have become dependent on the data representation provided by the sp package.

library(sp)

The foundation class of the sp package is the Spatial class with two slots. The first is a bounding box, a matrix of numerical coordinates. The second is a CRS class object defining the coordinate reference system.

By calling getClass() we can inspect the complete definition of a class, including its slot names and the types of their contents:

getClass("Spatial")
## Class "Spatial" [package "sp"]
## 
## Slots:
##                               
## Name:         bbox proj4string
## Class:      matrix         CRS
## 
## Known Subclasses: 
## Class "SpatialPoints", directly
## Class "SpatialMultiPoints", directly
## Class "SpatialGrid", directly
## Class "SpatialLines", directly
## Class "SpatialPolygons", directly
## Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
## Class "SpatialPixels", by class "SpatialPoints", distance 2
## Class "SpatialMultiPointsDataFrame", by class "SpatialMultiPoints", distance 2
## Class "SpatialGridDataFrame", by class "SpatialGrid", distance 2
## Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
## Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
## Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2

SpatialPoints class

The SpatialPoints class is the first subclass of Spatial. It extends the Spatial class by adding a coords slot for a matrix of point coordinates.

getClass("SpatialPoints")
## Class "SpatialPoints" [package "sp"]
## 
## Slots:
##                                           
## Name:       coords        bbox proj4string
## Class:      matrix      matrix         CRS
## 
## Extends: "Spatial"
## 
## Known Subclasses: 
## Class "SpatialPointsDataFrame", directly
## Class "SpatialPixels", directly
## Class "SpatialPixelsDataFrame", by class "SpatialPixels", distance 2

Let us create a SpatialPoints class object by ourselves. For this we need point coordinates and have to decide on the coordinate reference system of our data. For the first task we make use of the geocode_OSM() function in the tmaptools() package. This function allows us to geocode a location (find its latitude and longitude) using Nominatim.

Let us retrieve the coordinates for the GeoCampus Lankwitz using the geocode_OSM() function and the Nominatim API. Note that we specifically save the output as a data frame, so that we can extract only the lat and lon coordinates easily.

library(tmaptools)

geocampus <- geocode_OSM(q = "Malteserstrasse 74-100, Berlin", as.data.frame = TRUE)
geocampus_latlon <- geocampus[, c("lon", "lat")]
geocampus_latlon
##        lon     lat
## 1 13.35803 52.4262

Nice! In the next step we have to decide on a coordinate reference system (CRS). In general the CRS may be defined in several ways, for example as Well-known text (WKT) format, or JSON format, or GML format, or in the PROJ format, among many others.

The PROJ, formerly Proj4, format is a generic, string-based description of a CRS. It defines projection types and parameter values for particular projections. For instance the PROJ format string for the European Terrestrial Reference System 1989 (ETRS89) is \(\text {+proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no\_defs}\).

With respect to the enormous amount of existing CRS the International Association of Oil & Gas Producers (IOGP), formerly known as European Petroleum Survey Group (EPSG), built a collection of definitions for global, regional, national and local coordinate reference systems and coordinate transformations, the EPSG Geodetic Parameter Dataset. Within this collection each particular coordinate reference systems gets an unique integer identifier, commonly denoted as EPSG. For instance, the EPSG identifier for the ETRS89 is simply \(\mathtt 4258\) (for other commonly used representations of ETRS89 go here or here).

In R we may either provide Spatial objects the PROJ format string (proj4string) or simply the EPSG identifier. In our example we use the EPSG identifier \(4326\), which corresponds to the latest revision of the World Geodetic System (WGS 84).

The CRS() function of the sp package transforms the character string into a coordinate reference system (CRS) object.

latlong <- CRS("+init=epsg:4326")
geocampus_sp <- SpatialPoints(geocampus_latlon,
  proj4string = latlong
)
geocampus_sp
## SpatialPoints:
##           lon     lat
## [1,] 13.35803 52.4262
## Coordinate Reference System (CRS) arguments: +proj=longlat +datum=WGS84
## +no_defs

Without a geographical context geographical coordinates are difficult to understand, thus we plot the location of the GeoCampus Lankwitz by making use of the mapview package.

library(mapview)

mapview(geocampus_sp,
  map.types = "Esri.WorldImagery",
  label = "GeoCampus Lankwitz"
)

Well, the geocoding did quite a reasonable job!

We can save the geocampus_sp object for later usage.

save(geocampus_sp, file = "berlin_geocampus.RData")

SpatialPointsDataFrame class

Another Spatial class, the SpatialPointsDataFrame, extends the SpatialPoints class. The SpatialPointsDataFrame class acts like a container that allows us to store additional information, similar to the concept of an attribute table in a GIS.

getClass("SpatialPointsDataFrame")
## Class "SpatialPointsDataFrame" [package "sp"]
## 
## Slots:
##                                                                   
## Name:         data  coords.nrs      coords        bbox proj4string
## Class:  data.frame     numeric      matrix      matrix         CRS
## 
## Extends: 
## Class "SpatialPoints", directly
## Class "Spatial", by class "SpatialPoints", distance 2
## Class "SpatialVector", by class "SpatialPoints", distance 2
## 
## Known Subclasses: 
## Class "SpatialPixelsDataFrame", directly, with explicit coerce

The data slot is where the additional information is kept in form of a data.frame object.

Let us build a SpatialPointsDataFrame object. This time, however, we do not build it from scratch, but we read in an ESRI shapefile with point information.

The shapefile osm_pois_p.shp stores points of interest in Berlin provided by OpenStreetMap (OSM). The data was downloaded from GEOFABRIK on June 25, 2017, and contains OpenStreetMap data as of June 22, 2017 (see here).

# build a temporary folder on disk
temp <- tempfile()
download_url <- "https://userpage.fu-berlin.de/soga/data/raw-data/spatial/"
zipfile <- "osm_pois_p.zip"

## download the file
download.file(paste0(download_url, zipfile), temp, mode = "wb")
## unzip the file(s)
unzip(temp)
## close file connection
unlink(temp)
library(rgdal)
## read in the data
pois <- readOGR(
  dsn = ".", "osm_pois_p",
  use_iconv = TRUE, # enable encoding
  encoding =