The SRTM digital elevation model (DEM) for Germany was retrieved from OPENDEM on June 25, 2017. Please note that the zipped file is about 86.4 MB in size.
# First, let's import the needed libraries.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
import requests, zipfile, io, os
url = "http://www.userpage.fu-berlin.de/~soga/300/30100_data_sets/spatial/srtm_germany_dtm.zip"
r = requests.get(url)
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall("../data")
To load the raster data set we apply the rasterio
package providing open()
functionality.
The open()
function takes a path string or path-like object pointing to a file of a supported raster format and returns an opened dataset object. Rasterio will open it using the proper GDAL format driver. Dataset objects have some of the same attributes as Python file objects.
import rasterio
## open file
srtm_germany = rasterio.open("../data/srtm_germany_dtm.tif", driver="GTiff")
Properties of the raster data stored in the example GeoTIFF can be accessed through attributes of the opened dataset object. Dataset objects have bands, which can be counted using .count
. Our SRTM data has a band count of 1.
srtm_germany.count
1
A dataset band is an array of values containing the distribution of a variable (here elevation) in rows and columns. Check the dimensions of the array of DN values (.width
and .height
).
print(
f"Number of columns = {srtm_germany.width}, Number of rows = {srtm_germany.height}"
)
Number of columns = 13201, Number of rows = 10801
The georeferencing of the dataset is stored in the meta data. Every pixels of a dataset is contained within a spatial bounding box (.bounds
).
srtm_germany.bounds
BoundingBox(left=4.999583333333334, bottom=46.99958333333337, right=16.000416666666624, top=56.000416666666666)
The value of DatasetReader.bounds
attribute is derived from a more fundamental attribute: the dataset’s geospatial transform, which describes the origin coordinates of the raster.
srtm_germany.transform
Affine(0.0008333333333333301, 0.0, 4.999583333333334, 0.0, -0.0008333333333333301, 56.000416666666666)
These coordinate values are relative to the origin of the dataset’s coordinate reference system (CRS). EPSG:4326 identifies a particular coordinate reference system.
srtm_germany.crs
CRS.from_epsg(4326)
You may access all meta data at once using .meta
# metadata functions from rasterio
print(srtm_germany.meta)
{'driver': 'GTiff', 'dtype': 'int16', 'nodata': None, 'width': 13201, 'height': 10801, 'count': 1, 'crs': CRS.from_epsg(4326), 'transform': Affine(0.0008333333333333301, 0.0, 4.999583333333334, 0.0, -0.0008333333333333301, 56.000416666666666)}
When working with large data sets it is recommendable to check the size of the file in advance. The processing time increases with the size of the data set.
print(f"The file size is {os.path.getsize('../data/srtm_germany_dtm.tif')/1e6} MB")
The file size is 285.254776 MB
# plot with rasterio.plot, which provides matplotlib functionality
from rasterio.plot import show
show(srtm_germany, title="Digital Elevation Model - SRTM", cmap="RdYlGn_r") # "terrain"
<AxesSubplot: title={'center': 'Digital Elevation Model - SRTM'}>
An useful function is the mask()
function from from rasterio.mask
, which creates a new raster
object that has the same values as the input raster
object, except for the masked cells, which are set to 0
by default. We apply the mask()
function to mask out the area outside the borders of Germany.
For the shapefiles of Germany we rely on shapefiles provided by the German Federal Agency for Cartography and Geodesy (under this licence). You may directly download the shapefiles here. For the purpose of this tutorial, we also provided the data here.
# Retrieve Federal States
import zipfile
url = "https://daten.gdz.bkg.bund.de/produkte/vg/vg5000_0101/aktuell/vg5000_01-01.utm32s.shape.ebenen.zip"
r = requests.get(url)
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall(path="../data")
boundary_germany = gpd.read_file(
"../data//vg5000_01-01.utm32s.shape.ebenen/vg5000_ebenen_0101/VG5000_STA.shp"
)
boundary_germany = boundary_germany.to_crs("epsg:4326")
fig, ax = plt.subplots(figsize=(5, 15))
srtm_plot = show(
srtm_germany,
ax=ax,
cmap="RdYlGn_r",
title="German Boundary and \n Digital Elevation Model - SRTM\n",
)
boundary_germany.plot(ax=ax, facecolor="none", edgecolor="lightgrey")
## add colorbar
dummy_colorbar = srtm_plot.get_images()[0]
fig.colorbar(dummy_colorbar, ax=ax, fraction=0.046, pad=0.04, label="Elevation [m]")
plt.show()
The mask()
function requires the coordinates of the GeoDataFrame
to be in a .json
format. We will achieve that through a custom function.
def convert_2_json(geodataframe):
"""
Convert to GeoDataFrame features to rasterio friendly format
Input is a GeoDataFrame
"""
import json
return [json.loads(geodataframe.to_json())["features"][0]["geometry"]]
coords = convert_2_json(boundary_germany)
Next, we are able to mask the raster with the loaded polygon by using the coords
object we just created using the mask()
function from rasterio.mask
.
from rasterio.mask import mask
srtm_germany_masked, srtm_germany_masked_transform = mask(
dataset=srtm_germany, shapes=coords, crop=True, nodata=-99
)
For further usage we re-project the data set into the ETRS89/LAEA coordinate reference system (European Terrestrial Reference System 1989/Lambert Azimuthal Equal-Area projection coordinate reference system) providing the EPSG identifier $3035$. We can achieve that through the modifying the meta data.
## copy the meta data from the original file
srtm_germany_meta = srtm_germany.meta.copy()
print(srtm_germany_meta)
{'driver': 'GTiff', 'dtype': 'int16', 'nodata': None, 'width': 13201, 'height': 10801, 'count': 1, 'crs': CRS.from_epsg(4326), 'transform': Affine(0.0008333333333333301, 0.0, 4.999583333333334, 0.0, -0.0008333333333333301, 56.000416666666666)}
crs = rasterio.crs.CRS({"init": "epsg:3035"})
# Define EPSG:3035 projection
crs_update = {"init": "epsg:3035"}
srtm_germany_meta.update(
{
"driver": "GTiff",
"height": srtm_germany_masked.shape[1],
"width": srtm_germany_masked.shape[2],
"transform": srtm_germany_masked_transform,
"crs": crs_update,
}
)
fig, ax = plt.subplots(figsize=(5, 15))
srtm_plot = show(
srtm_germany_masked,
ax=ax,
cmap="RdYlGn_r",
title="German Boundary and \n Digital Elevation Model - SRTM\n",
)
boundary_germany.plot(ax=ax, facecolor="none", edgecolor="lightgrey")
## add colorbar
dummy_colorbar = srtm_plot.get_images()[0]
fig.colorbar(dummy_colorbar, ax=ax, fraction=0.046, pad=0.04, label="Elevation [m]")
plt.show()
Perfect! The masking process was sucessful! We can write the masked raster file to disk using the following command.
out_dem = "../data/srtm_germany_dem.tif"
with rasterio.open(out_dem, "w", **srtm_germany_meta) as src:
src.write(srtm_germany_masked)
In this section we subset the SRTM raster data set to reflect a specific area of interest, East Germany. Therefore, we apply the rasterio.mask
module to the SRTM data and feed it with a GeoDataFrame
object containing polygon data, in our case the borders of East Germany. We can easily retrieve the Federal States borders of Germany by from the German Federal Agency for Cartography and Geodesy (under this licence). You may directly download the shapefiles here. For the purpose of this tutorial, we also provided the data here.
states = [
"Sachsen",
"Sachsen-Anhalt",
"Mecklenburg-Vorpommern",
"Brandenburg",
"Thüringen",
"Berlin",
]
# ## load Germany boundary file for masking
boundary_germany = gpd.read_file(
"../data//vg5000_01-01.utm32s.shape.ebenen/vg5000_ebenen_0101/VG5000_LAN.shp"
)
boundary_germany = boundary_germany.to_crs("epsg:4326")
# filter by East Germany states
boundary_east = boundary_germany[boundary_germany["GEN"].isin(states)]
# # combine all East Germany Polygons into one
boundary_east_new = gpd.GeoSeries(boundary_east["geometry"].unary_union)
fig, ax = plt.subplots(figsize=(5, 15))
srtm_plot = show(
srtm_germany,
ax=ax,
cmap="RdYlGn_r",
title="German Boundary and \n Digital Elevation Model - SRTM\n",
)
boundary_east_new.plot(ax=ax, facecolor="none", edgecolor="lightgrey")
## add colorbar
dummy_colorbar = srtm_plot.get_images()[0]
fig.colorbar(dummy_colorbar, ax=ax, fraction=0.046, pad=0.04, label="Elevation [m]")
plt.show()
We are ready to mask the raster file. We proceed as desribed above.
coords = convert_2_json(boundary_east_new)
srtm_east_masked, srtm_east_masked_transform = mask(
dataset=srtm_germany, shapes=coords, crop=True, nodata=-99
)
Please note that SRTM data set is projected in geographic coordinates. For further usage we re-project the data set into the UTM, Zone 33N coordinate reference system providing the EPSG identifier $32633$.
Further we make sure that all other data sets of interest share the same coordinate reference system.
## copy the meta data from the original file
srtm_east_meta = srtm_germany.meta.copy()
print(srtm_east_meta)
# Define EPSG:3035 projection
crs_update = {"init": "epsg:4326"}
srtm_east_meta.update(
{
"driver": "GTiff",
"height": srtm_east_masked.shape[1],
"width": srtm_east_masked.shape[2],
"transform": srtm_east_masked_transform,
"crs": crs_update,
}
)
{'driver': 'GTiff', 'dtype': 'int16', 'nodata': None, 'width': 13201, 'height': 10801, 'count': 1, 'crs': CRS.from_epsg(4326), 'transform': Affine(0.0008333333333333301, 0.0, 4.999583333333334, 0.0, -0.0008333333333333301, 56.000416666666666)}
boundary_germany = boundary_germany.to_crs("epsg:4326")
fig, ax = plt.subplots(figsize=(5, 17))
srtm_plot = show(
srtm_east_masked,
ax=ax,
cmap="RdYlGn_r",
title="German Boundary and \n Digital Elevation Model - SRTM\n",
)
## add colorbar
dummy_colorbar = srtm_plot.get_images()[0]
fig.colorbar(dummy_colorbar, ax=ax, fraction=0.046, pad=0.04, label="Elevation [m]")
plt.show()
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follow: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.