The nearest neighbor interpolation (NNI) is a very simple interpolation approach. Recall that interpolation seeks for a given sample of observations $\{z_1, z_2,..., z_n\}$ at locations $\{x_1, x_2,..., x_n\}$ to estimate the value $z$ at some new point $\mathbf x$.
The NNI approach uses the value $z_i$ that is closest to $\mathbf x$. Thus, the algorithm seeks to find $i$ such that $\vert \mathbf x_i - \mathbf x \vert$ is minimized, then the estimate of $z$ is $z_i$.
It should be noted that the set of closest points to $\mathbf x_i$ forms a Thiessen poylgon, also known as Voronoi polygon. The technique used to compute these polygons is referred to as Delauny triangulation. Such a particular polygon corresponds to a region consisting of all points closer to $\mathbf x_i$ than to any other $\mathbf x$.
# First, let's import the needed libraries.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
import random
In Python we use the scipy.spatial
module, which provides the Voronoi()
function to create Voronoi polygons for a set of points. The input points data set consists of coordinates of points (x
and y
coordinates), which is typically a two dimensional array. The voronoi_plot_2d()
function allows us to plot the given Voronoi diagram in 2-D.
In order to showcase the function we generate 25 random data points, build Thiessen poylgons and plot them.
random.seed(10) # set seed for reproducibility
n = 25 # number of points
x = np.random.uniform(-1, 1, n)
y = np.random.uniform(-1, 1, n)
m = list(zip(x, y))
from scipy.spatial import Voronoi, voronoi_plot_2d
vor = Voronoi(m)
fig = voronoi_plot_2d(
vor,
show_vertices=False,
line_colors="black",
line_alpha=1,
)
plt.plot(x, y, "o", color="black", zorder=2)
plt.title("Thiessen poylgons for \nrandom data points")
plt.xticks([])
plt.yticks([])
plt.show()
It is fairly straight forward to apply this approach on a real data set. Let us compute the Voronoi polygons for the weather station data set East_Germany.csv
, in particular for the mean annual temperature and the mean annual rainfall in East Germany. We will make use fo the geovoronoi
package.
data_east = pd.read_csv(
"http://userpage.fu-berlin.de/soga/soga-py/300/30900_spatial_interpolation/East_Germany.csv"
)
data_east = gpd.GeoDataFrame(
data_east,
geometry=gpd.points_from_xy(data_east.longitude, data_east.latitude),
crs=4326,
)
from shapely.ops import cascaded_union
from geovoronoi.plotting import subplot_for_map, plot_voronoi_polys_with_points_in_area
from geovoronoi import voronoi_regions_from_coords, points_to_coords
For some context, we will first plot the weather stations of East Germany and the boundaries of the states of East Germany on a map. For the shapefiles of Germany we rely on shapefiles provided by the German Federal Agency for Cartography and Geodesy (under this licence). You may directly download the shapefiles here. For the purpose of this tutorial, we also provided the data here.
states = [
"Sachsen",
"Sachsen-Anhalt",
"Mecklenburg-Vorpommern",
"Brandenburg",
"Thüringen",
"Berlin",
]
# ## load Germany boundary file for masking
boundary_germany = gpd.read_file(
"../data//vg5000_01-01.utm32s.shape.ebenen/vg5000_ebenen_0101/VG5000_LAN.shp"
)
boundary_germany = boundary_germany.to_crs("epsg:4326")
# filter by East Germany states
boundary_east = boundary_germany[boundary_germany["GEN"].isin(states)]
# # combine all East Germany Polygons into one
boundary_east_new = gpd.GeoSeries(boundary_east["geometry"].unary_union)
fig, ax = plt.subplots(figsize=(6, 5))
boundary_east.plot(ax=ax, color="none")
data_east.plot(ax=ax, markersize=3.5, color="black")
plt.axis("equal")
plt.show()
We have to make sure that our data is in an appropriate data format for the Voronoi()
function. Therefore, we will convert the coordinate columns (longitude
and latitude
) and the climate data (Temperature
and Rainfall
) into a NumPy
arrays.
points = list(zip(data_east.longitude, data_east.latitude))
temp = data_east.Temperature.values
vor = Voronoi(points)
Now we are able to calculate Voronoi polygons by using voronoi_plot_2d()
from scipy.spatial
.
## built colormap
import matplotlib as mpl
import matplotlib.cm as cm
norm = mpl.colors.Normalize(vmin=min(temp), vmax=max(temp), clip=True)
mapper = cm.ScalarMappable(norm=norm, cmap=cm.viridis)
## code is inspired by:
## https://stackoverflow.com/questions/56904546/how-to-add-information-to-a-scipy-spatial-voronoi-plot
from matplotlib import gridspec
norm = mpl.colors.Normalize(vmin=min(temp), vmax=max(temp), clip=True)
mapper = cm.ScalarMappable(norm=norm, cmap=cm.viridis)
# create the figure with set figure size
fig = plt.figure(figsize=(7, 7))
# puts a grid on the figure
gs = gridspec.GridSpec(14, 10)
# creates two subplots
ax = plt.subplot2grid((14, 20), (0, 15), colspan=1, rowspan=14)
ax2 = plt.subplot2grid((14, 20), (0, 0), colspan=14, rowspan=14)
# creates a colourbar on the first subplot
cb1 = mpl.colorbar.ColorbarBase(ax, cmap=cm.viridis, norm=norm, orientation="vertical")
mpl.rcParams.update({"font.size": 10})
# plots the voronoi diagram on the second subplot
voronoi_plot_2d(
vor,
show_points=True,
show_vertices=False,
s=1,
line_colors="darkgrey",
line_alpha=1,
ax=ax2,
)
# colours the voronoi cells
for r in range(len(vor.point_region)):
region = vor.regions[vor.point_region[r]]
if not -1 in region:
polygon = [vor.vertices[i] for i in region]
plt.fill(*zip(*polygon), color=mapper.to_rgba(temp[r]))
## add base map
boundary_east.plot(ax=ax2, color="none", edgecolor="black")
plt.title("Mean annual temperature [°C]")
plt.show()
Repeat the procedure for the rainfall data.
points = list(zip(data_east.longitude, data_east.latitude))
rain = data_east.Rainfall.values
vor = Voronoi(points)
## code is inspired by:
## https://stackoverflow.com/questions/56904546/how-to-add-information-to-a-scipy-spatial-voronoi-plot
norm = mpl.colors.Normalize(vmin=min(rain), vmax=max(rain), clip=True)
mapper = cm.ScalarMappable(norm=norm, cmap=cm.viridis)
# create the figure with set figure size
fig = plt.figure(figsize=(7, 7))
# puts a grid on the figure
gs = gridspec.GridSpec(14, 10)
# creates two subplots
ax = plt.subplot2grid((14, 20), (0, 15), colspan=1, rowspan=14)
ax2 = plt.subplot2grid((14, 20), (0, 0), colspan=14, rowspan=14)
# creates a colourbar on the first subplot
cb1 = mpl.colorbar.ColorbarBase(ax, cmap=cm.viridis, norm=norm, orientation="vertical")
mpl.rcParams.update({"font.size": 10})
# plots the voronoi diagram on the second subplot
voronoi_plot_2d(
vor,
show_points=True,
show_vertices=False,
s=1,
line_colors="darkgrey",
line_alpha=1,
ax=ax2,
)
# colours the voronoi cells
for r in range(len(vor.point_region)):
region = vor.regions[vor.point_region[r]]
if not -1 in region:
polygon = [vor.vertices[i] for i in region]
plt.fill(*zip(*polygon), color=mapper.to_rgba(rain[r]))
## add base map
boundary_east.plot(ax=ax2, color="none", edgecolor="black")
plt.title("Mean annual rainfall [mm]")
plt.show()
Looking at the plots is becomes obvious that a problematic aspect of this approach is that the interpolated surfaces ares discontinuous, and these discontinuities are an artifact of the locations of the samples (Brundson and Comber 2015).
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follow: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.