Centrography¶

Centrography is a set of descriptive statistics that provide summary descriptions for point patterns with a fokus on centrality in a pattern (location and dispersion). In this chapter, will have a closer look at three types of centrography analysis for point patterns:

  • Tendency
  • Dispersion
  • Extent and Shape

Centrography includes different measures to provide an indication of the location of a point pattern, how densely the points cluster around a center and how irregular the shape of the pattern might be.

First, we load the Berlin city data sets we created in the previous section.

In [2]:
# First, let's import the needed libraries.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import geopandas as gpd
import seaborn as sns
import contextily
In [3]:
from pointpats import centrography, PointPattern, skyum
In [4]:
# load berlin shapefile
berlin_district = gpd.read_file(
    "http://userpage.fu-berlin.de/soga/soga-py/300/30800_spatial_point_patterns/berlin_district.zip"
)
# load data from the previous section
berlin_locations = gpd.read_file(
    "http://userpage.fu-berlin.de/soga/soga-py/300/30800_spatial_point_patterns/berlin_locations.geojson"
)
berlin_random = gpd.read_file(
    "http://userpage.fu-berlin.de/soga/soga-py/300/30800_spatial_point_patterns/berlin_random.geojson"
)

berlin_regular = gpd.read_file(
    "http://userpage.fu-berlin.de/soga/soga-py/300/30800_spatial_point_patterns/berlin_regular.geojson"
)

When using the pointpats package, we will have to reshape our coodinate data and convert it to a PointPattern data type.

In [5]:
## reshape coordinates to right input format ([x1,y1], [x2,y2],....[xn, yn])
## extract coordinates from GeoPandas as array

coord_list_loc = np.array(
    [(x, y) for x, y in zip(berlin_locations.geometry.x, berlin_locations.geometry.y)]
)

## extract coordinates from GeoPandas as PointPattern
pp_loc = PointPattern(coord_list_loc)
C:\Users\mceck\miniconda3\envs\rasterdata\lib\site-packages\libpysal\cg\shapes.py:1492: FutureWarning: Objects based on the `Geometry` class will deprecated and removed in a future version of libpysal.
  warnings.warn(dep_msg, FutureWarning)
C:\Users\mceck\miniconda3\envs\rasterdata\lib\site-packages\libpysal\cg\shapes.py:1208: FutureWarning: Objects based on the `Geometry` class will deprecated and removed in a future version of libpysal.
  warnings.warn(dep_msg, FutureWarning)

Tendency¶

The Tendency deals with the center point of the two-dimensional distribution. There are several different tendency measure. Which one to use depends on our objective and data (click here for more information). Here, we will explore two examples of tendency measures.

Mean Center ($x_{mc}$,$y_{mc}$)

$$ x_{mc} = \frac{1}{n} \sum_{i=1}^n x_i$$$$ y_{mc} = \frac{1}{n} \sum_{i=1}^n y_i$$

The median center works jsut like the median measure that we already know: it represents a point where half of the data is above or below and left or right, respectively.

Euclidean Median ($x_{em}$,$y_{em}$)

$$ minf(x_{em},y_{em})=\sum_{i=1}^n \sqrt{(x_i−x_{em})^2+(y_i−y_{em})^2}$$

Euclidean Median describes the point, from which the sum of the Euclidean distances to all other points is at its minimum. It is an optimization problem and thus follow an iterative approach (Kuhn and Kuenne, 1962) to approximate.

In [6]:
mean_center = centrography.mean_center(pp_loc.points)
euclid_median = centrography.euclidean_median(pp_loc.points)
In [7]:
# scatter plot
scatterplot = sns.jointplot(
    x=berlin_locations.geometry.x,
    y=berlin_locations.geometry.y,
    data=berlin_locations,
    s=6,
    height=10,
    color="grey",
)

# mean point
scatterplot.ax_joint.scatter(
    *mean_center, color="seagreen", marker="x", s=50, label="Mean Center"
)
# marginal lines
scatterplot.ax_marg_x.axvline(mean_center[0], color="seagreen")
scatterplot.ax_marg_y.axhline(mean_center[1], color="seagreen")


######
# euclidian median
scatterplot.ax_joint.scatter(
    *euclid_median, color="red", marker=".", s=50, label="Euclidian Median"
)
scatterplot.ax_marg_x.axvline(euclid_median[0], color="red")
scatterplot.ax_marg_y.axhline(euclid_median[1], color="red")

# Legend
scatterplot.ax_joint.legend()
# Add basemap
contextily.add_basemap(
    scatterplot.ax_joint, source=contextily.providers.CartoDB.Positron
)

plt.show()

Dispersion¶

The dispersion of a point pattern can be measured using the standard distance, which provides the average distance from the point pattern center.

$$ SD = \sqrt{ \frac{\sum_{i=1}^n (x_i−x_{em})^2}{n} + \frac{\sum_{i=1}^n (y_i−y_{em})^2}{n} } $$

The standard distance is implemented in pointpats.centrography through the std_distance() function:

In [8]:
stdist_loc = centrography.std_distance(pp_loc.points)
stdist_loc
Out[8]:
5780.60268493892

On average the berlin nightlife locations are 5780 m from the center.

We can visualize the dispersion with the help of the standard deviational ellipse or standard ellipse. This ellipse depicts the center, dispersion and orientation of the point pattern.

In [9]:
## calculate axis of ellipse

axis_major, axis_minor, rotation = centrography.ellipse(pp_loc.points)
In [10]:
from matplotlib.patches import Ellipse

f, ax = plt.subplots(1, figsize=(9, 9))

# Scatterplot + Mean Center + Euclidean Median
ax.scatter(
    x=berlin_locations.geometry.x, y=berlin_locations.geometry.y, s=2, color="grey"
)
ax.scatter(*mean_center, color="seagreen", marker="x", label="Mean Center")
ax.scatter(*euclid_median, color="red", marker="o", label="Euclidean Median")

# standard ellipse using matplotlib
ellipse = Ellipse(
    xy=mean_center,  # center the ellipse on our mean center
    width=axis_major * 2,  # centrography.ellipse only gives half the axis
    height=axis_minor * 2,
    angle=np.rad2deg(rotation),  #  convert Angles in degrees, not radians
    facecolor="none",
    edgecolor="red",
    linestyle="--",
    label="standard ellipse",
)
ax.add_patch(ellipse)

ax.legend()

contextily.add_basemap(ax, source=contextily.providers.CartoDB.Positron)

plt.show()

Shape and Extent¶

The extent of a point cloud can be described using four shapes. Each of them reflects varying levels of how tightly the point in a pattern are bound together. If you want to learn more about shape and extent see here for more information on different shapes, which this page is based on or here for more information on shape analysis.

Convex Hull

The convex hull of a point pattern is the smallest/ tightest convex shape. It comes with some characteristics, since it has no indentations, dips, pinnacles or holes and all interior angles are smaller than 180°.

The convex hull is implemented in pointpats.centrography through the hull() function.

In [11]:
convex_hull = centrography.hull(coord_list_loc)

Set hull = True in PointPattern class method plot to easily plot the convex hull.

Minimum Bounding Rectangle

Minimum Bounding Rectangle is the tightest rectangle that can be drawn around the pattern that contains all of points, which makes it almost always bigger than convex hull. There are two kinds of Minimum Bounding Rectangles. First, lets look at the minimum bounding rectangle that can be drawn by using vertical and horizontal lines.

The Minimum Bounding Rectangle is implemented in pointpats.centrography through the minimum_bounding_rectangle() function. It calculates maximum left, right, up and down value of the vertices of minimum bounding rectangle.

In [12]:
mbr = centrography.minimum_bounding_rectangle(coord_list_loc)
mbr
Out[12]:
(780782.004452564, 5813040.7318415325, 814411.719518311, 5841533.01896305)

Set window = True in PointPattern class method plot to easily plot the Minimum Bounding Rectangle.

In [13]:
plt.figure(figsize=(9, 9))
pp_loc.plot(
    title="Berlin Nightlife Locations", hull=True, window=True
)  # plot point pattern "pp", convex hull, and Minimum Bounding Rectangle
plt.scatter(*mean_center, color="green", marker="x", label="Mean Center")
plt.scatter(*euclid_median, color="red", marker="+", label="Euclidean Median")


plt.legend()
plt.show()
<Figure size 900x900 with 0 Axes>

Minimum Rotated Rectangle

Rotating the outline of a rectangle might yield in a tighter rectangular bound around the point pattern.

The Minimum Rotated Rectangle is implemented in pointpats.centrography through the minimum_rotated_rectangle() function.

In [14]:
mrr = centrography.minimum_rotated_rectangle(coord_list_loc)
mrr
Out[14]:
array([[ 778003.4 , 5838746.5 ],
       [ 780982.5 , 5811185.5 ],
       [ 815061.9 , 5814869.5 ],
       [ 812082.75, 5842430.5 ]], dtype=float32)

Minimum Bounding Circle

The Minimum Bounding Circle is the smallest circle that can be drawn around the point pattern.

The Minimum Bounding Circle is implemented in pointpats.centrography through the minimum_bounding_circle() function.

In [15]:
coord_list_loc
Out[15]:
array([[ 793357.02215207, 5832557.61648877],
       [ 797981.40855913, 5827060.34345127],
       [ 797927.06102323, 5827157.15705207],
       ...,
       [ 800919.32625033, 5826552.44394573],
       [ 797391.9179632 , 5831688.91623683],
       [ 795508.68296604, 5824958.26911632]])
In [16]:
# (center_x, center_y), radius = centrography.minimum_bounding_circle(coord_list_loc)

Now we will plot all of these bounding boxes together using matplotlib.patches

In [17]:
from matplotlib.patches import Polygon, Circle, Rectangle
In [18]:
# convex hull
convex_hull_patch = Polygon(
    convex_hull,
    closed=True,
    edgecolor="navy",
    facecolor="none",
    linestyle=":",
    linewidth=2,
    label="Convex Hull",
)

# minimum bounding rectangle
min_rect_patch = Rectangle(
    mbr[0:2],
    width=mbr[2] - mbr[0],
    height=mbr[2] - mbr[0],
    edgecolor="magenta",
    facecolor="none",
    linestyle="dashed",
    linewidth=2,
    label="Minimum Bounding Rectangle",
)

## minimum rotatad rect

min_rot_rect_patch = Polygon(
    mrr,
    closed=True,
    edgecolor="limegreen",
    facecolor="none",
    linestyle="dashed",
    label="Minimum Rotated Rectangle",
    linewidth=2,
)

# minimum bounding circle
# circle_patch = Circle(
#     (center_x, center_y),
#     radius=radius,
#     edgecolor='crimson',
#     facecolor='none',
#     linewidth=2,
#     label='Min Bounding Circle'
# )
In [19]:
f, ax = plt.subplots(1, figsize=(10, 10))

ax.add_patch(convex_hull_patch)
ax.add_patch(min_rot_rect_patch)
ax.add_patch(min_rect_patch)
# ax.add_patch(circle_patch)

ax.scatter(
    x=berlin_locations.geometry.x, y=berlin_locations.geometry.y, s=2, color="grey"
)

ax.legend()

# Add basemap
contextily.add_basemap(ax, source=contextily.providers.CartoDB.Positron)
plt.show()

The broadest outlines are the minimum bounding rectangle and circle, as they include an area much bigger than the point pattern itself. One they are also quite simple to understand and to draw.


Citation

The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us via mail by soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follow: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.