In this section we download and preprocess carbon dioxide (CO2) measurements taken at the Mauna Loa Observatory in Hawaii. The data is provided by Dr. Pieter Tans, NOAA/ESRL and Dr. Ralph Keeling, Scripps Institution of Oceanography and may be downloaded here.
The data describes the ongoing change in the concentration of carbon dioxide in Earth's atmosphere since the 1950s. The data collection was initiated under the supervision of Charles David Keeling. Keeling's measurements provided the first significant evidence of rapidly increasing carbon dioxide levels in the atmosphere. If the connection fails, you may download the data set here (downloaded on June 25, 2022).
# First, let's import the needed libraries.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime
To get a sense of the location, let us plot the Mauna Loa Observatory on an interactive map. You may click on the blue marker.
import folium
# Create a map, centered on the Mauna Loa Observatory
m = folium.Map(location=[19.53611, 204.4239], zoom_start=18)
# Add a marker at the Mauna Loa Observatory
folium.Marker(
    location=[19.53611, 204.4239],  # coordinates
    popup="Mauna Loa Observatory, Hawaii",  # pop-up label
).add_to(m)
# Add custom base maps to folium
basemaps = {
    "Esri Satellite": folium.TileLayer(
        tiles="https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}",
        attr="Esri",
        name="Esri Satellite",
        overlay=True,
        control=True,
    )
}
basemaps["Esri Satellite"].add_to(m)
# Display map (m)
m
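The map above passes the longitude as 204.4239, i.e. in degrees east on a 0 to 360 scale. Many mapping tools expect longitudes between -180 and 180 instead. The following is a minimal sketch of that conversion; the helper name `to_signed_longitude` is hypothetical and not part of folium.

```python
def to_signed_longitude(lon_east: float) -> float:
    """Convert a longitude given in degrees east (0 to 360) to the -180 to 180 range."""
    return ((lon_east + 180.0) % 360.0) - 180.0


# Longitude used for the Mauna Loa Observatory in the map above
mauna_loa_lon = to_signed_longitude(204.4239)
print(round(mauna_loa_lon, 4))
```

Both representations describe the same meridian, so the marker position does not change.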
current_time = datetime.now().strftime("%Y-%m-%d")
print("Date of download: ", current_time)
Date of download: 2023-04-03
Now, we want to import the .csv file from the URL.
import requests
from io import StringIO
url = "https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_mm_mlo.csv"
s = requests.get(url).text
co2_raw = pd.read_csv(
    StringIO(s), sep=",", skiprows=56, header=1
)  # skiprows skips the introductory text at the top of the file
co2_raw.columns = [
    "year",
    "month",
    "decimal date",
    "interpolated",
    "trend season corr",
    "#days",
    "sdev",
    "unc",
]
co2_raw.head(15)
| | year | month | decimal date | interpolated | trend season corr | #days | sdev | unc |
---|---|---|---|---|---|---|---|---|
0 | 1958 | 4 | 1958.2877 | 317.45 | 315.16 | -1 | -9.99 | -0.99 |
1 | 1958 | 5 | 1958.3699 | 317.51 | 314.71 | -1 | -9.99 | -0.99 |
2 | 1958 | 6 | 1958.4548 | 317.24 | 315.14 | -1 | -9.99 | -0.99 |
3 | 1958 | 7 | 1958.5370 | 315.86 | 315.18 | -1 | -9.99 | -0.99 |
4 | 1958 | 8 | 1958.6219 | 314.93 | 316.18 | -1 | -9.99 | -0.99 |
5 | 1958 | 9 | 1958.7068 | 313.20 | 316.08 | -1 | -9.99 | -0.99 |
6 | 1958 | 10 | 1958.7890 | 312.43 | 315.41 | -1 | -9.99 | -0.99 |
7 | 1958 | 11 | 1958.8740 | 313.33 | 315.20 | -1 | -9.99 | -0.99 |
8 | 1958 | 12 | 1958.9562 | 314.67 | 315.43 | -1 | -9.99 | -0.99 |
9 | 1959 | 1 | 1959.0411 | 315.58 | 315.55 | -1 | -9.99 | -0.99 |
10 | 1959 | 2 | 1959.1260 | 316.48 | 315.86 | -1 | -9.99 | -0.99 |
11 | 1959 | 3 | 1959.2027 | 316.65 | 315.38 | -1 | -9.99 | -0.99 |
12 | 1959 | 4 | 1959.2877 | 317.72 | 315.41 | -1 | -9.99 | -0.99 |
13 | 1959 | 5 | 1959.3699 | 318.29 | 315.49 | -1 | -9.99 | -0.99 |
14 | 1959 | 6 | 1959.4548 | 318.15 | 316.03 | -1 | -9.99 | -0.99 |
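A fixed skiprows value depends on the exact length of the introductory text, which may change between downloads. As a minimal sketch of an alternative, and assuming that every introductory line starts with `#` (an assumption about the file layout that is not verified here), the `comment` parameter of `pd.read_csv` can skip those lines regardless of how many there are. The sample text and its header names below are a hypothetical stand-in for the downloaded file.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the downloaded text: comment lines, a header row, data rows
sample_text = """# Introductory line one
# Introductory line two
year,month,decimal date,monthly value,trend value,ndays,sdev,unc
1958,4,1958.2877,317.45,315.16,-1,-9.99,-0.99
1958,5,1958.3699,317.51,314.71,-1,-9.99,-0.99
"""

# Lines that begin with '#' are ignored, so no line counting is needed
co2_sample = pd.read_csv(StringIO(sample_text), comment="#")
print(co2_sample.shape)
```

Note that `comment="#"` also truncates any later line at a `#` character, so this approach only fits files whose header row and data rows do not contain that character.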
The data set (co2_raw) shows carbon dioxide (CO2) measurements taken at the Mauna Loa Observatory in Hawaii. The data is provided by: Trends in Atmospheric Carbon Dioxide, Mauna Loa, Hawaii. Dr. Pieter Tans, NOAA/ESRL and Dr. Ralph Keeling, Scripps Institution of Oceanography.
co2_raw.columns
Index(['year', 'month', 'decimal date', 'interpolated', 'trend season corr', '#days', 'sdev', 'unc'], dtype='object')
At this stage the data set has 8 columns: 'year', 'month', 'decimal date', 'interpolated', 'trend season corr', '#days', 'sdev', 'unc'. A ninth column, 'Date', is added in a later step; the download shown here (2023-04-03) then comprises 779 rows and 9 columns.
The interpolated column contains the monthly mean CO2 mole fraction determined from daily averages. The mole fraction of CO2, expressed as parts per million (ppm), is the number of molecules of CO2 in every one million molecules of dried air (water vapor removed). Missing months are denoted by $-99.99$. The interpolated column includes average values from the preceding column and interpolated values where data are missing.
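Sentinel values such as $-99.99$ distort summary statistics if they are treated as real measurements. The sketch below replaces them with NaN on a small hand-made data frame. It assumes that -1, -9.99 and -0.99 in the '#days', 'sdev' and 'unc' columns, as seen in the first rows of the table above, are also missing-value markers; that reading is an assumption and is not confirmed by the text.

```python
import pandas as pd

# Small hand-made example; the second row carries the -99.99 marker
demo = pd.DataFrame(
    {
        "interpolated": [317.45, -99.99, 317.24],
        "#days": [-1, 30, 25],
        "sdev": [-9.99, 0.27, 0.52],
        "unc": [-0.99, 0.10, 0.20],
    }
)

# Assumed missing-value marker per column (see the note above)
sentinels = {"interpolated": -99.99, "#days": -1, "sdev": -9.99, "unc": -0.99}

cleaned = demo.copy()
for column, marker in sentinels.items():
    # mask() sets every entry equal to the marker to NaN
    cleaned[column] = cleaned[column].mask(cleaned[column] == marker)

print(cleaned.isna().sum())
```

Once the markers are NaN, pandas methods such as `mean()` skip them by default.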
Let us create a pd.Series object with the interpolated CO2 concentrations and the corresponding date. In the original data set the date is given by the year and the month columns. We may easily combine them by converting them to strings and joining them with the + operator.
co2_raw["Date"] = co2_raw["year"].astype(str) + "-" + co2_raw["month"].astype(str)
co2_raw["Date"] = pd.to_datetime(co2_raw["Date"], format="%Y-%m")
co2 = pd.Series(co2_raw["interpolated"].values, index=co2_raw["Date"])
type(co2)
pandas.core.series.Series
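As an alternative to string concatenation, `pd.to_datetime` can assemble dates directly from a data frame with columns named year, month and day. The sketch below uses a small hand-made frame with a few values from the table above; the names `demo` and `co2_demo` are illustrative, and the day is set to 1, which is an assumption that mirrors the first-of-month dates produced by the string-based approach.

```python
import pandas as pd

# Small hand-made example with a few values from the table above
demo = pd.DataFrame(
    {
        "year": [1958, 1958, 1959],
        "month": [4, 5, 1],
        "interpolated": [317.45, 317.51, 315.58],
    }
)

# to_datetime assembles dates from year/month/day columns; the day is fixed to 1
dates = pd.to_datetime(demo[["year", "month"]].assign(day=1))

# Build the series with the assembled dates as index
co2_demo = pd.Series(demo["interpolated"].to_numpy(), index=dates, name="co2_ppm")
print(co2_demo)
```

This avoids the intermediate string column and needs no format string.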
Once the data is captured inside a pandas.Series object, we can easily plot the data.
plt.figure(figsize=(14, 4))
co2.plot()
plt.title("Keeling Curve")
plt.ylabel("$CO_{2}$ (ppm)")
plt.show()
This characteristic graph, which shows the rising CO2 concentration over time, is often referred to as the Keeling Curve. Each year, when the terrestrial vegetation of the Northern Hemisphere expands during its productive growing phase, it removes CO2 from the atmosphere; when it dies and decomposes, it returns CO2 to the air. This phenomenon creates a seasonal oscillation in the atmosphere's CO2 concentration.
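One simple way to separate the long-term rise from the seasonal oscillation is a centered 12-month rolling mean, because averaging over a full year cancels a yearly cycle. The sketch below illustrates this on a synthetic series (a linear trend plus a sine wave with a 12-month period), not on the real measurements; the numbers 315, 0.1 and 3 are arbitrary illustration values.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend plus a 12-month seasonal cycle
months = np.arange(120)
index = pd.date_range("1958-04-01", periods=120, freq="MS")
synthetic = pd.Series(
    315.0 + 0.1 * months + 3.0 * np.sin(2 * np.pi * months / 12), index=index
)

# A centered 12-month mean averages out the yearly cycle and keeps the trend
trend = synthetic.rolling(window=12, center=True).mean()

# What remains after removing the trend is the seasonal component
seasonal = synthetic - trend
print(trend.dropna().head())
```

The first and last few months of `trend` are NaN because a full 12-month window is not available there. Applied to the co2 series, the same call would give a smoothed curve that can be plotted on top of the raw data.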
Finally, we store the time series data set in a .json file using the to_json method for further processing.
co2_raw.to_json("../data/KeelingCurve.json", date_format="iso")
To read the .json file, use the following command.
co2_raw = pd.read_json("../data/KeelingCurve.json")
co2_raw
| | year | month | decimal date | interpolated | trend season corr | #days | sdev | unc | Date |
---|---|---|---|---|---|---|---|---|---|
0 | 1958 | 4 | 1958.2877 | 317.45 | 315.16 | -1 | -9.99 | -0.99 | 1958-04-01 |
1 | 1958 | 5 | 1958.3699 | 317.51 | 314.71 | -1 | -9.99 | -0.99 | 1958-05-01 |
2 | 1958 | 6 | 1958.4548 | 317.24 | 315.14 | -1 | -9.99 | -0.99 | 1958-06-01 |
3 | 1958 | 7 | 1958.5370 | 315.86 | 315.18 | -1 | -9.99 | -0.99 | 1958-07-01 |
4 | 1958 | 8 | 1958.6219 | 314.93 | 316.18 | -1 | -9.99 | -0.99 | 1958-08-01 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
774 | 2022 | 10 | 2022.7917 | 415.78 | 419.13 | 30 | 0.27 | 0.10 | 2022-10-01 |
775 | 2022 | 11 | 2022.8750 | 417.51 | 419.51 | 25 | 0.52 | 0.20 | 2022-11-01 |
776 | 2022 | 12 | 2022.9583 | 418.95 | 419.64 | 24 | 0.50 | 0.20 | 2022-12-01 |
777 | 2023 | 1 | 2023.0417 | 419.47 | 419.14 | 31 | 0.40 | 0.14 | 2023-01-01 |
778 | 2023 | 2 | 2023.1250 | 420.41 | 419.49 | 25 | 0.64 | 0.25 | 2023-02-01 |
779 rows × 9 columns
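JSON has no native date type, so the Date column is stored as ISO-formatted text and has to be parsed again on reading. The sketch below shows the round trip in memory on a small hand-made frame, so no file is written. Passing `convert_dates=["Date"]` makes the parsing explicit instead of relying on the name-based defaults of `pd.read_json`.

```python
from io import StringIO

import pandas as pd

# Small hand-made frame with a datetime column
demo = pd.DataFrame(
    {
        "interpolated": [317.45, 317.51],
        "Date": pd.to_datetime(["1958-04-01", "1958-05-01"]),
    }
)

# Without a path, to_json returns the JSON text instead of writing a file
json_text = demo.to_json(date_format="iso")

# Explicitly ask read_json to parse the Date column back into datetimes
restored = pd.read_json(StringIO(json_text), convert_dates=["Date"])
print(restored.dtypes)
```

The same `convert_dates` argument can be passed when reading a file path.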
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.