307032_Modern_carbon_dioxide

In this section we download and preprocess carbon dioxide (CO₂) measurements taken at the Mauna Loa Observatory in Hawaii. The data is provided by Dr. Pieter Tans, NOAA/ESRL and Dr. Ralph Keeling, Scripps Institution of Oceanography and may be downloaded here.

The data describes the ongoing change in the concentration of carbon dioxide in Earth's atmosphere since the 1950s. The data collection was initiated under the supervision of Charles David Keeling. Keeling's measurements provided the first significant evidence of rapidly increasing carbon dioxide levels in the atmosphere. If the connection fails, you may download the data set here (downloaded on June 25, 2022).

In [2]:

# First, let's import the needed libraries.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

from datetime import datetime

Let us get familiar with the location by plotting the Mauna Loa Observatory on an interactive map. You may click on the blue marker.

In [3]:

import folium

# Create a map, centered on the Mauna Loa Observatory
# (longitude 204.4239 is given in degrees east, i.e. 155.5761° W)
m = folium.Map(location=[19.53611, 204.4239], zoom_start=18)

# Add a marker at the Mauna Loa Observatory
folium.Marker(
    location=[19.53611, 204.4239],  # coordinates (latitude, longitude)
    popup="Mauna Loa Observatory, Hawaii",  # pop-up label
).add_to(m)


# Add custom base maps to folium
basemaps = {
    "Esri Satellite": folium.TileLayer(
        tiles="https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}",
        attr="Esri",
        name="Esri Satellite",
        overlay=True,
        control=True,
    )
}

basemaps["Esri Satellite"].add_to(m)


# Display map (m)
m
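The map cell above passes the longitude as 204.4239, which is a degrees-east value on a 0 to 360 scale. Many tools expect signed longitudes between -180 and 180 instead. The following minimal sketch shows the conversion; the helper name `wrap_longitude` is hypothetical and not part of folium.

```python
def wrap_longitude(lon_deg: float) -> float:
    """Map a longitude in degrees to the interval [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


lon_east = 204.4239  # value used in the map cell above
lon_signed = wrap_longitude(lon_east)

print(round(lon_signed, 4))  # -155.5761
```

Both values describe the same meridian, so either form points at the observatory.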


In [4]:

current_time = datetime.now().strftime("%Y-%m-%d")
print("Date of download: ", current_time)

Date of download:  2023-04-03

Now, we want to import the .csv file from the URL.

In [5]:

import requests
from io import StringIO


url = "https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_mm_mlo.csv"
s = requests.get(url).text

co2_raw = pd.read_csv(
    StringIO(s), sep=",", skiprows=56, header=1
)  # skiprows skips the introductory text (the first 56 lines)
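A fixed `skiprows` value breaks when the length of the introductory text changes. As a hedged alternative, the sketch below assumes that every introductory line starts with `#` and lets pandas skip such lines through the `comment` parameter. The text in `sample_text` is a hypothetical miniature of the file layout, including its header names, and not the real download.

```python
from io import StringIO

import pandas as pd

# Hypothetical miniature of the file: '#' comment lines, one header line, data rows
sample_text = """# illustrative comment line 1
# illustrative comment line 2
year,month,decimal date,average,deseasonalized,ndays,sdev,unc
1958,4,1958.2877,317.45,315.16,-1,-9.99,-0.99
1958,5,1958.3699,317.51,314.71,-1,-9.99,-0.99
"""

# Lines starting with '#' are ignored; the first remaining line becomes the header
co2_sample = pd.read_csv(StringIO(sample_text), comment="#")

print(co2_sample.shape)  # (2, 8)
```

With this approach no line count has to be maintained by hand, provided the assumption about the `#` prefix holds for the actual file.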

In [6]:

co2_raw.columns = [
    "year",
    "month",
    "decimal date",
    "interpolated",
    "trend season corr",
    "#days",
    "sdev",
    "unc",
]

co2_raw.head(15)

Out[6]:

	year	month	decimal date	interpolated	trend season corr	#days	sdev	unc
0	1958	4	1958.2877	317.45	315.16	-1	-9.99	-0.99
1	1958	5	1958.3699	317.51	314.71	-1	-9.99	-0.99
2	1958	6	1958.4548	317.24	315.14	-1	-9.99	-0.99
3	1958	7	1958.5370	315.86	315.18	-1	-9.99	-0.99
4	1958	8	1958.6219	314.93	316.18	-1	-9.99	-0.99
5	1958	9	1958.7068	313.20	316.08	-1	-9.99	-0.99
6	1958	10	1958.7890	312.43	315.41	-1	-9.99	-0.99
7	1958	11	1958.8740	313.33	315.20	-1	-9.99	-0.99
8	1958	12	1958.9562	314.67	315.43	-1	-9.99	-0.99
9	1959	1	1959.0411	315.58	315.55	-1	-9.99	-0.99
10	1959	2	1959.1260	316.48	315.86	-1	-9.99	-0.99
11	1959	3	1959.2027	316.65	315.38	-1	-9.99	-0.99
12	1959	4	1959.2877	317.72	315.41	-1	-9.99	-0.99
13	1959	5	1959.3699	318.29	315.49	-1	-9.99	-0.99
14	1959	6	1959.4548	318.15	316.03	-1	-9.99	-0.99

The data set (co2_raw) shows carbon dioxide (CO₂) measurements taken at the Mauna Loa Observatory in Hawaii. The data is provided by: Trends in Atmospheric Carbon Dioxide, Mauna Loa, Hawaii. Dr. Pieter Tans, NOAA/ESRL and Dr. Ralph Keeling, Scripps Institution of Oceanography.

In [7]:

co2_raw.columns

Out[7]:

Index(['year', 'month', 'decimal date', 'interpolated', 'trend season corr',
       '#days', 'sdev', 'unc'],
      dtype='object')

At the time of download (2023-04-03), the data set has 779 rows and 8 columns, with the following variables: 'year', 'month', 'decimal date', 'interpolated', 'trend season corr', '#days', 'sdev', 'unc'. A ninth column, 'Date', is added in the next step.

The interpolated column contains the monthly mean CO₂ mole fraction determined from daily averages. The mole fraction of CO₂, expressed in parts per million (ppm), is the number of molecules of CO₂ in every one million molecules of dried air (water vapor removed). Missing months are denoted by $-99.99$. The interpolated column includes average values from the preceding column and interpolated values where data are missing.
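In the table above, the early rows carry negative entries in the #days, sdev and unc columns (-1, -9.99 and -0.99). Assuming these are placeholders for unavailable values, which the text does not state explicitly, they can be masked before computing statistics. The sketch below uses three rows copied from the tables in this section; the names `placeholders` and `cleaned` are illustrative.

```python
import numpy as np
import pandas as pd

# Three rows copied from the tables in this section
sample = pd.DataFrame(
    {
        "year": [1958, 1958, 2023],
        "month": [4, 5, 2],
        "interpolated": [317.45, 317.51, 420.41],
        "#days": [-1, -1, 25],
        "sdev": [-9.99, -9.99, 0.64],
        "unc": [-0.99, -0.99, 0.25],
    }
)

# Assumed placeholder value per column (taken from the table, not from documentation)
placeholders = {"#days": -1, "sdev": -9.99, "unc": -0.99}

cleaned = sample.copy()
for column, flag in placeholders.items():
    # Replace entries equal to the placeholder with NaN
    is_flag = np.isclose(cleaned[column], flag)
    cleaned[column] = cleaned[column].mask(is_flag)

print(cleaned)
```

After masking, aggregations such as `cleaned["sdev"].mean()` ignore the placeholder rows instead of being pulled toward negative values.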

Let us create a pd.Series object with the interpolated CO₂ concentrations and the corresponding dates. In the original data set the date is given by the year and the month. We can combine the two columns by converting them to strings and concatenating them with the + operator.

In [8]:

co2_raw["Date"] = co2_raw["year"].astype(str) + "-" + co2_raw["month"].astype(str)
co2_raw["Date"] = pd.to_datetime(co2_raw["Date"], format="%Y-%m")

In [9]:

co2 = pd.Series(co2_raw["interpolated"].values, index=co2_raw["Date"])
type(co2)

Out[9]:

pandas.core.series.Series
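As a possible alternative to string concatenation, `pd.to_datetime` also accepts a data frame with `year`, `month` and `day` columns. The sketch below applies this to three rows copied from the table above, setting the day to 1 as an assumption, since the data is monthly. The variable names are illustrative.

```python
import pandas as pd

# Three rows copied from the table above
sample = pd.DataFrame(
    {
        "year": [1958, 1958, 1959],
        "month": [4, 12, 1],
        "interpolated": [317.45, 314.67, 315.58],
    }
)

# Assemble dates from the numeric columns; day=1 is an assumed convention
dates = pd.to_datetime(sample[["year", "month"]].assign(day=1))

# Build the time series with a named DatetimeIndex
co2_sample = pd.Series(
    sample["interpolated"].to_numpy(),
    index=pd.DatetimeIndex(dates, name="Date"),
)

print(co2_sample)
```

This avoids the intermediate string column and yields the same first-of-month timestamps.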

Once the data is captured inside a pandas.Series object, we can easily plot the data.

In [10]:

plt.figure(figsize=(14, 4))
co2.plot()

plt.title("Keeling Curve")
plt.ylabel("$CO_{2}$ (ppm)")
plt.show()

This characteristic graph, showing the rising CO₂ concentration over time, is often referred to as the Keeling Curve. Each year, when the terrestrial vegetation of the Northern Hemisphere expands with the seasons, it removes CO₂ from the atmosphere during its productive growing phase, and it returns CO₂ to the air when it dies and decomposes. This phenomenon creates a seasonal oscillation in the atmosphere's CO₂ concentration.
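One simple way to separate the long-term rise from the seasonal oscillation is a 12-month rolling mean, which averages over one full yearly cycle. The sketch below illustrates the idea on synthetic data (a linear trend plus a sinusoid), not on the Mauna Loa measurements; all numbers in it are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a monthly CO2 series (NOT the Mauna Loa data):
# a linear rise of 0.15 ppm per month plus a 12-month cycle of amplitude 3 ppm
dates = pd.date_range("2000-01-01", periods=120, freq="MS")
t = np.arange(120)
synthetic = pd.Series(
    370.0 + 0.15 * t + 3.0 * np.sin(2 * np.pi * t / 12), index=dates
)

# Trailing 12-month mean: the sinusoid averages out over a full cycle
trend = synthetic.rolling(window=12).mean()

# What remains after removing the smoothed trend is the seasonal part
# (shifted by a constant, because the trailing window lags the trend)
seasonal = synthetic - trend
peak_to_peak = seasonal.max() - seasonal.min()

print(trend.dropna().head(3))
print(peak_to_peak)
```

The same two lines, `rolling(window=12).mean()` and the subtraction, could be applied to the `co2` series from above; real data will not separate as cleanly as this idealized example.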

Finally, we store the time series data set in a .json file using the to_json method for further processing.

In [11]:

co2_raw.to_json("../data/KeelingCurve.json", date_format="iso")

To read the .json file, use the following command.

In [12]:

co2_raw = pd.read_json("../data/KeelingCurve.json")
co2_raw

Out[12]:

	year	month	decimal date	interpolated	trend season corr	#days	sdev	unc	Date
0	1958	4	1958.2877	317.45	315.16	-1	-9.99	-0.99	1958-04-01
1	1958	5	1958.3699	317.51	314.71	-1	-9.99	-0.99	1958-05-01
2	1958	6	1958.4548	317.24	315.14	-1	-9.99	-0.99	1958-06-01
3	1958	7	1958.5370	315.86	315.18	-1	-9.99	-0.99	1958-07-01
4	1958	8	1958.6219	314.93	316.18	-1	-9.99	-0.99	1958-08-01
...	...	...	...	...	...	...	...	...	...
774	2022	10	2022.7917	415.78	419.13	30	0.27	0.10	2022-10-01
775	2022	11	2022.8750	417.51	419.51	25	0.52	0.20	2022-11-01
776	2022	12	2022.9583	418.95	419.64	24	0.50	0.20	2022-12-01
777	2023	1	2023.0417	419.47	419.14	31	0.40	0.14	2023-01-01
778	2023	2	2023.1250	420.41	419.49	25	0.64	0.25	2023-02-01

779 rows × 9 columns
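When reading the file back, it can be worth checking that the Date column is restored as a datetime type rather than as text. The sketch below performs the round trip in memory on two rows copied from the table above. Passing `convert_dates=["Date"]` makes the conversion explicit; this is a precaution and not something the original cell requires.

```python
from io import StringIO

import pandas as pd

# Two rows copied from the table above
sample = pd.DataFrame(
    {
        "year": [1958, 1958],
        "month": [4, 5],
        "interpolated": [317.45, 317.51],
        "Date": pd.to_datetime(["1958-04-01", "1958-05-01"]),
    }
)

# Serialize to a JSON string with ISO 8601 dates (no file is written here)
json_text = sample.to_json(date_format="iso")

# Read it back and request date parsing for the Date column explicitly
restored = pd.read_json(StringIO(json_text), convert_dates=["Date"])

print(restored.dtypes)
```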

Citation

The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.