In this section we download and preprocess carbon dioxide (CO2) measurements taken at the Mauna Loa Observatory in Hawaii. The data is provided by Dr. Pieter Tans, NOAA/ESRL and Dr. Ralph Keeling, Scripps Institution of Oceanography and may be downloaded here.

The data describes the ongoing change in the concentration of carbon dioxide in Earth’s atmosphere since the 1950s. The data collection was initiated under the supervision of Charles David Keeling. Keeling’s measurements provided the first significant evidence of rapidly increasing carbon dioxide levels in the atmosphere. If the connection fails, you may download a copy of the data set here (downloaded on June 14, 2022).

paste("Date of download: ", Sys.time())
## [1] "Date of download:  2025-10-11 16:28:58.865159"
url <- "https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_mm_mlo.txt"
# skip the first 58 lines of the file; sep = "" splits fields on any whitespace
co2_raw <- read.csv(url,
  skip = 58,
  sep = "",
  header = FALSE
)

header <- c(
  "year",
  "month",
  "decimal date",
  "monthly average",
  "de-seasonalized",
  "#days",
  "st. dev of days",
  "unc. of mon mean"
)
colnames(co2_raw) <- header
tail(co2_raw)[1:6]
##     year month decimal date monthly average de-seasonalized #days
## 789 2025     3     2025.208          428.15          426.68    27
## 790 2025     4     2025.292          429.64          427.13    23
## 791 2025     5     2025.375          430.51          427.26    23
## 792 2025     6     2025.458          429.61          427.16    26
## 793 2025     7     2025.542          427.87          427.45    24
## 794 2025     8     2025.625          425.48          427.36    24

The data set has 794 rows and 8 columns, with the following variables: year, month, decimal date, monthly average, de-seasonalized, #days, st. dev of days, unc. of mon mean.
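A fixed skip = 58 depends on the exact number of lines at the top of the file, which may change between file versions. The following is a minimal sketch of an alternative, assuming that the header lines of the file start with "#": read.table() then drops them through its comment.char argument. The toy excerpt below is illustrative only. Its comment lines are stand-ins, its data rows are copied from the output above, and the last two columns are omitted.

```r
# Toy excerpt in the same layout: '#' comment lines followed by
# whitespace-separated data rows (values copied from the output above;
# the comment text is a stand-in and the last two columns are omitted).
sample_txt <- c(
  "# stand-in header line 1",
  "# stand-in header line 2",
  "2025 6 2025.458 429.61 427.16 26",
  "2025 7 2025.542 427.87 427.45 24",
  "2025 8 2025.625 425.48 427.36 24"
)

# read.table() ignores lines starting with comment.char, so no 'skip' is needed
toy <- read.table(text = sample_txt, comment.char = "#", header = FALSE)
colnames(toy) <- c(
  "year", "month", "decimal date",
  "monthly average", "de-seasonalized", "#days"
)
dim(toy)
```

If the assumption holds for the real file, the same call with url in place of text = sample_txt should read all data rows regardless of the header length.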

The monthly average column contains the monthly mean CO2 mole fraction constructed from daily mean values. The mole fraction of CO2, expressed in parts per million (ppm), is the number of CO2 molecules in a total of one million molecules of dried air (water vapor removed).
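To make the unit concrete, here is a short arithmetic sketch that converts a value in ppm to a plain fraction and to percent. The variable names are chosen for illustration. The value is the August 2025 monthly average from the output above.

```r
ppm <- 425.48                  # Aug 2025 monthly average from the output above

mole_fraction <- ppm / 1e6     # CO2 molecules per molecule of dry air
percent <- mole_fraction * 100 # the same quantity expressed in percent

mole_fraction
percent
```

A value of 425.48 ppm therefore corresponds to roughly 0.04 % of the dry air molecules.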

Let us create an xts object with the monthly average CO2 concentrations and the corresponding dates. In the original data set, the date is given by the year and month columns. We combine them with the paste() function and convert the result with as.yearmon().

library(xts) # also attaches zoo, which provides as.yearmon()
dt <- as.yearmon(paste(co2_raw$year, co2_raw$month, sep = "-"))
co2 <- xts(co2_raw$`monthly average`, dt)
str(co2)
## An xts object on Jul 1959 / Aug 2025 containing: 
##   Data:    double [794, 1]
##   Index:   yearmon [794] (TZ: "UTC")
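If a plain Date index is preferred over yearmon, the same paste() idea works in base R. This is a sketch rather than the approach used in this section: pinning each month to its first day is an assumption made here, and the year and month vectors are a small illustrative excerpt.

```r
# Small excerpt of year/month pairs (first two and last month of the series above)
year  <- c(1959, 1959, 2025)
month <- c(7, 8, 8)

# Build "YYYY-M-01" strings and parse them; the day "01" is an assumed convention
dates <- as.Date(paste(year, month, "01", sep = "-"))
dates
class(dates)
```

Such a Date vector could be passed to xts() in place of dt, at the cost of implying a specific day for each monthly mean.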

Once the data is stored in an xts object, we can extract the values with the coredata() function and the dates with the index() function.

coredata(co2)[1:6]
## [1] 316.54 314.80 313.84 313.33 314.81 315.58
index(co2)[1:6]
## [1] "Jul 1959" "Aug 1959" "Sep 1959" "Oct 1959" "Nov 1959" "Dec 1959"

Let us plot the data.

plot.ts(index(co2), co2,
  type = "l",
  main = "Keeling Curve",
  ylab = expression("CO"[2] * " fraction in dry air (ppm)"),
  xlab = "Year"
)

This characteristic graph, which shows the rise in CO2 concentration over time, is often referred to as the Keeling Curve. Each year, when the terrestrial vegetation of the Northern Hemisphere expands with the seasons, it removes CO2 from the atmosphere during its productive growing phase and returns CO2 to the air when it dies and decomposes. This phenomenon creates a seasonal oscillation in the atmosphere’s CO2 concentration.
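The seasonal oscillation can be glimpsed directly in the table printed earlier. As a rough sketch, assuming that the difference between the monthly average and de-seasonalized columns approximates the seasonal component, the six rows from the tail(co2_raw) output give the following offsets. The variable names are chosen for illustration.

```r
# Values copied from the tail(co2_raw) output above (Mar to Aug 2025)
month           <- 3:8
monthly_average <- c(428.15, 429.64, 430.51, 429.61, 427.87, 425.48)
deseasonalized  <- c(426.68, 427.13, 427.26, 427.16, 427.45, 427.36)

# Rough seasonal offset in ppm: monthly mean minus de-seasonalized value
seasonal <- monthly_average - deseasonalized
data.frame(month, seasonal)

# Month with the largest positive offset in this excerpt
peak_month <- month[which.max(seasonal)]
peak_month
```

In this excerpt the offset is largest in May and turns negative by August, which is consistent with the Northern Hemisphere growing season drawing CO2 out of the air.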

Finally, we store the time series in an .Rdata file for further processing.

save(co2, file = "KeelingCurve.Rdata")
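For completeness, the following sketch shows how an object stored with save() can be restored with load(). To stay self-contained it writes a small stand-in vector to a temporary file. The object name x and the file name KeelingCurve_demo.Rdata are hypothetical and not part of this section.

```r
# Stand-in object: the first three CO2 values shown earlier
x <- c(316.54, 314.80, 313.84)

# Write to a temporary location so the working directory is left untouched
path <- file.path(tempdir(), "KeelingCurve_demo.Rdata")
save(x, file = path)

# Remove the object, then restore it; load() returns the restored object names
rm(x)
restored <- load(path)
restored
x
```

Note that load() recreates the object under its original name, so in a later session load("KeelingCurve.Rdata") would be expected to bring back an object called co2.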

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.