Working with time series data is relatively straightforward in R. However, one issue needs special attention. R is an object-oriented programming language. This means that we have to be aware of the data representation, referred to as the object class, because this representation dictates which functions are available for loading, processing, analyzing, printing and plotting our data.

The core object for holding data in R is the data.frame. The data.frame, however, is not designed to work with time series data efficiently. Fortunately, there are several R packages, such as zoo, xts, lubridate and forecast, among others, with functions for creating, manipulating and visualizing time, date and time series objects. However, along with the variety of available packages comes a variety of different object classes and data representations.

The base distribution of R includes a time series class called ts. This object class is widely used to represent time series data; however, its associated functions are limited in scope and power. These limitations led to the emergence of many third-party packages that either extend the functionality of the ts object class or replace it with their own object classes and forms of data representation.
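As a minimal sketch of the ts class mentioned above, the following creates a short monthly series with the base R function ts(). The values and the variable name myTs are made up for illustration.

```r
# Hypothetical monthly values, for illustration only
values <- c(12.1, 13.4, 11.8, 14.2, 15.0, 13.7)

# Monthly series starting in January 2021 (12 observations per year)
myTs <- ts(values, start = c(2021, 1), frequency = 12)

class(myTs)
## [1] "ts"
frequency(myTs)
## [1] 12
end(myTs)
## [1] 2021    6

# Subset by time rather than by position: March to May 2021
window(myTs, start = c(2021, 3), end = c(2021, 5))
```

A ts object stores only the start, end and frequency of the series, not one time stamp per observation, which is one reason it is restricted to regularly spaced data.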

An excellent overview of the functionality, applications and available R packages for time series analysis is given by the CRAN Task View on Time Series Analysis. At the time of writing, 342 R packages dealt with, or were at least associated with, time series analysis.

In the subsequent section we will deal mostly with the following packages:

Please be aware that the functions we apply in the subsequent sections may be implemented in other packages too. Depending on your particular application, it might be useful to look at other packages as well.


The Date class

The base R Date class handles dates without times. By default, the Date class represents dates internally as the number of days since January 1, 1970. The as.Date() function allows us to create Date objects from a character string. The default format is “YYYY/m/d” or “YYYY-m-d”.

myDate <- as.Date("2021/11/6")
myDate
## [1] "2021-11-06"
class(myDate)
## [1] "Date"
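The internal day count can be inspected directly, which also explains why arithmetic on Date objects works in units of days. The following is a small sketch; the variable name daysToXmas is chosen here for illustration.

```r
myDate <- as.Date("2021-11-06")

# Stripping the class reveals the underlying day count since 1970-01-01
unclass(myDate)
## [1] 18937
unclass(as.Date("1970-01-01"))
## [1] 0

# Adding a number shifts the date by that many days
myDate + 30
## [1] "2021-12-06"

# Subtracting two dates gives a difference in days
daysToXmas <- as.Date("2021-12-24") - myDate
as.numeric(daysToXmas)
## [1] 48
```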

The additional argument format allows more flexibility in creating Date objects.

as.Date("12/31/1999", format = "%m/%d/%Y")
## [1] "1999-12-31"
as.Date("April 13, 1978", format = "%B %d, %Y")
## [1] "1978-04-13"
as.Date("25JAN17", format = "%d%b%y")
## [1] "2017-01-25"
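If a string does not match the given format, as.Date() does not raise an error but returns NA, so it is worth checking the result. The sketch below uses made-up input strings. Note also that month names matched by %b and %B depend on the locale of the R session, so the examples with "April" and "JAN" assume an English locale.

```r
# Day and month are swapped relative to the format: month "31" does not exist
as.Date("31/12/1999", format = "%m/%d/%Y")
## [1] NA

# Identify the entries that failed to convert
rawDates <- c("12/31/1999", "31/12/1999", "01/15/2000")
parsedDates <- as.Date(rawDates, format = "%m/%d/%Y")
rawDates[is.na(parsedDates)]
## [1] "31/12/1999"
```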

The standard date format codes are given in the table below:

\[ \begin{array}{|c|l|} \hline \text{Code} & \text{Value} \\ \hline \mathtt{\%d} & \text{Day of the month (number)} \\ \mathtt{\%m} & \text{Month (number)} \\ \mathtt{\%b} & \text{Month (abbreviated)} \\ \mathtt{\%B} & \text{Month (full name)} \\ \mathtt{\%y} & \text{Year (2 digit)} \\ \mathtt{\%Y} & \text{Year (4 digit)} \\ \hline \end{array} \]

The format() function is used to extract a component from the Date object.

myDate
## [1] "2021-11-06"
format(myDate, "%Y")
## [1] "2021"
as.numeric(format(myDate, "%Y"))
## [1] 2021

In addition, the weekdays(), months() and quarters() functions can be used to extract specific components of Date objects.

weekdays(myDate)
## [1] "Saturday"
months(myDate)
## [1] "November"
quarters(myDate)
## [1] "Q4"

A sequence of dates can be created with the function seq(). In this case we need to specify the starting date (from), the ending date (to) and the increment (by) of the sequence. The by increment is a character string containing one of “day”, “week”, “month”, “quarter” or “year”, optionally preceded by a (positive or negative) integer and a space.

seq(
  from = as.Date("2021/6/1"),
  to = as.Date("2021/7/31"),
  by = "1 week"
)
## [1] "2021-06-01" "2021-06-08" "2021-06-15" "2021-06-22" "2021-06-29"
## [6] "2021-07-06" "2021-07-13" "2021-07-20" "2021-07-27"
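As a variation on the call above, the end date can be replaced by the length.out argument, and a negative increment steps backwards in time. The dates are chosen for illustration only.

```r
# Four month starts, using length.out instead of an end date
monthStarts <- seq(from = as.Date("2021-01-01"), by = "month", length.out = 4)
monthStarts
## [1] "2021-01-01" "2021-02-01" "2021-03-01" "2021-04-01"

# Stepping backwards in two-week increments
backwards <- seq(from = as.Date("2021-07-27"), by = "-2 weeks", length.out = 3)
backwards
## [1] "2021-07-27" "2021-07-13" "2021-06-29"
```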

The POSIXt classes

The base R POSIXt classes allow for dates and times with control of time zones. There are two POSIXt subclasses available in R: POSIXct and POSIXlt. The POSIXct class represents date-time values as the signed number of seconds since midnight UTC (Coordinated Universal Time) on 1970-01-01. The POSIXlt class represents date-time values as a named list with elements for the second (sec), minute (min), hour (hour), day of the month (mday), month (mon, counted from 0 for January), year (year, counted from 1900), day of the week (wday, starting at 0 for Sunday), day of the year (yday, starting at 0) and daylight saving time flag (isdst).

The as.POSIXct() function allows us to create POSIXct objects from a character string representation of a date-time. The default format of the date-time is “YYYY-mm-dd hh:mm:ss” or “YYYY/mm/dd hh:mm:ss”; the seconds, or the entire time part, may be omitted.

myDateTime <- "2021-11-06 22:10:35"
myDateTime
## [1] "2021-11-06 22:10:35"
as.POSIXct(myDateTime)
## [1] "2021-11-06 22:10:35 CET"
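The seconds-based representation of POSIXct can be made visible with as.numeric(). The sketch below fixes the time zone to UTC so that the result does not depend on the local system; the variable name oneDay is chosen for illustration.

```r
# Exactly one day after the epoch, expressed in UTC
oneDay <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")
as.numeric(oneDay)
## [1] 86400

# Arithmetic on POSIXct objects is therefore in seconds
oneDay + 3600
## [1] "1970-01-02 01:00:00 UTC"
```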

If the optional tz argument is not given, the date-time is interpreted in the local time zone of the system, as reported by the Sys.timezone() function.

Sys.timezone()
## [1] "Europe/Berlin"
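To make results reproducible across systems, the time zone can be passed explicitly. The following sketch interprets the same clock time in two time zones and shows that the resulting instants differ; it assumes the time zone name "Europe/Berlin" is available in the system's time zone database.

```r
# The same clock time interpreted in two different time zones
berlin <- as.POSIXct("2021-11-06 22:10:35", tz = "Europe/Berlin")
utc <- as.POSIXct("2021-11-06 22:10:35", tz = "UTC")

# In November Berlin is one hour ahead of UTC, so the two instants differ
difftime(utc, berlin, units = "hours")
## Time difference of 1 hours

# Display an instant in another time zone without changing it
format(berlin, tz = "UTC", usetz = TRUE)
## [1] "2021-11-06 21:10:35 UTC"
```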

Again, the optional format argument is used if the date‐time string is not in the default format.

as.POSIXct("30-6-2021 23:25", format = "%d-%m-%Y %H:%M")
## [1] "2021-06-30 23:25:00 CEST"

The most common set of format codes for representing character date-times are listed in the help file for the function strptime() (type help(strptime) into your console).

A POSIXlt object can be created using the as.POSIXlt() or strptime() functions. Because a POSIXlt object is a named list, we can extract a particular component from it using the $ notation.

myDateTime_POSIXlt <- as.POSIXlt(myDateTime)
myDateTime_POSIXlt
## [1] "2021-11-06 22:10:35 CET"
myDateTime_POSIXlt$sec
## [1] 35
myDateTime_POSIXlt$min
## [1] 10
myDateTime_POSIXlt$hour
## [1] 22
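Some POSIXlt components are stored as offsets rather than calendar values, which is easy to overlook. The sketch below uses the same date-time as above, with the time zone fixed to UTC for reproducibility; the variable name lt is chosen for illustration.

```r
lt <- as.POSIXlt("2021-11-06 22:10:35", tz = "UTC")

lt$mon   # months since January, so November is 10
## [1] 10
lt$year  # years since 1900
## [1] 121
lt$wday  # days since Sunday, so Saturday is 6
## [1] 6
lt$yday  # days since January 1
## [1] 309

# Convert the offsets to calendar values
lt$mon + 1
## [1] 11
lt$year + 1900
## [1] 2021
```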

Converting POSIXt objects to Date objects removes time as well as time zone information.

as.Date(myDateTime_POSIXlt)
## [1] "2021-11-06"
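For POSIXct input, as.Date() accepts a tz argument that determines in which time zone the calendar date is read off. Close to midnight this choice changes the result, as the following sketch with a made-up time stamp shows.

```r
# Half past midnight in Berlin is still the previous evening in UTC
lateNight <- as.POSIXct("2021-11-06 00:30:00", tz = "Europe/Berlin")

as.Date(lateNight, tz = "UTC")
## [1] "2021-11-05"
as.Date(lateNight, tz = "Europe/Berlin")
## [1] "2021-11-06"
```

Passing tz explicitly avoids depending on the default of the conversion method.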

Additional packages

In this section we introduce the lubridate package, which provides a variety of functions that make it easier to work with dates and times in R. For further information type vignette("lubridate") into your console.

The lubridate package makes parsing of date-times easy and fast by providing functions such as ymd(), ymd_hms(), dmy(), dmy_hms() and mdy(), among others. These allow us to convert a character string or a number into a date or date-time object.

library(lubridate)

ymd(19991215) # year-month-day
## [1] "1999-12-15"
ymd_hm(199912151533) # year-month-day-hour-minute
## [1] "1999-12-15 15:33:00 UTC"
mdy("April 13, 1978") # month-day-year
## [1] "1978-04-13"
dmy(241221) # day-month-year
## [1] "2021-12-24"

Further, the lubridate package provides simple functions to get and set components of a date-time, such as year(), month(), week(), mday(), wday(), yday(), hour(), minute() and second():

today <- Sys.time()
today
## [1] "2023-06-01 18:14:03 CEST"
year(today) # year
## [1] 2023
month(today) # month
## [1] 6
month(today, label = TRUE) # labeled month
## [1] Jun
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
month(today, label = TRUE, abbr = FALSE) # labeled month
## [1] June
## 12 Levels: January < February < March < April < May < June < ... < December
week(today) # week
## [1] 22
mday(today) # day
## [1] 1
wday(today) # weekday
## [1] 5
wday(today, label = TRUE) # labeled weekday
## [1] Thu
## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
wday(today, label = TRUE, abbr = FALSE) # labeled weekday
## [1] Thursday
## 7 Levels: Sunday < Monday < Tuesday < Wednesday < Thursday < ... < Saturday
yday(today) # day of the year
## [1] 152
hour(today) # hour
## [1] 18
minute(today) # minute
## [1] 14
second(today) # second
## [1] 3.410492
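The same accessor functions can also be used on the left-hand side of an assignment to set a component. The following sketch uses a fixed time stamp rather than the current time so that the output is reproducible; the variable name myStamp is chosen for illustration.

```r
library(lubridate)

myStamp <- ymd_hms("2021-11-06 22:10:35")

# Set single components by assignment
year(myStamp) <- 2022
hour(myStamp) <- 8
myStamp
## [1] "2022-11-06 08:10:35 UTC"

# update() returns a copy with several components changed at once
changed <- update(myStamp, month = 1, mday = 15)
changed
## [1] "2022-01-15 08:10:35 UTC"
```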

In addition to the variety of functions listed above, the as.yearmon() and the as.yearqtr() functions from the zoo package are convenient when working with regularly spaced monthly and quarterly data.

library(zoo)

as.yearmon(today)
## [1] "Jun 2023"
format(as.yearmon(today), "%B %Y")
## [1] "June 2023"
as.yearqtr(today)
## [1] "2023 Q2"

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via e-mail at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.