The autoregressive process of order \(p\) is denoted AR(\(p\)) and is defined by
\[y_t = \phi_1 y_{t-1}+\phi_2 y_{t-2}+\dots+\phi_p y_{t-p}+w_t\text{,}\]
which can be rewritten as
\[y_t = \sum_{i=1}^p\phi_iy_{t-i}+ w_t\text{,}\]
where \(\phi_1, \phi_2, \dots, \phi_p\) are fixed constants and \(w_t\) is a random variable with mean 0 and variance \(\sigma^2\).
If the mean, \(\mu\), of \(y_t\) is not zero, we replace \(y_t\) by \(y_t-\mu\):
\[y_t-\mu = \sum_{i=1}^p\phi_i(y_{t-i}-\mu)+ w_t\text{.}\]
Autoregressive processes have a natural interpretation: the next value observed is a slight perturbation of the most recent observations. In other words, the current value of the series depends linearly on its previous values, plus a random error. The model is called an autoregressive (AR) model because \(y_t\) is regressed on itself.
The special case of \(p = 1\), the first-order process, is also known as a Markov process.
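To make the defining recursion concrete, the following minimal R sketch simulates an AR(\(p\)) process directly from the equation above (the function name sim_ar, the Gaussian shocks and the start-up at the mean are illustrative assumptions; in the remainder of this section we rely on the built-in arima.sim() function instead).
# Minimal sketch of the AR(p) recursion (illustrative; the first p values
# are simply initialised at the mean mu).
sim_ar <- function(n, phi, mu = 0, sigma = 1) {
  p <- length(phi)
  w <- rnorm(n, mean = 0, sd = sigma) # random shocks w_t
  y <- rep(mu, n)
  for (t in (p + 1):n) {
    y[t] <- mu + sum(phi * (y[(t - 1):(t - p)] - mu)) + w[t]
  }
  y
}
y_sim <- sim_ar(n = 200, phi = c(0.5, 0.3)) # e.g. an AR(2) process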
We generate two AR(1) series of the form
\(y_t = \phi_1 y_{t-1}+w_t\text{,}\)
once with \(\phi_1 = 0.6\) and once with \(\phi_1 = -0.6\).
set.seed(250)
# Simulate two AR(1) series of length 100: one with phi = 0.6, one with phi = -0.6
ar1_1 <- arima.sim(list(order = c(1, 0, 0), ar = 0.6), n = 100)
ar1_2 <- arima.sim(list(order = c(1, 0, 0), ar = -0.6), n = 100)
We plot the two series using the autoplot() function.
library(ggfortify)
library(gridExtra)
p1 <- autoplot(ar1_1,
  ylab = "y",
  main = expression(AR(1) ~ ~ ~ phi == 0.6)
)
p2 <- autoplot(ar1_2,
  ylab = "y",
  main = expression(AR(1) ~ ~ ~ phi == -0.6)
)
grid.arrange(p1, p2, ncol = 1)
For an AR(1) model the current value \(y_t\) of the time series is a function of the immediately preceding value \(y_{t-1}\). We may review the correlation structure by plotting the correlogram.
p1 <- autoplot(acf(ar1_1, plot = FALSE)) +
  ggtitle(expression("Serial Correlation " * AR(1) ~ ~ ~ phi == 0.6)) +
  xlim(1, 20) +
  ylim(-0.5, 0.5)
p2 <- autoplot(acf(ar1_2, plot = FALSE)) +
  ggtitle(expression("Serial Correlation " * AR(1) ~ ~ ~ phi == -0.6)) +
  xlim(1, 20) +
  ylim(-0.5, 0.5)
grid.arrange(p1, p2, ncol = 1)
The autocorrelation function of an AR(1) process is \(\rho(k)=\phi^k\): it decays exponentially with the lag \(k\) for \(\phi > 0\), and for \(\phi < 0\) it alternates in sign while its magnitude decays. Thus, the correlation drops off as the lag \(k\) increases.
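This behaviour can be compared with the theoretical autocorrelations, which the ARMAacf() function from base R returns for a given AR model (this check is an addition of ours, not needed for the plots above):
ARMAacf(ar = 0.6, lag.max = 5) # decays as 0.6^k, all values positive
ARMAacf(ar = -0.6, lag.max = 5) # alternates in sign while decaying in magnitude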
For an AR(2) model, which can be written as
\(y_t = \phi_1 y_{t-1}+\phi_2 y_{t-2}+w_t\text{,}\)
the current value \(y_t\) of the time series is a function of the two preceding values of the series.
Let us generate four different AR(2) series to review their autocorrelational structures.
We make use of the arima.sim() function to generate these four AR(2) models and plot the correlograms by using the autoplot() function in combination with the acf() function.
ar2_I <- arima.sim(list(
  order = c(2, 0, 0),
  ar = c(0.5, 0.3)
), n = 100)
ar2_II <- arima.sim(list(
  order = c(2, 0, 0),
  ar = c(-0.5, 0.3)
), n = 100)
ar2_III <- arima.sim(list(
  order = c(2, 0, 0),
  ar = c(1, -0.5)
), n = 100)
ar2_IV <- arima.sim(list(
  order = c(2, 0, 0),
  ar = c(-0.5, -0.3)
), n = 100)
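As a quick cross-check (our addition), the theoretical autocorrelations of these AR(2) models can also be obtained with ARMAacf(); for the first parameterisation, for example:
ARMAacf(ar = c(0.5, 0.3), lag.max = 15) # theoretical ACF for phi_1 = 0.5, phi_2 = 0.3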
p1 <- autoplot(acf(ar2_I, plot = FALSE, lag.max = 15)) +
  ggtitle(expression(AR(2) ~ ~ ~ phi[1] == 0.5 * ~ ~ phi[2] == 0.3))
p2 <- autoplot(acf(ar2_II, plot = FALSE, lag.max = 15)) +
  ggtitle(expression(AR(2) ~ ~ ~ phi[1] == -0.5 * ~ ~ phi[2] == 0.3))
p3 <- autoplot(acf(ar2_III, plot = FALSE, lag.max = 15)) +
  ggtitle(expression(AR(2) ~ ~ ~ phi[1] == 1 * ~ ~ phi[2] == -0.5))
p4 <- autoplot(acf(ar2_IV, plot = FALSE, lag.max = 15)) +
  ggtitle(expression(AR(2) ~ ~ ~ phi[1] == -0.5 * ~ ~ phi[2] == -0.3))
grid.arrange(p1, p3, p2, p4, ncol = 2)
The autocorrelation functions of the AR(2) series again exhibit exponential decay, with a possible damped sinusoidal variation superimposed.
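Which of the two patterns occurs depends on the roots of the characteristic polynomial \(1-\phi_1z-\phi_2z^2\): real roots yield a mixture of exponential decays, whereas complex roots yield a damped sine wave. As a small illustration (our addition), we can inspect the roots for the third parameterisation, \(\phi_1 = 1\) and \(\phi_2 = -0.5\):
# Roots of 1 - z + 0.5 z^2: they are complex and lie outside the unit circle,
# so the process is stationary and its ACF is a damped sine wave.
polyroot(c(1, -1, 0.5))
Mod(polyroot(c(1, -1, 0.5)))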
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.