Besides the STL decomposition discussed in the previous section, there exist a variety of ways to remove trends and seasonal components from time series data.
Smoothing
In the previous sections we already discussed some widely used methods, such as moving averages, kernel smoothing, smoothing via local polynomials, lowess, and smoothing splines. The general idea is to smooth out the noise in a time series and pick up the overall trend in the data. Subtracting the estimated trend from the original time series then yields a detrended series.
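As a minimal sketch of this idea, the following plain Python example estimates the trend with a centered moving average and subtracts it from the series. The function names `moving_average` and `detrend`, the window length and the toy data are assumptions made for illustration only; they are not taken from the previous sections.

```python
def moving_average(y, window):
    """Centered moving average with an odd window length.

    Positions without a full window (the edges) are returned as None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        trend[t] = sum(y[t - half:t + half + 1]) / window
    return trend


def detrend(y, trend):
    """Subtract the estimated trend; edges without a trend value stay None."""
    return [None if m is None else v - m for v, m in zip(y, trend)]


# Hypothetical toy series: linear trend plus an alternating disturbance
y = [2.0 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(12)]
trend = moving_average(y, window=5)
detrended = detrend(y, trend)
```

A wider window gives a smoother trend estimate but loses more observations at both ends of the series.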
Least Squares Estimation
Another method to eliminate trends is least squares estimation.
Consider the model
\[y_t = \mu_t + w_t\text{,}\] where the trend is given by \(\mu_t\).
Least squares estimation fits a polynomial regression in \(t\) to the data:
\[\mu_t = \beta_0+\beta_1t+\beta_2t^2+...+\beta_pt^p\text{.}\]
If the time series shows a linear trend, then \(p = 1\). The residuals that result from the fit yield a time series without the trend.
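For the linear case \(p = 1\), the least squares estimates have a simple closed form, which the following sketch in plain Python uses. The function names `fit_linear_trend` and `linear_residuals` and the toy data are hypothetical, and the time index is assumed to run over \(t = 1, \dots, n\).

```python
def fit_linear_trend(y):
    """Ordinary least squares fit of y_t = b0 + b1 * t for t = 1, ..., n."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Slope: sample covariance of (t, y) divided by sample variance of t
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    b1 = s_ty / s_tt
    b0 = y_mean - b1 * t_mean
    return b0, b1


def linear_residuals(y, b0, b1):
    """Residuals y_t - (b0 + b1 * t), i.e. the detrended series."""
    return [yi - (b0 + b1 * ti) for ti, yi in enumerate(y, start=1)]


# Hypothetical toy series: trend 5 + 0.5 * t plus a small alternating disturbance
y = [5.0 + 0.5 * t + (0.2 if t % 2 == 0 else -0.2) for t in range(1, 21)]
b0, b1 = fit_linear_trend(y)
detrended = linear_residuals(y, b0, b1)
```

The residuals of a least squares fit with an intercept sum to zero, so the detrended series fluctuates around zero.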
Moreover, we may fit a periodic effect by a linear model with indicator variables for the different months:
\[y_t = \beta_1m_{1t}+\beta_2m_{2t}+...+\beta_{12}m_{12t} + w_t \]
where \(m_{kt}\) is a 0/1 indicator for month \(k\): \(m_{kt} = 1\) if observation \(t\) falls in month \(k\), and \(m_{kt} = 0\) otherwise.
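Because each observation belongs to exactly one month and the model has no intercept, the least squares estimate of \(\beta_k\) is simply the mean of all observations in month \(k\). The sketch below, in plain Python, relies on this fact instead of building the indicator matrix explicitly. The function names and the toy data are hypothetical.

```python
def fit_monthly_effects(y, months):
    """Least squares estimates for the indicator model without intercept.

    months[t] is the month (1..12) of observation t. The estimate of
    beta_k equals the mean of the observations that fall in month k.
    """
    sums, counts = {}, {}
    for value, k in zip(y, months):
        sums[k] = sums.get(k, 0.0) + value
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}


def remove_monthly_effects(y, months, beta):
    """Residuals y_t - beta_k, where k is the month of observation t."""
    return [value - beta[k] for value, k in zip(y, months)]


# Hypothetical toy data: three years of monthly values starting in January
pattern = [2.0, 3.0, 6.0, 10.0, 14.0, 17.0, 19.0, 18.0, 15.0, 10.0, 5.0, 3.0]
y = [pattern[t % 12] + 0.1 * (t // 12) for t in range(36)]
months = [t % 12 + 1 for t in range(36)]

beta = fit_monthly_effects(y, months)
deseasonalized = remove_monthly_effects(y, months, beta)
```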
Another way of modeling seasonality is to fit a regression model that is periodic:
\[y_t=\beta_0+\beta_1\sin((2\pi/m)t)+\beta_2\cos((2\pi/m)t) +w_t \]
If the time series is sampled monthly, setting \(m = 12\) in the equation above fits a model with an annual cycle.
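A minimal sketch of such a fit in plain Python is given below. It assumes that the series covers a whole number of periods (the length \(n\) is a multiple of \(m\)) and that \(m > 2\). Under these assumptions the constant, sine and cosine regressors are mutually orthogonal, so the least squares estimates reduce to simple sums; for series of arbitrary length a general least squares solver is needed instead. The function name `fit_harmonic` and the toy data are hypothetical.

```python
import math


def fit_harmonic(y, m):
    """Fit y_t = b0 + b1*sin(2*pi*t/m) + b2*cos(2*pi*t/m) for t = 1, ..., n.

    Assumes an integer period m > 2 and n a multiple of m, so that the
    three regressors are orthogonal and the estimates have closed forms.
    """
    n = len(y)
    if m <= 2 or n == 0 or n % m != 0:
        raise ValueError("need m > 2 and a whole number of periods")
    omega = 2 * math.pi / m
    b0 = sum(y) / n
    b1 = 2 / n * sum(yt * math.sin(omega * t) for t, yt in enumerate(y, start=1))
    b2 = 2 / n * sum(yt * math.cos(omega * t) for t, yt in enumerate(y, start=1))
    return b0, b1, b2


# Hypothetical toy data: four years of monthly values with an annual cycle
m = 12
omega = 2 * math.pi / m
y = [10 + 3 * math.sin(omega * t) - 2 * math.cos(omega * t) for t in range(1, 49)]

b0, b1, b2 = fit_harmonic(y, m)
fitted = [b0 + b1 * math.sin(omega * t) + b2 * math.cos(omega * t)
          for t in range(1, len(y) + 1)]
deseasonalized = [yt - ft for yt, ft in zip(y, fitted)]
```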
Differencing
Another way of removing a trend in data is by differencing. The first difference operator \(\Delta\) is defined by
\[\Delta y_t = y_t-y_{t-1}\]
Higher powers of the difference operator are defined as
\[ \begin{aligned} \Delta^2 y_t & = \Delta (\Delta y_t) \\ & = \Delta (y_t-y_{t-1}) \\ & = (y_t-y_{t-1}) - (y_{t-1}-y_{t-2}) \\ & = y_t-2y_{t-1}+y_{t-2}\text{,} \end{aligned} \] and so on. If the difference operator \(\Delta\) is applied to a time series with a linear trend
\[y_t=\beta_0+\beta_1t+w_t\text{,}\]
then
\[ \begin{aligned} \Delta y_t & = y_t-y_{t-1} \\ & = (\beta_0+\beta_1t+w_t)-(\beta_0+\beta_1(t-1)+w_{t-1})\\ & = \beta_1 + w_t-w_{t-1}\text{,} \end{aligned} \] which yields a time series with a constant mean. Similarly, we can use \(\Delta^2y_t\) to eliminate a quadratic trend, \(\Delta^3y_t\) to eliminate a cubic trend, and so on.
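The effect of differencing on polynomial trends can be checked numerically. The following sketch in plain Python applies the first difference repeatedly; the function name `difference` and the noise-free toy series are assumptions made for illustration.

```python
def difference(y, order=1):
    """Apply the first difference operator `order` times.

    Each pass shortens the series by one observation.
    """
    for _ in range(order):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y


# Noise-free linear trend y_t = 4 + 3t: one difference leaves the slope
linear = [4 + 3 * t for t in range(1, 9)]
print(difference(linear))                # [3, 3, 3, 3, 3, 3, 3]

# Noise-free quadratic trend y_t = t^2: two differences leave a constant
quadratic = [t ** 2 for t in range(1, 9)]
print(difference(quadratic, order=2))    # [2, 2, 2, 2, 2, 2]
```

With noise present, the differenced series is no longer exactly constant, but it fluctuates around a constant level.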
We can even extend the concept of differencing to include seasonal differencing:
\[\Delta_m(y_t)=y_t-y_{t-m}\]
If the time series is sampled monthly, then \(m = 12\) corresponds to a 12-month seasonal effect.
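A short sketch of seasonal differencing in plain Python follows. The function name `seasonal_difference` and the toy data are hypothetical; the example uses a repeating 12-month pattern plus a linear trend, without noise.

```python
def seasonal_difference(y, m):
    """Seasonal difference y_t - y_{t-m}; the first m observations are lost."""
    if m < 1 or m >= len(y):
        raise ValueError("m must satisfy 1 <= m < len(y)")
    return [y[t] - y[t - m] for t in range(m, len(y))]


# Hypothetical toy data: three years of a repeating monthly pattern
pattern = [2, 3, 6, 10, 14, 17, 19, 18, 15, 10, 5, 3]
seasonal = [pattern[t % 12] for t in range(36)]
print(set(seasonal_difference(seasonal, 12)))    # {0}

# Adding a linear trend 2t leaves the constant 2 * 12 = 24 after differencing
trended = [pattern[t % 12] + 2 * t for t in range(36)]
print(set(seasonal_difference(trended, 12)))     # {24}
```

In this noise-free example the seasonal difference removes the periodic component completely and reduces the linear trend to a constant.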
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.