In 1964, G.E.P. Box and D.R. Cox introduced an important family of transformations
for dependent variables in regression analysis, defined for \(y>0\):
\[y^{(\lambda)}= \begin{cases}
\frac{y^{\lambda}-1}{\lambda},&\lambda\ne0 \\
\log(y),&\lambda=0
\end{cases}\] Since the analysis of variance is unchanged by a
linear transformation, the equation above is equivalent, for \(\lambda\ne0\), to the
simple power transformation:
\[y^{(\lambda)}= \begin{cases}
y^{\lambda},&\lambda\ne0 \\
\log(y),&\lambda=0
\end{cases}\]
It is now apparent that \(\lambda=\frac{1}{2}\) corresponds to the square-root transformation \[y'=\sqrt{y}\] and \(\lambda=-1\) to the reciprocal transformation:
\[y'=\frac{1}{y}\]
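The correspondence between the Box-Cox form and the plain power form can be checked numerically. The following is a minimal sketch in R; the helper name `box_cox` and the sample values in `y` are assumptions chosen for illustration and are not part of the original material.

```r
# Hypothetical helper: Box-Cox transformation for strictly positive y
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) {
    log(y)
  } else {
    (y^lambda - 1) / lambda
  }
}

# Illustrative positive values (assumed, not taken from the course data)
y <- c(0.5, 1, 2, 4, 9)

# lambda = 1/2: a linear function of the square root, 2 * (sqrt(y) - 1)
all.equal(box_cox(y, 0.5), 2 * (sqrt(y) - 1))

# lambda = -1: a linear function of the reciprocal, 1 - 1 / y
all.equal(box_cox(y, -1), 1 - 1 / y)

# lambda = 0: the natural logarithm
all.equal(box_cox(y, 0), log(y))
```

Each comparison is expected to return `TRUE`, since for \(\lambda\ne0\) the Box-Cox value differs from the plain power \(y^{\lambda}\) only by a shift and a rescaling.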
For \(\lambda=\frac{1}{2}\), i.e. \(y'=\sqrt{y}\), the
transformed variable appears less skewed but remains positive. A similar
effect results for \(\lambda=-1\), i.e. \(y'=\frac{1}{y}\).
In contrast to the plain power transformation, the Box-Cox transformation
is continuous in \(\lambda\) at \(\lambda=0\), where it yields
\(y'=\log y\).
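The continuity at \(\lambda=0\) can be made explicit with a short derivation, sketched here for fixed \(y>0\) using the series expansion of the exponential function:
\[\frac{y^{\lambda}-1}{\lambda}=\frac{e^{\lambda\log y}-1}{\lambda}=\log y+\frac{\lambda(\log y)^{2}}{2}+\mathcal{O}(\lambda^{2})\;\xrightarrow{\;\lambda\to0\;}\;\log y\]
The plain power \(y^{\lambda}\), by contrast, tends to 1 for every \(y>0\) as \(\lambda\to0\), so it does not approach \(\log y\).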
To see this, let us examine the effect of \(\lambda\to0\) for \(\lambda\in\{0.7, 0.3, 0.1, 0.01\}\).
For \(\lambda<1\) and \(\lambda\to0\), our strongly skewed variable
becomes more and more symmetric and centered at 0.
However, as long as \(\lambda\ne0\), the
resulting feature scale remains bounded on one side; for \(\lambda>0\) the lower limit is \(-\frac{1}{\lambda}\).
Note: only the logarithm opens the feature scale to the whole real line from \(-\infty\) to \(\infty\), so that the transformed variable is no longer subject to algebraic constraints.
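The bound can be illustrated numerically. For \(y>0\) and \(\lambda>0\) we have \(y^{\lambda}>0\), so the Box-Cox value is always greater than \(-\frac{1}{\lambda}\). The sketch below, in R, compares the smallest transformed value with this bound and measures how far the transformation is from \(\log y\). The helper `box_cox` and the vectors `y` and `y_mod` are assumptions made for illustration only.

```r
# Hypothetical helper: Box-Cox transformation for strictly positive y
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Values spread over many orders of magnitude (assumed for illustration)
y <- 10^seq(-6, 6, by = 2)

lambdas <- c(0.7, 0.3, 0.1, 0.01)

# For each lambda: smallest transformed value and the theoretical bound -1/lambda
bounds <- data.frame(
  lambda      = lambdas,
  min_value   = sapply(lambdas, function(l) min(box_cox(y, l))),
  lower_bound = -1 / lambdas
)
print(bounds)

# Largest deviation from log(y) over a moderate range of y, per lambda
y_mod <- seq(0.5, 5, by = 0.5)
max_dev <- sapply(lambdas, function(l) max(abs(box_cox(y_mod, l) - log(y_mod))))
print(max_dev)
```

In this sketch the minimum stays above \(-\frac{1}{\lambda}\) for every \(\lambda\), while the deviation from \(\log y\) shrinks as \(\lambda\) approaches 0. The bound itself moves towards \(-\infty\), which is reached only in the limiting logarithmic case.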
A simple example of the log-transformation has already been presented
in a previous section.
Next we will focus on the frequently occurring case of doubly
constrained feature scales.
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.