Box-Cox, power and log-transformation

In 1964, G.E.P. Box and D.R. Cox introduced an important family of transformations for dependent variables in regression analysis, defined for \(y>0\):
\[y^{(\lambda)}= \begin{cases} \frac{y^{\lambda}-1}{\lambda},&\lambda\ne0 \\ \log(y),&\lambda=0 \end{cases}\]
Since the analysis of variance is unchanged by a linear transformation, the equation above is equivalent to the power-transformation:
\[y^{(\lambda)}= \begin{cases} y^{\lambda},&\lambda\ne0 \\ \log(y),&\lambda=0 \end{cases}\]
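The two definitions above can be sketched in a few lines of code. The following is a minimal illustration in Python using only the standard library. The function names `box_cox` and `power_transform` and the sample values are assumptions made for this sketch and are not part of the original material. It checks numerically that, for \(\lambda\ne0\), the two transformations differ only by a linear map.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a single positive value y for parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def power_transform(y, lam):
    """Plain power transform: y**lam for lam != 0, log(y) for lam == 0."""
    if y <= 0:
        raise ValueError("power transform requires y > 0")
    if lam == 0:
        return math.log(y)
    return y ** lam

# Hypothetical positive sample values, chosen only for illustration
values = [0.5, 1.0, 2.0, 4.0]
lam = 0.5

bc = [box_cox(v, lam) for v in values]
pw = [power_transform(v, lam) for v in values]

# For lam != 0 the Box-Cox values are a linear function of the power values:
# subtract 1, then divide by lam
rebuilt = [(p - 1.0) / lam for p in pw]

print(bc)
print(rebuilt)
```

Both printed lists agree up to floating-point rounding, which is why the choice between the two forms does not affect an analysis of variance.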

Now it becomes obvious that \(\lambda=\frac{1}{2}\) corresponds to the square-root transformation \[y'=\sqrt y\] and \(\lambda=-1\) to the reciprocal transformation:

\[y'=\frac{1}{y}\]


For \(\lambda=\frac{1}{2}\), i.e. \(y'=\sqrt{y}\), the transformed variable appears less skewed but remains positive. A similar effect results for \(\lambda=-1\), i.e. \(y'=\frac{1}{y}\).
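As a small numerical illustration of these two special cases, the sketch below applies the power-transformation with \(\lambda=\frac{1}{2}\) and \(\lambda=-1\) to a few positive values. The sample `y` is hypothetical and chosen only for this example. It also shows a property worth keeping in mind: the reciprocal reverses the order of the values, whereas the square root preserves it.

```python
import math

# Hypothetical positive sample, sorted in increasing order
y = [0.2, 0.5, 1.0, 3.0, 10.0]

sqrt_y = [v ** 0.5 for v in y]    # lambda = 1/2
recip_y = [v ** -1 for v in y]    # lambda = -1

# lambda = 1/2 reproduces the square root
same_as_sqrt = all(abs(a - math.sqrt(v)) < 1e-12 for a, v in zip(sqrt_y, y))

# Both transformed variables remain strictly positive
still_positive = all(v > 0 for v in sqrt_y + recip_y)

# The square root keeps the ordering, the reciprocal reverses it
sqrt_keeps_order = sqrt_y == sorted(sqrt_y)
recip_reverses_order = recip_y == sorted(recip_y, reverse=True)

print(same_as_sqrt, still_positive, sqrt_keeps_order, recip_reverses_order)
```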

In contrast to the plain power-transformation, the Box-Cox-transformation is continuous in \(\lambda\) at \(\lambda=0\): as \(\lambda\to0\) it converges to \(y'=\log y\).
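The continuity at \(\lambda=0\) can be checked with a short limit calculation. Writing \(y^{\lambda}=e^{\lambda\log y}\) and expanding the exponential as a power series (L'Hôpital's rule gives the same result) yields, for any fixed \(y>0\):

\[\lim_{\lambda\to0}\frac{y^{\lambda}-1}{\lambda} = \lim_{\lambda\to0}\frac{e^{\lambda\log y}-1}{\lambda} = \lim_{\lambda\to0}\left(\log y+\frac{\lambda(\log y)^{2}}{2}+\ldots\right) = \log y\]

By comparison, the plain power-transformation gives \(y^{\lambda}\to1\) for every \(y>0\) as \(\lambda\to0\), so it does not approach \(\log y\).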

Therefore, let us examine the effect of \(\lambda\to0\) using the values \(\lambda=0.7\), \(0.3\), \(0.1\) and \(0.01\).


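For \(\lambda<1\) and \(\lambda\to0\), our strongly skewed variable becomes more and more symmetric and centered at 0.
However, as long as \(\lambda\ne0\), the resulting feature scale keeps a lower limit: for \(\lambda>0\) and \(y>0\) the Box-Cox-transformed values cannot fall below \(-\frac{1}{\lambda}\).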
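The behaviour described above can be reproduced with a short simulation. The sketch below is only an illustration under stated assumptions: it draws a hypothetical right-skewed sample from a log-normal distribution (not the data set used in this tutorial), applies the Box-Cox-transformation for \(\lambda=0.7, 0.3, 0.1, 0.01\), and reports the sample skewness together with the smallest transformed value and the theoretical lower limit \(-\frac{1}{\lambda}\). The helper names `box_cox` and `skewness` are introduced here for illustration.

```python
import math
import random

def box_cox(y, lam):
    """Box-Cox transform of a positive value y."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Hypothetical strongly right-skewed sample (log-normal), fixed seed
random.seed(1)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

lambdas = [0.7, 0.3, 0.1, 0.01]
skews = []
minima = []
for lam in lambdas:
    transformed = [box_cox(v, lam) for v in sample]
    skews.append(skewness(transformed))
    minima.append(min(transformed))
    print(f"lambda={lam}: skewness={skews[-1]:.3f}, "
          f"min={minima[-1]:.3f}, lower limit={-1.0 / lam:.1f}")
```

In this sketch the skewness shrinks as \(\lambda\) approaches 0, while every minimum stays above \(-\frac{1}{\lambda}\). That limit moves towards \(-\infty\) as \(\lambda\to0\).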

Note: only the logarithm opens the feature scale to the full range from \(-\infty\) to \(\infty\), so that the transformed variable is no longer subject to algebraic constraints.


A simple example of the log-transformation has already been presented previously.
Next, we will focus on the frequently occurring case of doubly constrained feature scales.


Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.