The vast majority of statistical methods are designed for unbounded rational (\(\mathbb Q\)) or real (\(\mathbb R\)) feature scales, so any value between \(-\infty\) and \(+\infty\) is generally possible. We expect algebraic operations to be meaningful and populations to be symmetric or, better still, normally distributed.

However, nearly all variables are bounded on at least one side, typically restricted to the positive branch of \(\mathbb R\) or \(\mathbb Q\). On such scales zeros are mostly meaningless and nearly always represent missing values. From an algebraic point of view, only multiplication and division are guaranteed to stay within the feature scale. Other operations can leave the scale and mislead us into estimating inadequate statistical parameters such as the arithmetic mean or the variance.
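The problem can be illustrated with a small sketch. The data values below are hypothetical and chosen only for illustration. An interval built from the arithmetic mean and standard deviation may reach below zero, whereas the same interval computed on the log scale and transformed back stays positive.

```python
import numpy as np

# Hypothetical, strictly positive measurements (illustration only).
x = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0])

# Additive statistics on the raw scale.
mean_raw = x.mean()
sd_raw = x.std(ddof=1)
lower_raw = mean_raw - 2 * sd_raw  # may fall below zero, outside the scale

# The same statistics on the log scale, transformed back afterwards.
log_x = np.log(x)
mean_log = log_x.mean()
sd_log = log_x.std(ddof=1)
geo_mean = np.exp(mean_log)                # geometric mean
lower_log = np.exp(mean_log - 2 * sd_log)  # always strictly positive

print("raw scale: mean =", mean_raw, "lower bound =", lower_raw)
print("log scale: geometric mean =", geo_mean, "lower bound =", lower_log)
```

The back-transformed bound is positive by construction, because the exponential function maps every real number into the positive branch of \(\mathbb R\).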

A closer look often reveals an additional upper limit of the variable space, mostly related to the physical frame of our planet's gravity field, limited resources, ecological constraints or other reasons (this applies to distances, heights, temperatures, counts of individuals, areas, brightness values, etc.).
Furthermore, relative amounts or compositions lead to spurious, sometimes absurd correlations or artificial patterns.
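A minimal simulation sketches how such a spurious correlation can arise. It assumes three independent, uniformly distributed positive amounts. The distribution, sample size and seed are arbitrary choices made for this illustration. Dividing each observation by its row total (closure) turns the amounts into shares that sum to 1, and the shares become negatively correlated although the raw amounts are independent.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, chosen arbitrarily
n = 5000

# Three independent, strictly positive raw amounts (simulated data).
raw = rng.uniform(1.0, 10.0, size=(n, 3))

# Closure: divide each row by its total so that the parts sum to 1.
shares = raw / raw.sum(axis=1, keepdims=True)

# Pearson correlation between the first two parts, before and after closure.
r_raw = np.corrcoef(raw[:, 0], raw[:, 1])[0, 1]
r_shares = np.corrcoef(shares[:, 0], shares[:, 1])[0, 1]

print("correlation of raw amounts:", r_raw)
print("correlation of shares:     ", r_shares)
```

The raw amounts are uncorrelated up to sampling noise, while the shares show a clearly negative correlation that is induced solely by the constant-sum constraint.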

Therefore, we have to “open” our bounded scales in order to obtain results that are mathematically and statistically correct and meaningful with regard to the scientific content.
Many transformations have been proposed to achieve normality or symmetry. Some of them only change the shape of a distribution, whereas others resolve algebraic constraints.
Here, we present linear and non-linear transformations as well as transformations for doubly constrained feature scales of compositional or physical nature.
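As a rough orientation, the three families can be sketched with one common representative each. The choice of the z-transformation, the logarithm and the logit, as well as the function names, are assumptions made for this illustration and not necessarily the exact methods covered in the following sections.

```python
import numpy as np

def z_transform(x):
    """Linear: center and scale; the shape of the distribution is unchanged."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def log_transform(x):
    """Non-linear: open a scale bounded below by zero to the whole real line."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("log transformation requires strictly positive values")
    return np.log(x)

def logit_transform(x, lower, upper):
    """Double constraint: open the interval (lower, upper) to the real line."""
    x = np.asarray(x, dtype=float)
    if np.any((x <= lower) | (x >= upper)):
        raise ValueError("values must lie strictly between lower and upper")
    return np.log((x - lower) / (upper - x))

# Hypothetical percentages bounded by 0 and 100.
percent = np.array([5.0, 20.0, 50.0, 80.0, 95.0])
print(z_transform(percent))
print(log_transform(percent))
print(logit_transform(percent, lower=0.0, upper=100.0))
```

Only the last two functions change the algebraic structure of the scale; the z-transformation merely shifts and rescales the values.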


Citation

The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.