A very common problem that scientists face is assessing significance in scattered statistical data. Because observational data are limited, scientists apply inferential statistical methods to decide whether the observed data contain significant information or whether the scatter is nothing more than a manifestation of the inherently probabilistic nature of the data-generating process.
Generally speaking, a scientist states such a problem as follows. The scientist builds a model that simplifies the data-generating process and considers a particular assumption of this model, a so-called hypothesis. Given the data, the scientist wants to evaluate this tentative hypothesis.
The framework of hypothesis testing is all about making statistical inferences about a population based on samples taken from that population. One way to estimate a population parameter is the construction of confidence intervals. Another way is to make a decision about a parameter in the form of a test. Any hypothesis test involves the collection of data (sampling). If the hypothesis is assumed to be correct, the scientist can calculate the expected results of an experiment. If the observed data differ significantly from the expected results, one considers the assumption to be incorrect. Thus, based on an analysis of the observed data, the scientist decides whether there is sufficient evidence to reject the model assumption (the hypothesis) or whether the evidence is insufficient to reject it.
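The decision procedure described above can be illustrated with a small sketch. The coin-toss scenario, the numbers (60 heads in 100 tosses), the significance level of 0.05 and the helper name `binomial_two_sided_p_value` are all assumptions made for this illustration only. The sketch uses an exact binomial test, which is one of many possible tests: it assumes the hypothesis "the coin is fair" to be correct, computes what that hypothesis implies for the experiment, and then checks how unusual the observed result would be.

```python
from math import comb


def binomial_two_sided_p_value(k_observed: int, n: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value of a binomial test (illustrative helper).

    Sums the probabilities, under the hypothesis, of all outcomes that are
    at most as likely as the observed number of successes.
    """
    # Probability of every possible outcome 0..n if the hypothesis is correct
    probs = [
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(n + 1)
    ]
    p_obs = probs[k_observed]
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


# Hypothetical experiment: 100 tosses of a coin, 60 heads observed
n_tosses = 100
n_heads = 60
alpha = 0.05  # assumed significance level

# Expected result if the hypothesis "the coin is fair" is correct
expected_heads = n_tosses * 0.5

# How likely is a result at least as extreme as the observed one?
p_value = binomial_two_sided_p_value(n_heads, n_tosses)

# Decision: reject the hypothesis only if the evidence is sufficient
reject = p_value < alpha

print(f"expected heads: {expected_heads:.0f}, observed heads: {n_heads}")
print(f"p-value: {p_value:.4f}")
print("reject hypothesis" if reject else "insufficient evidence to reject hypothesis")
```

In this hypothetical case the p-value is about 0.057, slightly above the assumed level of 0.05, so the evidence is insufficient to reject the hypothesis of a fair coin. Note that this does not show the hypothesis to be true; it only means the data do not contradict it strongly enough.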
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.