Let us consider a simple example of a small discrete population consisting of the first ten integers $\{1,2,3,4,5,6,7,8,9,10\}$. It is fairly simple to calculate the mean and the standard deviation of the given example. We do that in Python to refresh our knowledge.
# First, let's import all the needed libraries.
import numpy as np
import random
n = 3
population = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pop = np.arange(1, 101, 1)
n_pop = 30
np.mean(population)
5.5
np.std(population)
2.8722813232690143
The population mean, denoted by $\mu$, and the population standard deviation, denoted by $\sigma$, are 5.5 and approximately 2.87, respectively. It is important to realize that these population parameters do not change: they are fixed properties of the population.
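To see where these two numbers come from, the following minimal sketch recomputes them directly from the definitions $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ and $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$. The variable names `N`, `mu` and `sigma` are chosen here for illustration only and do not appear in the original code.

```python
import math

import numpy as np

population = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
N = len(population)  # population size (illustrative name)

# Population mean: sum of all elements divided by the population size N
mu = sum(population) / N

# Population standard deviation: square root of the mean squared deviation from mu
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

print(mu, sigma)

# np.std uses the same divisor N by default (ddof=0), so both results agree
print(np.isclose(sigma, np.std(population)))
```

The sum of squared deviations is 82.5, so $\sigma = \sqrt{82.5 / 10} = \sqrt{8.25} \approx 2.87$, which matches the value returned by np.std.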
Let us now take one random sample without replacement of size $n = 3$ from this population. Once again we let Python do all the work by calling the sample function from the random library. Recall its form: random.sample(sequence, k), which returns a new list of length k with elements chosen from the sequence.
my_sample = random.sample(population, 3)
my_sample
[7, 4, 6]
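Because the draw is random, running the cell again will most likely give a different sample than the one shown above. The following sketch, which assumes that repeatable results are wanted, fixes the seed with random.seed and also checks that no element is drawn twice. The seed value 42 and the names `first_draw` and `second_draw` are arbitrary choices made for this illustration.

```python
import random

population = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Fix the seed so the "random" draw can be repeated (the seed value is arbitrary)
random.seed(42)
first_draw = random.sample(population, 3)

# Re-seeding with the same value reproduces the same sample
random.seed(42)
second_draw = random.sample(population, 3)

print(first_draw == second_draw)  # True

# Without replacement: no element of the population can be drawn twice
print(len(set(first_draw)) == len(first_draw))  # True
```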
Now we calculate the mean and the standard deviation of the given sample. However, because these values refer to one particular sample, we call them sample statistics; the distribution of the values (elements) in the sample is called the sample distribution. To make the distinction explicit, the sample mean is denoted by $\bar x$ and the sample standard deviation by $s$.
x_bar = np.mean(my_sample)
x_bar
5.666666666666667
s = np.std(my_sample)
s
1.247219128924647
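The sample mean, $\bar x$, and the sample standard deviation, $s$, are approximately 5.67 and 1.25, respectively. Be aware that np.std divides by $n$ by default (ddof=0); passing ddof=1 divides by $n - 1$ instead, which is the usual convention for a sample standard deviation and gives approximately 1.53 here. Because the sample statistics depend on the actual elements in the sample, they change from sample to sample.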
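The sample-to-sample variation can be made visible by drawing several samples in a row. The sketch below is only an illustration: the number of samples (five), the seed, and the names `samples` and `sample_means` are assumptions and not part of the original example.

```python
import random

import numpy as np

population = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

random.seed(1)  # arbitrary seed, used only to make the run repeatable

# Draw five independent samples of size 3, each without replacement
samples = [random.sample(population, 3) for _ in range(5)]
sample_means = [float(np.mean(smp)) for smp in samples]

# Each sample generally has its own sample mean
for smp, x_bar in zip(samples, sample_means):
    print(smp, round(x_bar, 2))

# The population mean is the same no matter which samples were drawn
print(np.mean(population))
```

Every sample mean necessarily lies between 2 (the sample 1, 2, 3) and 9 (the sample 8, 9, 10), whereas the population mean stays at 5.5.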
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.