2052_population_statistics_and_sample

Let us consider a simple example of a small discrete population consisting of the first ten integers $\{1,2,3,4,5,6,7,8,9,10\}$. It is fairly simple to calculate the mean and the standard deviation of this population. We do that in Python to refresh our knowledge.

In [2]:

# First, let's import all the needed libraries.
import numpy as np
import random

In [3]:

n = 3
population = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pop = np.arange(1, 101, 1)
n_pop = 30

In [4]:

np.mean(population)

Out[4]:

5.5

In [5]:

np.std(population)

Out[5]:

2.8722813232690143

The population mean, denoted by $\mu$, and the population standard deviation, denoted by $\sigma$, are 5.5 and approximately 2.87, respectively. It is important to realize that these population parameters will not change! They are fixed.
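Both values can be checked by hand. The short calculation below assumes the population formulas with divisor $N = 10$, which is what np.std applies by default:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{55}{10} = 5.5, \qquad \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2} = \sqrt{\frac{82.5}{10}} = \sqrt{8.25} \approx 2.87$$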

Let us now take one random sample without replacement of size $n = 3$ from this population. Once again we let Python do the work by calling the sample function from the random library. Recall its form: random.sample(sequence, k), which returns a new list of length k whose elements are chosen from the sequence without replacement.
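Because the draw is random, rerunning the cell gives a different sample each time. A minimal sketch of how a draw could be made reproducible, assuming a separate random.Random generator with an arbitrarily chosen seed (the names rng and demo_sample and the seed value are illustrative and not part of the original notebook):

```python
import random

population = list(range(1, 11))  # the integers 1, ..., 10

# A dedicated generator with a fixed seed; the seed value is arbitrary
# and only serves to make the draw repeatable.
rng = random.Random(2052)

# Draw 3 elements without replacement.
demo_sample = rng.sample(population, 3)

# Without replacement means no element can appear twice.
assert len(set(demo_sample)) == len(demo_sample)
print(demo_sample)
```

Using a dedicated generator leaves the global random state untouched, so other cells that call random.sample directly still behave as before.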

In [6]:

my_sample = random.sample(population, 3)
my_sample

Out[6]:

[7, 4, 6]

Now we calculate the mean and the standard deviation of the given sample. This time, because the values refer to a particular sample rather than to the whole population, they are called sample statistics; the distribution of the values (elements) within the sample is called the sample distribution. To make the distinction explicit, the sample mean is denoted by $\bar x$ and the sample standard deviation by $s$.

In [7]:

x_bar = np.mean(my_sample)
x_bar

Out[7]:

5.666666666666667

In [8]:

s = np.std(my_sample)
s

Out[8]:

1.247219128924647
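A caveat on the value above: np.std divides by $n$ by default (ddof=0), which is the population formula. The sample standard deviation is commonly defined with divisor $n - 1$, which NumPy provides through ddof=1. A small comparison for the sample [7, 4, 6] shown above (the names s_n and s_n1 are illustrative):

```python
import numpy as np

my_sample = [7, 4, 6]  # the sample shown in Out[6]

# Default ddof=0: squared deviations are divided by n.
s_n = np.std(my_sample)

# ddof=1: squared deviations are divided by n - 1.
s_n1 = np.std(my_sample, ddof=1)

print(f"{s_n:.4f}")   # 1.2472
print(f"{s_n1:.4f}")  # 1.5275
```

For a sample as small as $n = 3$ the two conventions differ noticeably; the gap shrinks as $n$ grows.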

The sample mean, $\bar x$, and the sample standard deviation, $s$, are approximately 5.67 and 1.25, respectively. Please note that the sample statistics depend on the elements that happen to be drawn, so they change from sample to sample.
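The sample-to-sample variation can be made visible by drawing several samples in a row. The following is a minimal sketch, assuming five repetitions and an arbitrary seed (rng, sample_means and the loop count are illustrative choices, not part of the original notebook):

```python
import random
import numpy as np

population = list(range(1, 11))  # the integers 1, ..., 10
rng = random.Random(0)           # arbitrary seed, for repeatability only

sample_means = []
for _ in range(5):
    # Each iteration draws a fresh sample of size 3 without replacement.
    sample_i = rng.sample(population, 3)
    mean_i = float(np.mean(sample_i))
    sample_means.append(mean_i)
    print(sample_i, round(mean_i, 2))

# The population mean is a fixed parameter and does not depend on the draws.
print(np.mean(population))  # 5.5
```

Each printed line shows one sample and its mean; the means scatter around the population mean of 5.5, which itself stays the same.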

Citation

The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.