
Just as population distributions can be described with parameters, so can sampling distributions. The expected value (mean) of any distribution can be represented by the symbol $\mu$ (mu). In the case of a sampling distribution, the mean, $\mu$, is often written with a subscript to indicate which sampling distribution is being described. For example, the expected value of the sampling distribution of the mean is represented by the symbol $\mu_{\bar x}$. The value of $\mu_{\bar x}$ can be thought of as the theoretical mean of the distribution of sample means.

If we draw a sufficiently large number of samples (all of the same size) from a population and calculate their means, then the mean of all these sample means, $\mu_{\bar x}$, will approximate the mean of the population, $\mu$. That is why the sample mean, $\bar x$, is called an estimator of the population mean, $\mu$. In theory, the mean of the sampling distribution is equal to the mean of the population:

$$\mu_{\bar x} = \mu \, .$$
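The relationship $\mu_{\bar x} = \mu$ can be illustrated with a small simulation. The following is a minimal sketch using only the Python standard library. The population (10,000 uniformly distributed values), the sample size, the number of samples and the seed are arbitrary assumptions chosen for illustration; they do not come from a real data set.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical population: 10,000 values drawn uniformly between 0 and 100
population = [rng.uniform(0, 100) for _ in range(10_000)]
mu = statistics.fmean(population)  # population mean

n = 30             # size of each sample
n_samples = 2_000  # number of samples drawn

# Draw each sample with replacement and store its mean
sample_means = [
    statistics.fmean(rng.choices(population, k=n))
    for _ in range(n_samples)
]

# Mean of all sample means, an empirical counterpart of mu_xbar
mu_xbar = statistics.fmean(sample_means)

print(f"population mean:      {mu:.3f}")
print(f"mean of sample means: {mu_xbar:.3f}")
```

With these settings the two printed values should lie close together, and the agreement should tend to improve as `n_samples` grows.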

The standard deviation of a sampling distribution is given a special name, the standard error. The standard error of a statistic describes the degree to which the computed statistics may be expected to differ from one another when calculated from samples of similar size that are selected from similar population models. For the sample mean, the standard error is denoted as $\sigma_{\bar x}$. The larger the standard error of a given statistic, the greater the differences between the values of that statistic computed for different samples (Lovric 2011).

However, please note that the standard error, $\sigma_{\bar x}$, is not equal to the standard deviation, $\sigma$, of the population distribution (unless $n = 1$). The standard error is equal to the standard deviation of the population divided by the square root of the sample size:

$$ \sigma_{\bar x} = \frac{\sigma}{\sqrt{n}} \, .$$
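As a rough empirical check of this formula, the sketch below compares $\sigma/\sqrt{n}$ with the standard deviation of many simulated sample means. The population (10,000 normally distributed values with mean 50 and standard deviation 10), the sample size and the seed are illustrative assumptions. The samples are drawn with replacement using `random.Random.choices`.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical population: 10,000 values from a normal distribution
population = [rng.gauss(50, 10) for _ in range(10_000)]
sigma = statistics.pstdev(population)  # population standard deviation

n = 25             # size of each sample
n_samples = 5_000  # number of samples drawn

# Means of samples drawn with replacement
sample_means = [
    statistics.fmean(rng.choices(population, k=n))
    for _ in range(n_samples)
]

se_theory = sigma / math.sqrt(n)               # sigma / sqrt(n)
se_empirical = statistics.stdev(sample_means)  # spread of the simulated means

print(f"sigma:                   {sigma:.3f}")
print(f"theoretical std. error:  {se_theory:.3f}")
print(f"empirical std. error:    {se_empirical:.3f}")
```

The empirical value should be close to the theoretical one, and both should be clearly smaller than $\sigma$, since each sample mean averages over $n = 25$ observations.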

The equation $\sigma_{\bar x} = \sigma/\sqrt{n}$ holds exactly only when the sampling is done either with replacement from a finite population or with or without replacement from an infinite population. When sampling without replacement from a finite population, it remains a good approximation as long as the sample size $(n)$ is small in comparison to the population size $(N)$. The sample size is considered to be small compared to the population size if it is equal to or less than 5% of the population size, that is, if

$$ \frac{n}{N} \le 0.05 \, .$$

If this condition is not satisfied, $\sigma_{\bar x}$ is calculated with the following equation, which includes a finite population correction factor:

$$ \sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1} } \, .$$

In most practical applications the sample size is small compared to the population size.
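The two formulas and the 5% rule can be combined in a small helper. This is a minimal sketch; the function name `standard_error`, its parameters and the `threshold` argument are hypothetical names introduced here for illustration, not part of any library.

```python
import math


def standard_error(sigma, n, N=None, threshold=0.05):
    """Standard error of the sample mean (hypothetical helper).

    sigma     : population standard deviation
    n         : sample size
    N         : population size, or None for an infinite population
                (or sampling with replacement)
    threshold : largest ratio n / N for which no correction is applied
    """
    se = sigma / math.sqrt(n)
    # Apply the finite population correction only when the sample
    # is not small compared to the population (n / N > threshold)
    if N is not None and n / N > threshold:
        se *= math.sqrt((N - n) / (N - 1))
    return se


# n / N = 0.0025, so the simple formula is used
print(f"{standard_error(10, 25, N=10_000):.3f}")  # 2.000

# n / N = 0.25, so the correction factor is applied
print(f"{standard_error(10, 25, N=100):.3f}")     # 1.741
```

In the second call the sample covers a quarter of the population, so the correction factor $\sqrt{(N-n)/(N-1)} = \sqrt{75/99}$ reduces the standard error noticeably.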

Citation

The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.