
Just as the population distribution can be described with certain parameters, so can the sampling distribution. The expected value (mean) of any distribution can be represented by the symbol \(\mu\) (mu). In the case of the sampling distribution, the mean \(\mu\) is often written with a subscript to indicate which sampling distribution is being described. For example, the expected value of the sampling distribution of the mean is represented by the symbol \(\mu_{\bar x}\). The value of \(\mu_{\bar x}\) can be thought of as the theoretical mean of the distribution of sample means.

If we draw a large number of samples of the same size from a population and calculate their means, the average of all these sample means will approximate the mean \(\mu\) of the population. That is why the sample mean, \(\bar x\), is called an estimator of the population mean \(\mu\). Formally, the mean of the sampling distribution, \(\mu_{\bar x}\), is equal to the mean of the population:

\[\mu_{\bar x} = \mu\]
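
The relationship above can be checked with a small simulation. The following is a minimal sketch, not part of the original material: the normal population with mean 50 and standard deviation 10, the sample size of 30 and the 5000 repetitions are all illustrative assumptions.

```r
# Hypothetical population: normal with mean 50 and standard deviation 10
set.seed(1)
mu <- 50
sigma <- 10
n <- 30            # size of each sample
n_samples <- 5000  # number of repeated samples

# Draw n_samples samples of size n and store the mean of each one
sample_means <- replicate(n_samples, mean(rnorm(n, mean = mu, sd = sigma)))

# The average of the sample means should lie close to mu
mean_of_means <- mean(sample_means)
print(mean_of_means)
```

With these settings, the printed value should lie very close to the assumed population mean of 50. It will not match exactly, because only a finite number of samples is drawn.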

The standard deviation of a sampling distribution is given a special name, the standard error. The standard error of a statistic describes the degree to which the computed statistic may be expected to differ from one calculated from another sample of similar size selected from a similar population model. For the sampling distribution of the mean, the standard error is denoted as \(\sigma_{\bar x}\). The larger the standard error of a given statistic, the greater the differences between the statistics computed for different samples (Lovric 2010).

Note: The standard error, \(\sigma_{\bar x}\), is not equal to the standard deviation, \(\sigma\), of the population distribution (unless \(n = 1\)).

The standard error is equal to the standard deviation of the population divided by the square root of the sample size:

\[ \sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}\]
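
To illustrate this formula, the sketch below compares the standard deviation of simulated sample means with \(\sigma / \sqrt{n}\) for several sample sizes. The population (normal, mean 0, \(\sigma = 10\)), the sample sizes and the number of repetitions are assumptions chosen only for illustration.

```r
set.seed(2)
sigma <- 10
sample_sizes <- c(5, 25, 100)
n_samples <- 5000  # repeated samples per sample size

# Empirical standard error: standard deviation of the simulated sample means
empirical_se <- sapply(sample_sizes, function(n) {
  sd(replicate(n_samples, mean(rnorm(n, mean = 0, sd = sigma))))
})

# Theoretical standard error: sigma / sqrt(n)
theoretical_se <- sigma / sqrt(sample_sizes)

print(data.frame(n = sample_sizes, empirical_se, theoretical_se))
```

The empirical and theoretical columns should agree closely, and both shrink as \(n\) grows: quadrupling the sample size halves the standard error.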

The formula \(\sigma_{\bar x} = \sigma / \sqrt{n}\) holds exactly only when the sampling is done either with replacement from a finite population, or with or without replacement from an infinite population. When sampling without replacement from a finite population, it remains a good approximation as long as the sample size \((n)\) is small in comparison to the population size \((N)\). The sample size is considered small compared to the population size if it is equal to or less than 5% of the population size, that is, if

\[ \frac{n}{N} \le 0.05.\]

If this condition is not satisfied, the following equation, which includes the finite population correction factor, is used to calculate \(\sigma_{\bar x}\):

\[ \sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1} }\]

In most practical applications, however, the sample size is small compared to the population size.
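
The two formulas for \(\sigma_{\bar x}\) and the 5% rule can be combined in one small helper. This is a minimal sketch: the function name `standard_error_mean()` and the example values are hypothetical and do not come from an existing package.

```r
# Hypothetical helper: standard error of the mean, optionally with the
# finite population correction (N = Inf stands for an infinite population)
standard_error_mean <- function(sigma, n, N = Inf) {
  se <- sigma / sqrt(n)
  if (n / N > 0.05) {
    # Sample is large relative to the population: apply the correction factor
    se <- se * sqrt((N - n) / (N - 1))
  }
  se
}

# n / N = 0.0001, so the simple formula is used
se_small_fraction <- standard_error_mean(sigma = 10, n = 100, N = 1e6)

# n / N = 0.2, so the correction factor is applied
se_large_fraction <- standard_error_mean(sigma = 10, n = 100, N = 500)

print(c(se_small_fraction, se_large_fraction))
```

In the second call the sample covers 20% of the population, so the corrected standard error is smaller than \(\sigma / \sqrt{n}\).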

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.