So far, we have focused on hypothesis tests for one population mean. In many applications, however, we want to compare the means of two or more populations. To do so, we first have to distinguish between samples from two populations that are independent, called independent samples, and samples from populations that are not independent, called paired samples. In the following sections, we discuss inferential procedures for comparing the means of two populations.
We will denote the parameters and the statistics of populations 1 and 2 with subscripts 1 and 2, respectively. Thus, $\mu_{1}$ and $\sigma_{1}$ are the population parameters of population 1, and $\mu_{2}$ and $\sigma_{2}$ are those of population 2. Similarly, $\bar {x}_{1}$, $s_{1}$ and $n_{1}$ are the sample mean, the sample standard deviation and the sample size of the sample drawn from population 1, whereas $\bar {x}_{2}$, $s_{2}$ and $n_{2}$ correspond to the sample drawn from population 2.
For independent samples of sizes $n_{1}$ and $n_{2}$ drawn from population 1 and population 2, the mean of all possible differences between the two sample means equals the difference between the two population means:
$$\mu_{\bar{x}_{1} - \bar{x}_{2}} = \mu_{1} - \mu_{2}$$
Further, the standard deviation of all possible differences between the two sample means equals the square root of the sum of the population variances, each divided by the corresponding sample size:
$$\sigma_{\bar{x}_{1} - \bar{x}_{2}} = \sqrt {\frac {\sigma^{2}_{1}} {n_{1}} + \frac {\sigma^{2}_{2}} {n_{2}}}$$
If the variable is normally distributed in both populations, the difference of the sample means, $\bar{x}_{1} - \bar{x}_{2}$, is normally distributed, too. If both sample sizes are large enough (recall the central limit theorem), it is at least approximately normally distributed.
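The two formulas for the mean and the standard deviation of $\bar{x}_{1} - \bar{x}_{2}$ can be checked with a small simulation. The following is a minimal sketch using only the Python standard library. The population parameters, sample sizes and number of repetitions are hypothetical values chosen for illustration, and both populations are assumed to be normally distributed.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so that the run is reproducible

# Hypothetical population parameters and sample sizes (illustrative only)
mu_1, sigma_1, n_1 = 10.0, 2.0, 30
mu_2, sigma_2, n_2 = 8.0, 3.0, 40
n_rep = 10_000  # number of simulated pairs of independent samples


def sample_mean(mu, sigma, n):
    """Draw n values from a normal population and return their mean."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))


# Difference of the two sample means for each simulated pair of samples
diffs = [
    sample_mean(mu_1, sigma_1, n_1) - sample_mean(mu_2, sigma_2, n_2)
    for _ in range(n_rep)
]

# Theoretical values according to the two formulas above
theory_mean = mu_1 - mu_2
theory_sd = math.sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)

# Empirical values from the simulated differences
sim_mean = statistics.fmean(diffs)
sim_sd = statistics.stdev(diffs)

print(f"mean of differences: simulated {sim_mean:.3f}, theory {theory_mean:.3f}")
print(f"sd of differences:   simulated {sim_sd:.3f}, theory {theory_sd:.3f}")
```

With a sufficiently large number of repetitions, the simulated mean and standard deviation should lie close to the theoretical values. The agreement is only approximate because the simulation draws a finite number of samples.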
Hypothesis testing procedures for two population means follow the same steps as those for one population mean. Please note that in the following sections we focus on the p-value approach and no longer discuss the critical value approach, so the hypothesis testing procedure is slightly revised. The step-wise procedure for hypothesis tests is summarized as follows:
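To illustrate the p-value approach for two independent samples, the sketch below builds a two-sided test directly on the sampling distribution of $\bar{x}_{1} - \bar{x}_{2}$ given above. It rests on strong assumptions that are not part of the text: the population standard deviations $\sigma_{1}$ and $\sigma_{2}$ are treated as known, the difference of the sample means is treated as normally distributed, and the null hypothesis is $H_0: \mu_{1} = \mu_{2}$. The function name `two_sample_z_test`, the significance level and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_z_test(xbar_1, xbar_2, sigma_1, sigma_2, n_1, n_2, alpha=0.05):
    """Two-sided z-test of H0: mu_1 = mu_2, assuming known population sds.

    Returns the test statistic z, the p-value and the decision (True = reject H0).
    """
    # Standard deviation of the difference of the two sample means
    sd_diff = math.sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
    # Under H0 the expected difference mu_1 - mu_2 is zero
    z = (xbar_1 - xbar_2) / sd_diff
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Decision rule of the p-value approach: reject H0 if p <= alpha
    return z, p_value, p_value <= alpha


# Hypothetical sample means, known standard deviations and sample sizes
z, p, reject = two_sample_z_test(
    xbar_1=10.4, xbar_2=9.1, sigma_1=2.0, sigma_2=3.0, n_1=30, n_2=40
)
print(f"z = {z:.3f}, p-value = {p:.4f}, reject H0: {reject}")
```

In practice the population standard deviations are rarely known, so tests based on the sample standard deviations $s_{1}$ and $s_{2}$ are generally used instead. This sketch only shows how a test statistic, a p-value and a decision fit together.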
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us via e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.