Hypothesis Tests for Two Population Means

So far we have focused on hypothesis tests for one population mean. However, in many applications we want to compare the means of two or more populations. In the following sections we discuss inferential procedures for comparing the means of two populations. To do so, we first have to distinguish between samples that are independent and samples that are not independent, which are called paired samples.

In the following sections we denote the parameters and the statistics of population 1 and population 2 with the subscripts 1 and 2, respectively. Thus, $\mu_1$ and $\sigma_1$ are the population parameters of population 1, and $\mu_2$ and $\sigma_2$ are those of population 2. Similarly, $\bar x_1$, $s_1$ and $n_1$ are the sample mean, the sample standard deviation and the sample size of the sample drawn from population 1, whereas $\bar x_2$, $s_2$ and $n_2$ correspond to the sample drawn from population 2.

For independent samples of sizes $n_1$ and $n_2$ drawn from population 1 and population 2, the mean of all possible differences between the two sample means equals the difference between the two population means:

$\mu_{\bar x_1 -\bar x_2} = \mu_1-\mu_2$

Further, the standard deviation of all possible differences between the two sample means equals the square root of the sum of the population variances, each divided by the corresponding sample size:

$\sigma_{\bar x_1 -\bar x_2} = \sqrt{\frac{\sigma^2_1}{n_1} + \frac{\sigma^2_2}{n_2}}$

If the variable is normally distributed in both populations, or if both sample sizes are large enough (recall the central limit theorem), the difference of the sample means ($\bar x_1 - \bar x_2$) is (at least approximately) normally distributed, too.
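The two formulas above can be checked numerically. The following is a minimal sketch in Python (standard library only), not part of the original material: the function names and all parameter values are hypothetical and chosen for illustration. It computes the theoretical mean and standard deviation of $\bar x_1 - \bar x_2$ and compares them with a small simulation that assumes normally distributed populations.

```python
import math
import random
import statistics


def diff_of_means_distribution(mu1, sigma1, n1, mu2, sigma2, n2):
    """Theoretical mean and standard deviation of x1_bar - x2_bar
    for independent samples of sizes n1 and n2."""
    mean_diff = mu1 - mu2
    sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return mean_diff, sd_diff


def simulate_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps=5000, seed=1):
    """Draw `reps` pairs of independent samples from two normal populations
    (an assumption of this sketch) and return the empirical mean and
    standard deviation of the differences between the sample means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        x1_bar = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        x2_bar = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(x1_bar - x2_bar)
    return statistics.fmean(diffs), statistics.stdev(diffs)


# Hypothetical population parameters and sample sizes.
params = dict(mu1=10, sigma1=3, n1=36, mu2=8, sigma2=4, n2=64)

theory_mean, theory_sd = diff_of_means_distribution(**params)
sim_mean, sim_sd = simulate_differences(**params)

print("theoretical mean and sd:", theory_mean, round(theory_sd, 4))
print("simulated mean and sd:  ", round(sim_mean, 4), round(sim_sd, 4))
```

With these values the theoretical standard deviation is $\sqrt{9/36 + 16/64} = \sqrt{0.5}$, and the simulated values should lie close to the theoretical ones, with the gap shrinking as `reps` grows.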

Hypothesis testing procedures for two population means follow the same steps as those for one population mean. Please note that in the following sections we focus on the p-value approach and no longer discuss the critical value approach. The hypothesis testing procedure is therefore slightly revised. The step-wise procedure for hypothesis tests is summarized as follows:

$\begin{array}{ll} \hline \text{Step 1} & \text{State the null hypothesis, } H_0 \text{, and alternative hypothesis, } H_A \text{.} \\ \text{Step 2} & \text{Decide on the significance level, } \alpha\text{.} \\ \text{Step 3} & \text{Compute the value of the test statistic.} \\ \text{Step 4} & \text{Determine the p-value.} \\ \text{Step 5} & \text{If } p \le \alpha \text{, reject } H_0 \text{; otherwise, do not reject } H_0 \text{.} \\ \text{Step 6} & \text{Interpret the result of the hypothesis test.} \\ \hline \end{array}$
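The six steps above can be sketched in code. The following Python example is an illustration under stated assumptions, not the procedure developed in the following sections: it assumes that both population standard deviations are known, so that a z statistic based on $\sigma_{\bar x_1 -\bar x_2}$ can be used, and that the null hypothesis is $H_0: \mu_1 = \mu_2$. The function name `two_mean_z_test` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist


def two_mean_z_test(x1_bar, sigma1, n1, x2_bar, sigma2, n2,
                    alpha=0.05, alternative="two-sided"):
    """p-value approach for H0: mu1 = mu2, assuming independent samples
    and known population standard deviations (assumption of this sketch)."""
    # Step 3: compute the value of the test statistic.
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (x1_bar - x2_bar) / se

    # Step 4: determine the p-value from the standard normal distribution.
    std_normal = NormalDist()
    if alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")

    # Step 5: reject H0 if p <= alpha.
    reject_h0 = p_value <= alpha
    return z, p_value, reject_h0


# Step 1: H0: mu1 = mu2 versus HA: mu1 != mu2 (two-sided).
# Step 2: significance level alpha = 0.05.
# The sample statistics below are made up for illustration.
z, p_value, reject_h0 = two_mean_z_test(
    x1_bar=10.5, sigma1=3, n1=36,
    x2_bar=8.0, sigma2=4, n2=64,
    alpha=0.05, alternative="two-sided",
)

# Step 6: interpret the result.
print(f"z = {z:.3f}, p-value = {p_value:.5f}")
if reject_h0:
    print("Reject H0: the data provide evidence that the means differ.")
else:
    print("Do not reject H0: the data do not provide evidence that the means differ.")
```

When the population standard deviations are unknown, which is the usual case in practice, a t statistic based on the sample standard deviations $s_1$ and $s_2$ would replace the z statistic in Step 3; the remaining steps stay the same.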

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.