If the population standard deviation (\(\sigma\)) is known, a hypothesis test performed for one population mean is called a one-mean \(z\)-test or simply a \(z\)-test.

A \(z\)-test is a hypothesis test for testing a population mean, \(\mu\), against a supposed population mean, \(\mu_0\). The \(z\)-test assumes either a normally distributed variable or a large sample size; in the latter case the central limit theorem ensures that the sampling distribution of the mean is approximately normal. In addition, the standard deviation of the population, \(\sigma\), must be known. In real-life applications this assumption is almost never fulfilled, so the \(z\)-test is rarely applied. However, it is the simplest hypothesis test and thus a good starting point.
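The effect of these assumptions can be checked with a small simulation. The sketch below is illustrative only: the population parameters (mean 70, standard deviation 8), the seed and the number of replications are arbitrary assumptions, not values from this tutorial. If the population is normal, \(\sigma\) is known and \(H_0\) is true, the share of samples with \(|z| > 1.96\) should be close to \(\alpha = 0.05\).

```r
# Illustrative simulation; all numbers below are assumptions chosen for this sketch.
set.seed(42)
mu0   <- 70      # hypothesized (and here also true) population mean
sigma <- 8       # known population standard deviation
n     <- 14      # sample size
reps  <- 10000   # number of simulated samples
alpha <- 0.05

# draw `reps` samples from a normal population and compute one z statistic per sample
z_sim <- replicate(reps, {
  x <- rnorm(n, mean = mu0, sd = sigma)
  (mean(x) - mu0) / (sigma / sqrt(n))
})

# share of two-sided rejections; should be close to alpha when H0 is true
rejection_rate <- mean(abs(z_sim) > qnorm(1 - alpha / 2))
rejection_rate
```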

To perform the \(z\)-test we follow the stepwise procedure shown in the table below. First, we showcase the critical value approach; then we repeat the analysis using the p-value approach.

\[ \begin{array}{ll} \hline \ \text{Step 1} & \text{State the null hypothesis } H_0 \text{ and alternative hypothesis } H_A \text{.}\\ \ \text{Step 2} & \text{Decide on the significance level, } \alpha\text{.} \\ \ \text{Step 3} & \text{Compute the value of the test statistic.} \\ \ \text{Step 4a} & \text{Critical value approach: Determine the critical value.} \\ \ \text{Step 4b} &\text{P-value approach: Determine the p-value.} \\ \ \text{Step 5a} & \text{Critical value approach: If the value of the test statistic falls in the rejection region, reject } H_0 \text{; otherwise, do not reject } H_0 \text{.} \\ \ \text{Step 5b} & \text{P-value approach: If } p \le \alpha \text{, reject }H_0 \text{; otherwise, do not reject } H_0 \text{.} \\ \ \text{Step 6} &\text{Interpret the result of the hypothesis test.} \\ \hline \end{array} \]
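The steps in the table can be bundled into a single function. The following is a minimal sketch, not an existing R function: the name one_mean_z_test() and its arguments are hypothetical, and the numbers in the usage line are made up for illustration.

```r
# Hypothetical helper (not part of base R or of this tutorial's code).
one_mean_z_test <- function(x_bar, mu0, sigma, n,
                            alternative = c("two.sided", "less", "greater"),
                            alpha = 0.05) {
  alternative <- match.arg(alternative)
  # Step 3: test statistic
  z <- (x_bar - mu0) / (sigma / sqrt(n))
  # Step 4a: critical value; Step 4b: p-value; Step 5a: rejection region
  if (alternative == "two.sided") {
    critical  <- qnorm(1 - alpha / 2)            # reject if |z| > critical
    p_value   <- 2 * pnorm(abs(z), lower.tail = FALSE)
    reject_cv <- abs(z) > critical
  } else if (alternative == "less") {
    critical  <- qnorm(alpha)                    # reject if z < critical
    p_value   <- pnorm(z)
    reject_cv <- z < critical
  } else {
    critical  <- qnorm(1 - alpha)                # reject if z > critical
    p_value   <- pnorm(z, lower.tail = FALSE)
    reject_cv <- z > critical
  }
  # Step 5b: compare the p-value with alpha
  list(z = z, critical = critical, p_value = p_value,
       reject_critical = reject_cv, reject_p = p_value <= alpha)
}

# usage with made-up numbers: z = (52 - 50) / (5 / sqrt(25)) = 2
res <- one_mean_z_test(x_bar = 52, mu0 = 50, sigma = 5, n = 25,
                       alternative = "greater")
res$z
res$reject_p
```

Step 1, Step 2 and Step 6 remain the analyst's task: the function only takes the chosen alternative and \(\alpha\) as inputs and returns the numbers needed for the interpretation.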


One-mean \(z\)-test: An example

In this section we work with the students data set, which is provided as the file students.csv. Import the data set and assign a proper name to it:

students <- read.csv("https://userpage.fu-berlin.de/soga/data/raw-data/students.csv")

The students data set consists of 8239 rows, each of them representing a particular student, and 16 columns, each of them corresponding to a variable/feature related to that particular student. These self-explanatory variables are: stud.id, name, gender, age, height, weight, religion, nc.score, semester, major, minor, score1, score2, online.tutorial, graduated, salary.

In order to showcase hypothesis testing we examine the average weight of students and compare it to the average weight of European adults. Walpole et al. (2012) published data on the average body mass (kg) per region, including Europe. They report the average body mass for the European adult population to be 70.8 kg. We set the population mean accordingly: \(\mu_0 = 70.8\). Unfortunately, owing to their methodological approach, Walpole et al. (2012) did not provide a standard deviation (\(\sigma\)) of the weights of European adults. For demonstration purposes we assume that the weight data given in the students data set is a good approximation for the population of interest. Thus, we set \(\sigma\) to the standard deviation of the weight variable in the students data set:

mu0 <- 70.8
sigma <- sd(students$weight)
sigma
## [1] 8.635162
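Note that sd() in R computes the sample standard deviation, which divides by \(n - 1\), whereas a population standard deviation divides by \(n\). The sketch below illustrates the difference on a small made-up vector (the values are an assumption, not taken from students.csv) and shows that for 8239 observations the correction factor is practically 1.

```r
# Made-up weights in kg, for illustration only (not taken from students.csv)
w   <- c(62.5, 70.1, 81.3, 74.8, 68.0, 77.4)
n_w <- length(w)

sd_sample     <- sd(w)                        # divides by n - 1
sd_population <- sqrt(mean((w - mean(w))^2))  # divides by n

# the two differ by the factor sqrt((n - 1) / n)
all.equal(sd_population, sd_sample * sqrt((n_w - 1) / n_w))

# for n = 8239 the factor is practically 1
sqrt((8239 - 1) / 8239)
```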

Further, we take one random sample with a sample size of \(n=14\). The sample consists of the weights in kg of 14 randomly picked students from the students data set. Finally, we calculate the sample mean (\(\bar x\)), which is our sample statistic of interest. The sample statistic is assigned to the variable x_bar.

n <- 14
# random sampling
x_weight <- sample(x = students$weight, size = n)
# calculate sample mean
x_bar <- mean(x_weight)
x_bar
## [1] 74.89286
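Because sample() draws at random (without replacement by default), rerunning the code above will generally give a different sample and a different mean. A seed makes the draw reproducible. The sketch below uses simulated stand-in weights and an arbitrary seed; both are assumptions for illustration, not the data or the seed behind the value shown above.

```r
# Stand-in for students$weight: simulated values, not the real data
set.seed(1)
weights_demo <- round(rnorm(1000, mean = 73, sd = 8.6), 1)

set.seed(2023)                       # arbitrary seed, chosen for illustration
draw_1 <- sample(x = weights_demo, size = 14)

set.seed(2023)                       # same seed, therefore the same draw
draw_2 <- sample(x = weights_demo, size = 14)

identical(draw_1, draw_2)
mean(draw_1)
```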

Hypothesis testing: The critical value approach

Step 1: State the null hypothesis, \(H_0\), and alternative hypothesis, \(H_A\).

The null hypothesis states that the average weight of students (\(\mu\)) equals the average weight of European adults of 70.8 kg (\(\mu_0\)) as reported by Walpole et al. (2012). In other words, there is no difference in the mean weight of students and the mean weight of European adults.

\[H_0: \quad \mu = 70.8 \]

For the purpose of illustration we test three alternative hypotheses:

Alternative hypothesis 1: The average weight of students does not equal the average weight of European adults. In other words, there is a difference in the mean weight of students and the mean weight of European adults. \[H_{A_1}: \quad \mu \ne 70.8 \]

Alternative hypothesis 2: The average weight of students is less than the average weight of European adults. \[H_{A_2}: \quad \mu < 70.8 \]

Alternative hypothesis 3: The average weight of students is larger than the average weight of European adults. \[H_{A_3}: \quad \mu > 70.8 \]


Step 2: Decide on the significance level, \(\alpha\).

\[\alpha = 0.05\]

alpha <- 0.05

Step 3: Compute the value of the test statistic.

The following equation is applied to calculate the test statistic \(z\):

\[z = \frac{\bar x-\mu_0}{\sigma/\sqrt{n}}\]

For our example:

\[z = \frac{\bar x-\mu_0}{\sigma/\sqrt{n}} = \frac{74.89-70.8}{8.64/ \sqrt{14}} \approx 1.77 \]

z <- (x_bar - mu0) / (sigma / sqrt(n))
z
## [1] 1.773455

Step 4a: Determine the critical value.

In order to calculate the critical value we apply the qnorm() function in R. Recall that we test three alternative hypotheses (\(H_{A_1}\), \(H_{A_2}\) and \(H_{A_3}\)). Thus, we have to calculate three critical values as well (\(z_{A_1} = \pm z_{\alpha/2}\), \(z_{A_2} = -z_\alpha\) and \(z_{A_3} = +z_\alpha\)).

z_ha1 <- qnorm(1 - alpha / 2)  # two-tailed: reject if |z| > z_ha1
z_ha2 <- qnorm(alpha)          # left-tailed: reject if z < z_ha2
z_ha3 <- qnorm(1 - alpha)      # right-tailed: reject if z > z_ha3

The critical values are \(z_{A_1} \approx \pm 1.96\), \(z_{A_2} \approx -1.64\) and \(z_{A_3} \approx 1.64\).
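As a quick check of these values, the sketch below recomputes the three critical values and verifies two properties of the standard normal distribution: the lower and upper quantiles differ only in sign, and pnorm() inverts qnorm(). The vector name crit is chosen here for illustration only.

```r
alpha <- 0.05

# critical values for the three alternatives
crit <- c(two_sided = qnorm(1 - alpha / 2),
          left      = qnorm(alpha),
          right     = qnorm(1 - alpha))
round(crit, 2)

# symmetry of the standard normal: lower and upper quantiles differ only in sign
all.equal(qnorm(alpha), -qnorm(1 - alpha))

# pnorm() inverts qnorm(): the area to the left of the right-tailed
# critical value is 1 - alpha
pnorm(crit[["right"]])
```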


Step 5a: If the value of the test statistic falls in the rejection region, reject \(H_0\); otherwise, do not reject \(H_0\).

The value of the test statistic found in Step 3 is \(z \approx 1.77\). Recall that we are investigating three alternative hypotheses (\(H_{A_1}\), \(H_{A_2}\) and \(H_{A_3}\)). Thus, we evaluate the rejection region for each particular hypothesis.

Recall the critical values for \(H_{A_1}\): \[z_{A_1} = \pm z_{\alpha/2} = \pm 1.96\]

Does the test statistic (\(z \approx 1.77\)) fall in the rejection region? Be aware that the test is two-tailed, meaning we evaluate the upper and the lower limit:

\[1.77 > 1.96\]

# upper limit
# Reject?
print(z > abs(z_ha1))
## [1] FALSE

\[1.77 < -1.96\]

# lower limit
# Reject?
print(z < -abs(z_ha1))
## [1] FALSE

Based on the numerical and graphical evaluation the value does not fall in the rejection region, so we do not reject \(H_0\). The test results are not statistically significant at the 5 % level.

Recall the critical value for \(H_{A_2}\): \[z_{A_2} = -z_{\alpha} = -1.64\]

Does the test statistic (\(z \approx 1.77\)) fall in the rejection region?

\[1.77 < -1.64\]

# Reject?
print(z < z_ha2)
## [1] FALSE

Based on the numerical and graphical evaluation the value does not fall in the rejection region, so we do not reject \(H_0\). The test results are not statistically significant at the 5 % level.

Recall the critical value for \(H_{A_3}\): \[z_{A_3} = +z_{\alpha} = 1.64\]

Does the test statistic (\(z \approx 1.77\)) fall in the rejection region?

\[1.77 > 1.64\]

# Reject?
print(z > z_ha3)
## [1] TRUE

Based on the numerical and graphical evaluation the value does fall in the rejection region, so we reject \(H_0\). The test results are statistically significant at the 5 % level.


Step 6: Interpret the result of the hypothesis test.

At the 5 % significance level, the data does not provide sufficient evidence to conclude that the average weight of students differs from the average weight of European adults.

At the 5 % significance level, the data does not provide sufficient evidence to conclude that the average weight of students is less than the average weight of European adults.

At the 5 % significance level, the data provides sufficient evidence to conclude that the average weight of students is larger than the average weight of European adults.


Hypothesis testing: The p-value approach

Step 1: State the null hypothesis, \(H_0\), and alternative hypothesis, \(H_A\).

The null hypothesis states that the average weight of students (\(\mu\)) equals the average weight of European adults of 70.8 kg (\(\mu_0\)) as reported by Walpole et al. (2012). In other words, there is no difference in the mean weight of students and the mean weight of European adults.

\[H_0: \quad \mu = 70.8 \]

For the purpose of illustration we test three alternative hypotheses:

Alternative hypothesis 1: The average weight of students does not equal the average weight of European adults. In other words, there is a difference in the mean weight of students and the mean weight of European adults. \[H_{A_1}: \quad \mu \ne 70.8 \]

Alternative hypothesis 2: The average weight of students is less than the average weight of European adults. \[H_{A_2}: \quad \mu < 70.8 \]

Alternative hypothesis 3: The average weight of students is larger than the average weight of European adults. \[H_{A_3}: \quad \mu > 70.8 \]


Step 2: Decide on the significance level, \(\alpha\).

\[\alpha = 0.05\]

alpha <- 0.05

Step 3: Compute the value of the test statistic.

The following equation is applied to calculate the test statistic \(z\):

\[z = \frac{\bar x-\mu_0}{\sigma/\sqrt{n}}\]

For our example:

\[z = \frac{\bar x-\mu_0}{\sigma/\sqrt{n}} = \frac{74.89-70.8}{8.64/ \sqrt{14}} \approx 1.77 \]

z <- (x_bar - mu0) / (sigma / sqrt(n))
z
## [1] 1.773455

Step 4b: Determine the p-value.

In order to calculate the p-value we apply the pnorm() function in R. Recall that we test three alternative hypotheses (\(H_{A_1}\), \(H_{A_2}\) and \(H_{A_3}\)). Thus, we calculate three p-values as well (\(P(z_{A_1})\), \(P(z_{A_2})\) and \(P(z_{A_3})\)).

# the probability of observing a z-value at least as large in magnitude as |z|, given the null hypothesis is true
upper <- pnorm(abs(z), lower.tail = FALSE)
lower <- pnorm(-abs(z), lower.tail = TRUE)
p_z_1 <- upper + lower
p_z_1
## [1] 0.07615337

From Step 3, the value of the test statistic is \(z \approx 1.77\). The test is two-sided, so the p-value is the probability of observing a value of \(z\) of 1.77 or greater, or of -1.77 or lower, that is, a value at least as large in magnitude as 1.77. That probability corresponds to the colored area in the figure below, which is \(2 \times 0.038 \approx 0.076\). Thus, \(p \approx 0.076\).

# the probability of observing a z-value equal to or smaller than z, given the null hypothesis is true
p_z_2 <- pnorm(z, lower.tail = TRUE)
p_z_2
## [1] 0.9619233

From Step 3, the value of the test statistic is \(z \approx 1.77\). The test is left-tailed, so the p-value is the probability of observing a value of \(z\) of 1.77 or lower. That probability corresponds to the colored area in the figure below, which is 0.962. Thus, \(p \approx 0.962\).

# the probability of observing a z-value equal to or greater than z, given the null hypothesis is true
p_z_3 <- pnorm(z, lower.tail = FALSE)
p_z_3
## [1] 0.03807669

From Step 3, the value of the test statistic is \(z \approx 1.77\). The test is right-tailed, so the p-value is the probability of observing a value of \(z\) of 1.77 or greater. That probability corresponds to the colored area in the figure below, which is 0.038. Thus, \(p \approx 0.038\).
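The three p-values are closely related, which the following sketch makes explicit. It starts from the rounded test statistic reported in Step 3; the variable names p_left, p_right and p_two are chosen here for illustration. The two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of the two.

```r
z <- 1.773455   # test statistic from Step 3

p_left  <- pnorm(z)                       # P(Z <= z), left-tailed test
p_right <- pnorm(z, lower.tail = FALSE)   # P(Z >= z), right-tailed test
p_two   <- 2 * min(p_left, p_right)       # both tails, two-tailed test

# the one-sided p-values are complementary
all.equal(p_left + p_right, 1)

round(c(two_sided = p_two, left = p_left, right = p_right), 3)
```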


Step 5b: If \(p \le \alpha\), reject \(H_0\); otherwise, do not reject \(H_0\).

The p-value of the test statistic found in Step 4b is compared to the user-defined significance level \(\alpha\) of 5 %. Recall that we are investigating three alternative hypotheses (\(H_{A_1}\), \(H_{A_2}\) and \(H_{A_3}\)). Thus, we make the comparison for each particular hypothesis.

\[0.076 \le 0.05 \]

# Reject?
p_z_1 <= alpha
## [1] FALSE

The p-value is greater than the specified significance level of 0.05. We do not reject \(H_0\). The test results are not statistically significant at the 5 % level and do not provide sufficient evidence against the null hypothesis.

\[0.962 \le 0.05 \]

# Reject?
p_z_2 <= alpha
## [1] FALSE

The p-value is greater than the specified significance level of 0.05. We do not reject \(H_0\). The test results are not statistically significant at the 5 % level and do not provide sufficient evidence against the null hypothesis.

\[0.038 \le 0.05 \]

# Reject?
p_z_3 <= alpha
## [1] TRUE

The p-value is less than the specified significance level of 0.05. We reject \(H_0\). The test results are statistically significant at the 5 % level and provide moderate evidence against the null hypothesis.
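The critical value approach and the p-value approach always lead to the same decision, because \(p \le \alpha\) holds exactly when the test statistic lies in the rejection region. The sketch below checks this on an arbitrary grid of possible test statistics (the grid is an assumption made for illustration) for the right-tailed and the two-tailed test.

```r
alpha  <- 0.05
z_grid <- seq(-3, 3, by = 0.01)   # arbitrary grid of possible test statistics

# right-tailed test: decision by rejection region vs. decision by p-value
by_critical <- z_grid > qnorm(1 - alpha)
by_p_value  <- pnorm(z_grid, lower.tail = FALSE) <= alpha

# two-tailed test
by_critical_2 <- abs(z_grid) > qnorm(1 - alpha / 2)
by_p_value_2  <- 2 * pnorm(abs(z_grid), lower.tail = FALSE) <= alpha

identical(by_critical, by_p_value)
identical(by_critical_2, by_p_value_2)
```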


Step 6: Interpret the result of the hypothesis test.

\(p \approx 0.076\). At the 5 % significance level the data does not provide sufficient evidence to conclude that the average weight of students differs from the average weight of European adults.

\(p \approx 0.962\). At the 5 % significance level the data does not provide sufficient evidence to conclude that the average weight of students is less than the average weight of European adults.

\(p \approx 0.038\). At the 5 % significance level the data provides sufficient evidence to conclude that the average weight of students is larger than the average weight of European adults.


Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.