Inferences for one population standard deviation are based on the chi-square (\(\chi^2\)) distribution. A \(\chi^2\)-distribution is a right-skewed probability density curve. The shape of the \(\chi^2\)-curve is determined by its degrees of freedom \((df)\).

In order to perform a hypothesis test for one population standard deviation, we relate a \(\chi^2\)-value to a specified area under a \(\chi^2\)-curve. Either we consult a \(\chi^2\)-table to look up that value or we make use of the R machinery.

Given \(\alpha\), where \(\alpha\) corresponds to a probability between 0 and 1, \(\chi^2_{\alpha}\) denotes the \(\chi^2\)-value having the area \(\alpha\) to its right under a \(\chi^2\)-curve.
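As a small illustration of this notation, the sketch below looks up \(\chi^2_{\alpha}\) in R with `qchisq()` instead of a table. The values \(\alpha = 0.05\) and \(df = 29\) are arbitrary choices for the example, not taken from the text.

```r
# Chi-square value with area alpha to its right, for a given df.
alpha <- 0.05
df <- 29  # illustrative choice, not from the text

# lower.tail = FALSE makes alpha the area to the RIGHT of the returned value
chi2.alpha <- qchisq(alpha, df = df, lower.tail = FALSE)
chi2.alpha

# Equivalent lookup: the (1 - alpha) quantile measured from the left
qchisq(1 - alpha, df = df)

# The reverse direction: area to the right of a given chi-square value
pchisq(chi2.alpha, df = df, lower.tail = FALSE)
```

Note that `qchisq()` uses the left-tail area by default, so the `lower.tail = FALSE` argument is what matches the right-tail convention used here.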


Interval Estimation of \(\sigma\)

The \(100(1-\alpha)\)% confidence interval for \(\sigma\) is

\[\sqrt{\frac{n-1}{\chi^2_{\alpha/2}}} \cdot s \le \sigma \le \sqrt{\frac{n-1}{\chi^2_{1-\alpha/2}}} \cdot s\] where \(n\) is the sample size and \(s\) the standard deviation of the sample data.
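A minimal sketch of this interval in R, scaling the sample standard deviation \(s\) by the two \(\chi^2\)-based factors. The helper name `sd.conf.int` and the simulated data are assumptions made for illustration only; the interval presumes a normally distributed variable.

```r
# Hypothetical helper: confidence interval for sigma of a normal variable
sd.conf.int <- function(x, conf.level = 0.95) {
  n <- length(x)
  s <- sd(x)
  alpha <- 1 - conf.level
  # chi-square values with area alpha/2 and 1 - alpha/2 to their right
  chi2.a <- qchisq(alpha / 2, df = n - 1, lower.tail = FALSE)
  chi2.b <- qchisq(1 - alpha / 2, df = n - 1, lower.tail = FALSE)
  c(lower = sqrt((n - 1) / chi2.a) * s,
    upper = sqrt((n - 1) / chi2.b) * s)
}

# Made-up data, only to show the call
set.seed(1)
x <- rnorm(30, mean = 170, sd = 10)
ci <- sd.conf.int(x, conf.level = 0.95)
ci
```

Because the \(\chi^2\)-distribution is skewed, the interval is not symmetric around \(s\).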


One standard deviation \(\chi^2\)-test

The hypothesis testing procedure for one standard deviation is called the one standard deviation \(\chi^2\)-test. Hypothesis testing for variances follows the same step-wise procedure as hypothesis tests for the mean.

\[
\begin{array}{ll}
\hline
\text{Step 1} & \text{State the null hypothesis } H_0 \text{ and alternative hypothesis } H_A \text{.} \\
\text{Step 2} & \text{Decide on the significance level, } \alpha\text{.} \\
\text{Step 3} & \text{Compute the value of the test statistic.} \\
\text{Step 4} & \text{Determine the p-value.} \\
\text{Step 5} & \text{If } p \le \alpha \text{, reject } H_0 \text{; otherwise, do not reject } H_0 \text{.} \\
\text{Step 6} & \text{Interpret the result of the hypothesis test.} \\
\hline
\end{array}
\]

The test statistic for a hypothesis test with the null hypothesis \(H_0: \,\sigma = \sigma_0\) for a normally distributed variable is given by

\[\chi^2 = \frac{n-1}{\sigma^2_0}s^2 \]

If the null hypothesis is true, the test statistic follows a \(\chi^2\)-distribution with \(n - 1\) degrees of freedom.
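The test statistic and its p-value can be sketched in R as follows. The helper `sd.chisq.test` is a hypothetical name, not a base R function, and the data are simulated for illustration. For the two-sided case the sketch doubles the smaller tail area, which is one common convention and an assumption here.

```r
# Hypothetical helper: one standard deviation chi-square test
sd.chisq.test <- function(x, sigma0,
                          alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  df <- length(x) - 1
  # Test statistic: (n - 1) * s^2 / sigma0^2
  chi2 <- df * var(x) / sigma0^2
  p.left <- pchisq(chi2, df = df)                       # area to the left
  p.right <- pchisq(chi2, df = df, lower.tail = FALSE)  # area to the right
  p.value <- switch(alternative,
    less = p.left,
    greater = p.right,
    two.sided = min(1, 2 * min(p.left, p.right))  # assumed convention
  )
  list(statistic = chi2, df = df, p.value = p.value)
}

# Made-up data, only to show the call
set.seed(1)
x <- rnorm(30, mean = 170, sd = 8)
res <- sd.chisq.test(x, sigma0 = 11, alternative = "less")
res
```

With `alternative = "less"` the p-value is the area to the left of the observed statistic, since small values of \(s^2\) relative to \(\sigma_0^2\) speak against \(H_0\).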

Be aware that the one standard deviation \(\chi^2\)-test is not robust against violations of the normality assumption (Weiss 2010).


One-standard-deviation \(\chi^2\)-test: An example

In order to get some hands-on experience we apply the one standard deviation \(\chi^2\)-test in an exercise. For this we use the students data set. The students.csv file can be read directly from the URL given in the command below. Import the data set and assign a proper name to it.

students <- read.csv("https://userpage.fu-berlin.de/soga/200/2010_data_sets/students.csv")

The students data set consists of 8239 rows, each of them representing a particular student, and 16 columns, each of them corresponding to a variable/feature related to that particular student. These self-explanatory variables are: stud.id, name, gender, age, height, weight, religion, nc.score, semester, major, minor, score1, score2, online.tutorial, graduated, salary.

In order to showcase the one standard deviation \(\chi^2\)-test we examine the spread of the height in cm of female students and compare it to the spread of the height of all students (our population). We want to test whether the standard deviation of the height of female students is less than the standard deviation of the height of all students.


Data preparation

We start with data preparation.

sigma0 <- sd(students$height)
sigma0
## [1] 11.07753

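The standard deviation of the population of interest (\(\sigma_0\)) is \(\approx\) 11.08 cm.

Next, we subset the data to the female students, draw a random sample of \(n = 30\) heights, and compute the sample standard deviation. Because sample() draws at random, the sampled values differ from run to run.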

female <- subset(students, gender=='Female')

n <- 30
female.sample <- sample(female$height, n)
sample.sd <- sd(female.sample)

Further, we check the normality assumption by plotting a Q-Q plot. In R we apply the qqnorm() and the qqline() functions for plotting Q-Q plots.

par(mar = c(5,5,4,2))

# Sample data 
qqnorm(female.sample, main = 'Q-Q plot for height of\n sampled female students', cex.main = 0.9)
qqline(female.sample, col = 3, lwd = 2)