Student's t-distribution in R

R provides access to the t-distribution through the dt(), pt(), qt() and rt() functions. Call help() on any of these functions for further information.

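The rt() function generates random deviates from the t-distribution and is called as rt(n, df), where n is the number of deviates to draw and df is the number of degrees of freedom. The two arguments are independent of each other; in the examples below we nevertheless tie them together, recalling that for a t-statistic computed from a single sample the number of degrees of freedom equals the sample size minus one, that is: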

\[df = n - 1\text{.}\]

# generate n=30 random samples
n <- 30
df <- n - 1
rt(n, df)

##  [1]  0.31952261  0.06290553 -0.35871793 -1.32967415 -0.05165391 -0.90241967
##  [7] -0.54105759  0.36982695  0.30713892 -0.73883108  0.46213888  1.57162077
## [13] -0.46133279 -0.81859808  0.29733458 -1.61909097  0.60977141  2.28804954
## [19]  1.38906571  1.33296659 -0.95001223 -0.69176073  0.02499963 -1.31367313
## [25] -1.21794654 -1.29910772  0.42969864 -2.87398746 -0.41642019 -0.61221890
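Because rt() draws random numbers, every call returns different values, so the output above will not be reproduced exactly on another run. A minimal sketch of how the draw could be made reproducible, assuming a fixed seed (the seed value 42 and the names first_draw and second_draw are arbitrary choices for illustration):

```r
# Fix the state of the random number generator; 42 is an arbitrary seed
set.seed(42)
n <- 30
df <- n - 1
first_draw <- rt(n, df)

# Resetting the same seed reproduces exactly the same deviates
set.seed(42)
second_draw <- rt(n, df)

identical(first_draw, second_draw)  # returns TRUE
```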

Further, we may generate a very large number of random samples and plot them as a histogram:

# generate n=10000 random samples
n <- 10000
df <- n - 1
samples <- rt(n, df)
hist(samples, breaks = "Scott", freq = FALSE)
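To judge how well the histogram matches the theoretical distribution, the density curve can be drawn on top of it. The following is a sketch, assuming the histogram is drawn on the density scale (freq = FALSE). With df = 9999 the t-distribution is very close to the standard normal distribution, so the two overlaid curves should nearly coincide. The seed, the colours and the names grid and max_gap are choices made here for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 10000
df <- n - 1
samples <- rt(n, df)

# Histogram on the density scale so that it is comparable to dt()
hist(samples, breaks = "Scott", freq = FALSE)

# Overlay the theoretical t-density (solid) and the standard normal density (dashed)
curve(dt(x, df = df), add = TRUE, col = "red", lwd = 2)
curve(dnorm(x), add = TRUE, col = "blue", lty = 2)

# Largest gap between the two densities on a grid of points
grid <- seq(-4, 4, by = 0.01)
max_gap <- max(abs(dt(grid, df = df) - dnorm(grid)))
max_gap
```

Repeating the overlay with a small number of degrees of freedom, for example df = 5, makes the heavier tails of the t-distribution visible.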

Using the dt() function we may evaluate the probability density function, that is, the height of the t-curve above the horizontal axis at any point. For the purpose of demonstration we take a t-distribution with \(df=5\) and evaluate the density at \(t = -4,-2,0,2,4\).

x <- seq(-4, 4, by = 2)
dt(x, df = 5)

## [1] 0.005123727 0.065090310 0.379606690 0.065090310 0.005123727
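The values returned by dt() can be checked against the closed-form density of the t-distribution, \(f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1+\frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}\) with \(\nu = df\). The helper t_density() below is a hypothetical name introduced only for this sketch; it is not part of R.

```r
# Closed-form density of the t-distribution (hypothetical helper for illustration)
t_density <- function(t, df) {
  gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2)) *
    (1 + t^2 / df)^(-(df + 1) / 2)
}

x <- seq(-4, 4, by = 2)
manual <- t_density(x, df = 5)
builtin <- dt(x, df = 5)

# Both computations agree up to floating-point tolerance
all.equal(manual, builtin)
```

The symmetry of the output around \(t = 0\) follows directly from the formula, since \(t\) enters only through \(t^2\).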

Another very useful function is pt(), the cumulative distribution function. With lower.tail = TRUE (the default) it returns the area under the t-curve to the left of a given value, and with lower.tail = FALSE the area to its right. Let us calculate the area under the curve for the intervals \(j_i = (-\infty, -2], (-\infty, 0], (-\infty, 2]\) and \(k_i = [-2, \infty),[0, \infty), [2, \infty)\) for a random variable following a t-distribution with \(df=5\).

df <- 5
ji <- c(-2, 0, 2)
pt(ji, df = df, lower.tail = TRUE)

## [1] 0.05096974 0.50000000 0.94903026

df <- 5
ki <- c(-2, 0, 2)
pt(ki, df = df, lower.tail = FALSE)

## [1] 0.94903026 0.50000000 0.05096974
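Two relationships follow from these results: the upper-tail area is one minus the lower-tail area, and the probability of a bounded interval is the difference of two lower-tail areas. A short sketch that also cross-checks pt() by numerically integrating dt() with integrate(); the variable names and the tolerance are chosen here for illustration:

```r
df <- 5
q <- c(-2, 0, 2)

lower <- pt(q, df = df, lower.tail = TRUE)
upper <- pt(q, df = df, lower.tail = FALSE)

# Lower and upper tail areas sum to one
all.equal(lower + upper, rep(1, length(q)))

# Probability of falling in the interval (-2, 2]
p_interval <- pt(2, df = df) - pt(-2, df = df)
p_interval

# The same area obtained by numerically integrating the density
p_numeric <- integrate(dt, lower = -2, upper = 2, df = df)$value
all.equal(p_interval, p_numeric, tolerance = 1e-6)
```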

The qt() function evaluates the quantile function and is thus the inverse of pt(): given a probability, it returns the corresponding value of \(t\). Passing the lower-tail areas of the intervals \(j_i = (-\infty, -2], (-\infty, 0], (-\infty, 2]\) of a random variable following a t-distribution with \(df=5\) to qt() recovers the upper bounds of these intervals:

x <- seq(-2, 2, by = 2)
ji <- pt(x, df = 5, lower.tail = TRUE)
ji

## [1] 0.05096974 0.50000000 0.94903026

qt(ji, df = 5, lower.tail = TRUE)

## [1] -2  0  2

Likewise, passing the upper-tail areas of the intervals \(k_i = [-2, \infty),[0, \infty), [2, \infty)\) of a random variable following a t-distribution with \(df=5\) to qt() with lower.tail = FALSE recovers the lower bounds of these intervals:

x <- seq(-2, 2, by = 2)
ki <- pt(x, df = 5, lower.tail = FALSE)
ki

## [1] 0.94903026 0.50000000 0.05096974

qt(ki, df = 5, lower.tail = FALSE)

## [1] -2  0  2
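A common practical use of qt() is looking up the value that cuts off a given tail probability. The sketch below assumes a two-sided level of alpha = 0.05 (a value chosen here for illustration, as are the names alpha and t_crit). It also checks the round trip between qt() and pt() and the symmetry of the two tails:

```r
df <- 5
alpha <- 0.05  # illustrative level, not taken from the text

# Value with 2.5% of the probability mass to its right
t_crit <- qt(1 - alpha / 2, df = df)
t_crit

# By symmetry, the upper-tail quantile is the negative of the lower-tail quantile
all.equal(qt(alpha / 2, df = df, lower.tail = FALSE),
          -qt(alpha / 2, df = df, lower.tail = TRUE))

# Round trip: pt() undoes qt()
all.equal(pt(t_crit, df = df), 1 - alpha / 2)
```

For \(df=5\) the value of t_crit is approximately 2.57, noticeably larger than the corresponding standard normal quantile of about 1.96.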

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.