From the three quartiles (\(Q1, Q2, Q3\)) we can obtain a measure of center (the median, \(Q2\)) and measures of variation of the two middle quarters of the data, \(Q2 - Q1\) for the second quarter and \(Q3 - Q2\) for the third quarter. But the three quartiles do not tell us anything about the variation of the first and fourth quarters.
To gain that information, we include the minimum and maximum observations as well. The variation of the first quarter can be measured as the difference between the minimum and the first quartile, \(Q1 - Min\). The variation of the fourth quarter can be measured as the difference between the third quartile and the maximum, \(Max - Q3\). Thus, the minimum, maximum and quartiles together provide, among other things, information on center and variation (Weiss 2010).
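The four quarter-wise differences described above can be sketched in a few lines of R. This is a minimal illustration on a hypothetical toy vector `x` (not the students data), and it assumes the default interpolation rule of `quantile()` (type 7); other quantile definitions may give slightly different values.

```r
# Hypothetical toy data, not taken from the students data set
x <- c(1.0, 1.3, 1.7, 2.0, 2.4, 3.1, 4.0)

# Min, Q1, Q2, Q3, Max (quantile() with its default rule, type 7)
five <- unname(quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1)))

# Differences between neighbouring values give the spread of each quarter
quarter_spread <- diff(five)
names(quarter_spread) <- c("Q1 - Min", "Q2 - Q1", "Q3 - Q2", "Max - Q3")
quarter_spread
```

The four spreads always add up to the range, \(Max - Min\). Comparing them gives a first impression of where the data are concentrated and where they are stretched out.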
The so-called Tukey five-number summary (after the mathematician John Wilder Tukey) of a data set consists of the \(Min\), \(Q1\), \(Q2\), \(Q3\) and \(Max\) of the data set.
The five-number summary is easily calculated by applying the in-built fivenum() function in R. For demonstration purposes we calculate the five-number summary for the nc_score variable of the students data set.
students <- read.csv("https://userpage.fu-berlin.de/soga/data/raw-data/students.csv")
nc_score <- students$nc.score
fivenum(nc_score)
## [1] 1.00 1.46 2.04 2.78 4.00
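The function returns the minimum, lower hinge, median, upper hinge and maximum of the input data. The hinges are Tukey's versions of the first and third quartiles; depending on the number of observations, they can differ slightly from the quartiles returned by quantile().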
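To make the notion of hinges more concrete, the following sketch compares `fivenum()` with `quantile()` on a small, hypothetical vector with an even number of observations. It assumes the default settings of `quantile()` (type 7); the vector `x` is an illustration only and is not part of the students data.

```r
# Hypothetical toy data with an even number of observations
x <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Tukey: minimum, lower hinge, median, upper hinge, maximum
hinges <- fivenum(x)

# Default quantile rule (type 7): minimum, Q1, median, Q3, maximum
quartiles <- unname(quantile(x))

# Side-by-side comparison of both results
rbind(fivenum = hinges, quantile = quartiles)
```

For this vector the lower and upper hinges are 2.5 and 6.5, whereas `quantile()` yields 2.75 and 6.25. Minimum, median and maximum agree in both cases. For larger data sets the difference is usually negligible.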
R also provides a similar function called summary(), which, when called on a numeric vector, returns comparable statistics and additionally includes the arithmetic mean.
summary(nc_score)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 1.460 2.040 2.166 2.780 4.000
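The individual statistics returned by `summary()` can be accessed by name, which is convenient for further calculations. The sketch below again uses a hypothetical toy vector `x`, not the students data, and compares the mean with the median.

```r
# Hypothetical toy data, not taken from the students data set
x <- c(1.0, 1.3, 1.7, 2.0, 2.4, 3.1, 4.0)

s <- summary(x)

# Elements are named "Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max."
s_mean   <- s[["Mean"]]
s_median <- s[["Median"]]

# A mean above the median hints at a longer upper tail
s_mean > s_median
```

Here the mean exceeds the median, which is consistent with a few comparatively large values pulling the mean upwards.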
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.