From the three quartiles (\(Q1, Q2, Q3\)), we can obtain a measure of center (the median, \(Q2\)) and measures of variation of the two middle quarters of the data, \(Q2 - Q1\) for the second quarter and \(Q3 - Q2\) for the third quarter. But the three quartiles do not tell us anything about the variation of the first and fourth quarters.

To gain that information, we include the minimum and maximum observations as well. The variation of the first quarter can be measured as the difference between the minimum and the first quartile, \(Q1 - Min\), and the variation of the fourth quarter can be measured as the difference between the third quartile and the maximum, \(Max - Q3\). Thus the minimum, maximum, and quartiles together provide, among other things, information on center and variation (Weiss 2010).
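As a minimal sketch of this idea, the spread of each quarter can be obtained by differencing the minimum, the three quartiles, and the maximum. The vector `x` below is a small hypothetical sample made up for illustration (it is not part of the students data), and the name `quarter_spread` is likewise an assumption.

```r
# Hypothetical sample, sorted for readability (not taken from the students data)
x <- c(1, 2, 4, 7, 11, 16, 22, 29, 37)

# Minimum, three quartiles and maximum via base R's quantile()
q <- unname(quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1)))

# Spread of each quarter: Q1 - Min, Q2 - Q1, Q3 - Q2, Max - Q3
quarter_spread <- setNames(diff(q), c("first", "second", "third", "fourth"))
quarter_spread
```

For this sample the four spreads are 3, 7, 11, and 15: each quarter is more spread out than the one before it, which indicates that the data are skewed to the right.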

The so-called Tukey five-number summary (named after the mathematician John Wilder Tukey) of a data set consists of the \(Min\), \(Q1\), \(Q2\), \(Q3\), and \(Max\) of the data set.

The five-number summary is easily calculated in R with the built-in fivenum() function. For demonstration purposes, we calculate the five-number summary of the nc.score variable in the students data set:

students <- read.csv("https://userpage.fu-berlin.de/soga/200/2010_data_sets/students.csv")
nc.score <- students$nc.score
fivenum(nc.score)
## [1] 1.00 1.46 2.04 2.78 4.00

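This function returns the minimum, lower hinge, median, upper hinge, and maximum of the input data. The hinges are Tukey's versions of the first and third quartiles; they can differ slightly from the quartiles returned by quantile() when the number of observations is even.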
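The difference between hinges and quartiles can be illustrated with a small example. The sketch below uses a hypothetical even-sized sample (the integers 1 to 10, not the students data) and compares the hinges from fivenum() with the quartiles from quantile() under its default method.

```r
# Hypothetical even-sized sample (not from the students data)
x <- 1:10

# Tukey's hinges: elements 2 and 4 of the five-number summary
hinges <- fivenum(x)[c(2, 4)]

# Quartiles from quantile() with its default method (type = 7)
quartiles <- unname(quantile(x, probs = c(0.25, 0.75)))

hinges     # 3 8
quartiles  # 3.25 7.75
```

For large data sets such as nc.score, the two approaches usually give nearly identical values, so the distinction matters mostly for small samples.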

R also provides a similar function, summary(). When called on a numeric vector, it returns the same kind of statistics and additionally includes the arithmetic mean.

summary(nc.score)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.460   2.040   2.166   2.780   4.000
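Because summary() returns a named object, individual statistics can be extracted by name. The following is a minimal sketch on a hypothetical vector (the integers 1 to 10, not the students data); the variable names `s`, `x_mean`, and `x_iqr` are assumptions chosen for illustration.

```r
# Hypothetical sample (not from the students data)
x <- 1:10

# summary() on a numeric vector returns six named values
s <- summary(x)
names(s)

# Extract single statistics by name
x_mean <- s[["Mean"]]                     # arithmetic mean
x_iqr  <- s[["3rd Qu."]] - s[["1st Qu."]]  # spread of the middle half

x_mean  # 5.5
x_iqr   # 4.5
```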