The range as a measure of dispersion is simple to calculate. It is obtained by taking the difference between the largest and the smallest values in a data set.

\[ \text{Range} = \text{Largest value} - \text{Smallest value} \]
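As a small worked illustration of the formula, assume a made-up data set with the five values 4, 9, 2, 7 and 11 (these numbers are hypothetical and not taken from the students data set). The largest value is 11 and the smallest is 2, so

\[ \text{Range} = 11 - 2 = 9 \]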

Let us consider the students data set. We subset the data frame to keep only the numerical variables.

students <- read.csv("https://userpage.fu-berlin.de/soga/200/2010_data_sets/students.csv")
quant.vars <- c("age", "nc.score", "height", "weight")
students.quant <- students[quant.vars]
head(students.quant, 10)
##    age nc.score height weight
## 1   19     1.91    160   64.8
## 2   19     1.56    172   73.0
## 3   22     1.24    168   70.6
## 4   19     1.37    183   79.7
## 5   21     1.46    175   71.4
## 6   19     1.34    189   85.8
## 7   21     1.11    156   65.9
## 8   21     2.03    167   65.7
## 9   18     1.29    195   94.4
## 10  18     1.19    165   66.0

We use the range function, which returns a vector containing the minimum and maximum of all its arguments, in combination with the apply function to calculate the minimum and maximum of each variable (that is, each column) of the data set.

apply(students.quant, 2, range)
##      age nc.score height weight
## [1,]  18        1    135   51.4
## [2,]  64        4    206  116.0

Now, to calculate the range of each variable, we just have to subtract the first row (the minima) from the second row (the maxima).

# Row 1 holds the minimum and row 2 the maximum of each column
range.studs <- apply(students.quant, 2, range)
range.studs[2, ] - range.studs[1, ]
##      age nc.score   height   weight 
##     46.0      3.0     71.0     64.6
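The two steps can also be combined into one call. The following is a minimal sketch, not the approach used in this section: it assumes that wrapping range in diff inside sapply is acceptable. To stay runnable without a download, it rebuilds only the ten rows printed by head above into a data frame called first10 (a name introduced here for illustration), so the resulting ranges describe those ten rows and differ from the ranges of the full data set.

```r
# First ten rows of the numerical variables, copied from the head() output above
first10 <- data.frame(
  age      = c(19, 19, 22, 19, 21, 19, 21, 21, 18, 18),
  nc.score = c(1.91, 1.56, 1.24, 1.37, 1.46, 1.34, 1.11, 2.03, 1.29, 1.19),
  height   = c(160, 172, 168, 183, 175, 189, 156, 167, 195, 165),
  weight   = c(64.8, 73.0, 70.6, 79.7, 71.4, 85.8, 65.9, 65.7, 94.4, 66.0)
)

# range() returns c(min, max); diff() subtracts the first element from the second
ranges10 <- sapply(first10, function(x) diff(range(x)))
ranges10
```

Applied to the full data frame, the same expression with students.quant in place of first10 should reproduce the ranges shown above.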

The range, like the mean, has the disadvantage of being influenced by outliers. Consequently, the range is not a good measure of dispersion to use for a data set that contains outliers. Another disadvantage of using the range as a measure of dispersion is that its calculation is based on two values only: the largest and the smallest. All other values in a data set are ignored when calculating the range. Thus, the range is not a very satisfactory measure of dispersion (Mann 2012).
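Both disadvantages can be illustrated with a short sketch. All values below are hypothetical and chosen only for illustration; they are not taken from the students data set.

```r
# Hypothetical ages, not taken from the students data set
ages <- c(18, 19, 19, 21, 22)
range.ages <- diff(range(ages))       # range without an extreme value

# Adding a single extreme value stretches the range considerably
ages.out <- c(ages, 64)
range.ages.out <- diff(range(ages.out))

# Two samples with the same extremes but different interior values
a <- c(1, 5, 5, 5, 9)
b <- c(1, 1, 5, 9, 9)
same.range <- diff(range(a)) == diff(range(b))

c(range.ages, range.ages.out)
same.range
```

One additional observation raises the range from 4 to 46, and the samples a and b have the same range of 8 although their values are spread quite differently between the extremes.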