The range as a measure of dispersion is simple to calculate. It is obtained by taking the difference between the largest and the smallest value of a data set.
\[ \text{Range} = \text{Largest value} - \text{Smallest value} \]
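As a quick illustration of the formula, the following minimal sketch computes the range of a small vector. The vector `x` is a hypothetical toy example and is not part of the students data set.

```r
# Hypothetical toy data, not taken from the students data set
x <- c(3, 7, 1, 9, 4)

# Range as a single number: largest value minus smallest value
range_x <- max(x) - min(x)
range_x

# Equivalent: range() returns c(min, max) and diff() subtracts the two
diff(range(x))
```

Both expressions give the same number; `diff(range(x))` is simply a compact way of writing the subtraction.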
Let us consider our students data set. We subset the data frame to include numerical data only.
students <- read.csv("https://userpage.fu-berlin.de/soga/data/raw-data/students.csv")
quant_vars <- c("age", "nc.score", "height", "weight")
students_quant <- students[quant_vars]
head(students_quant, 10)
##    age nc.score height weight
## 1   19     1.91    160   64.8
## 2   19     1.56    172   73.0
## 3   22     1.24    168   70.6
## 4   19     1.37    183   79.7
## 5   21     1.46    175   71.4
## 6   19     1.34    189   85.8
## 7   21     1.11    156   65.9
## 8   21     2.03    167   65.7
## 9   18     1.29    195   94.4
## 10  18     1.19    165   66.0
We use the range function, which returns a vector containing the minimum and maximum of all the given arguments, in combination with the apply function to calculate the minimum and maximum of each variable, that is, of each column of the data set.
# MARGIN = 2 applies range() to each column; row 1 holds the minima, row 2 the maxima
range_studs <- apply(students_quant, 2, range)
range_studs
##      age nc.score height weight
## [1,]  18        1    135   51.4
## [2,]  64        4    206  116.0
Now, to calculate the range for each variable, we subtract the first row (the minima) from the second row (the maxima).
range_studs[2, ] - range_studs[1, ]
##      age nc.score   height   weight
##     46.0      3.0     71.0     64.6
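The two steps above can also be combined into a single call. The sketch below is one possible alternative, not the approach used in this tutorial; it uses a small hypothetical data frame `toy` in place of the students data so that it runs on its own.

```r
# Hypothetical toy data frame standing in for students_quant
toy <- data.frame(
  age    = c(19, 22, 18, 25),
  height = c(160, 172, 195, 168)
)

# diff(range(x)) gives maximum minus minimum for one column;
# sapply() applies it to every column and returns a named vector
ranges <- sapply(toy, function(x) diff(range(x)))
ranges
```

Because `sapply` works column by column on the data frame, it avoids the conversion to a matrix that `apply` performs. If a column contained missing values, `range(x, na.rm = TRUE)` could be used inside the function instead.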
The range, like the mean, has the disadvantage of being influenced by outliers. Consequently, the range is not a good measure of dispersion to use for a data set that contains outliers. Another disadvantage of using the range as a measure of dispersion is that its calculation is based on two values only: the largest and the smallest. All other values in a data set are ignored when calculating the range. Thus, the range is not a very satisfactory measure of dispersion (Mann 2012).
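Both disadvantages can be seen in a small sketch. All values below are hypothetical and chosen for illustration only; they do not come from the students data set.

```r
# Hypothetical values chosen for illustration only
a <- c(10, 11, 12, 12, 13)
b <- c(a, 95)                      # same data plus one outlier

range_a <- diff(range(a))          # 13 - 10
range_b <- diff(range(b))          # 95 - 10, driven entirely by the outlier

# Two samples with identical extremes but different values in between
tight  <- c(0, 50, 50, 50, 100)
spread <- c(0, 10, 50, 90, 100)

range_tight  <- diff(range(tight))
range_spread <- diff(range(spread))

c(range_a = range_a, range_b = range_b,
  range_tight = range_tight, range_spread = range_spread)
```

A single extreme observation changes the range from 3 to 85, while `tight` and `spread` share the same range of 100 although their interior values are distributed quite differently.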
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.