Another very important measure of central tendency is the median. The median is the value of the middle term in a data set that has been ranked in increasing order. Thus, the median divides a ranked data set into two equal parts.
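The calculation of the median consists of the following two steps:
1. Rank the data set in increasing order.
2. Find the middle term of the ranked data. The value of this term is the median.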
Note that if the number of observations in a data set is odd, then the median is given by the value of the middle term in the ranked data. However, if the number of observations is even, then the median is given by the average of the values of the two middle terms (Mann 2012).
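The odd and even cases can be sketched by hand in a few lines of R. The function name manual_median and the example vectors below are hypothetical and only serve as an illustration; in practice the built-in median() function does the same job.

```r
# Hypothetical helper for illustration only; base R already provides median().
manual_median <- function(x) {
  x_sorted <- sort(x)                     # step 1: rank in increasing order (sort() drops NAs)
  n <- length(x_sorted)
  if (n %% 2 == 1) {
    x_sorted[(n + 1) / 2]                 # odd n: value of the single middle term
  } else {
    mean(x_sorted[c(n / 2, n / 2 + 1)])   # even n: average of the two middle terms
  }
}

odd_example <- c(7, 3, 9, 1, 5)           # made-up values, 5 observations
even_example <- c(7, 3, 9, 1, 5, 11)      # made-up values, 6 observations

manual_median(odd_example)                # ranked: 1 3 5 7 9      -> 5
manual_median(even_example)               # ranked: 1 3 5 7 9 11   -> (5 + 7) / 2 = 6
```

Both results should agree with what median() returns for the same vectors.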
Let us evaluate the median for the age variable of the students data set.
students <- read.csv("https://userpage.fu-berlin.de/soga/data/raw-data/students.csv")
stud_age <- students$age # extract age vector
hist(stud_age,breaks = 30,xlim = c(min(stud_age),max(stud_age)))
By plotting the age variable we immediately see that some students are much older than the rest.
Let us calculate the median…
median(stud_age)
## [1] 21
…and compare it to the arithmetic mean.
mean(stud_age)
## [1] 22.54157
Now, for visualization we add the median and the arithmetic mean to the plot.
hist(stud_age,breaks = 30,xlim = c(min(stud_age),max(stud_age))) # plot figure
abline(
v = mean(stud_age),
col = "red",
lwd = 3
) # add vertical line at the arithmetic mean
abline(
v = median(stud_age),
col = "green",
lwd = 3
) # add vertical line at the median
legend("topright",
legend = c("Median", "Arithmetic mean"),
col = c("green", "red"),
lty = "solid"
) # add legend
As we can see, the median is hardly influenced by the outliers, whereas the arithmetic mean is pulled toward the older students. Consequently, the median is preferred over the mean as a measure of central tendency for data sets that contain outliers (Mann 2012).
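A small made-up example may help to see why. The two vectors below are hypothetical and not taken from the students data set; they differ only in the largest value, which is replaced by an extreme age.

```r
# Made-up ages for illustration; not part of the students data set.
ages <- c(19, 20, 21, 22, 23)
ages_outlier <- c(19, 20, 21, 22, 60)   # same data, oldest age replaced by an outlier

mean(ages)             # 21
median(ages)           # 21

mean(ages_outlier)     # 28.4, pulled upward by the single outlier
median(ages_outlier)   # 21, the middle term of the ranked data is unchanged
```

Because the median depends only on the middle of the ranked data, changing an extreme value leaves it untouched here, while the mean shifts by more than seven years.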
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.