**Quartiles** divide a ranked data set into **four equal parts**. The three values that separate these parts are the **first quartile (denoted by \(Q1\))**, the **second quartile (denoted by \(Q2\))** and the **third quartile (denoted by \(Q3\))**. The second quartile is the same as the median of the data set. The first quartile is the value of the middle term among the observations that are less than the median, and the third quartile is the value of the middle term among the observations that are greater than the median (Mann 2012).
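
The definition above can be traced by hand on a small example. The following is a minimal sketch, assuming a made-up vector `x` of eight values (it is not part of any data set used in this section). It takes the median first and then the middle term of the observations on each side of it.

```r
# hypothetical toy data, chosen only for illustration
x <- sort(c(2, 4, 6, 8, 10, 12, 14, 16))

q2 <- median(x)          # second quartile = median
q1 <- median(x[x < q2])  # middle term of the observations below the median
q3 <- median(x[x > q2])  # middle term of the observations above the median

c(Q1 = q1, Q2 = q2, Q3 = q3)
```

For these values the median lies between 8 and 10, so the lower half is 2, 4, 6, 8 and the upper half is 10, 12, 14, 16.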

Approximately 25 % of the values in a ranked data set are less than \(Q1\) and about 75 % are greater than \(Q1\). The second quartile, \(Q2\), divides a ranked data set into two equal parts; hence, the second quartile and the median are the same. Approximately 75 % of the data values are less than \(Q3\) and about 25 % are greater than \(Q3\). The difference between the third quartile and the first quartile of a data set is called the **interquartile range (\(IQR\))** (Mann 2012).

\[ IQR = Q3-Q1\]
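
As a quick check of these proportions, here is a minimal sketch on a synthetic vector, the integers 1 to 100 (an assumption made purely for illustration). It computes the share of values below \(Q1\) and below \(Q3\), and the \(IQR\) directly from its definition.

```r
# synthetic data: the integers 1 to 100 (illustrative assumption)
x <- 1:100

# first and third quartile, returned as a plain numeric vector
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)

share_below_q1 <- mean(x < q[1])  # proportion of values below Q1
share_below_q3 <- mean(x < q[2])  # proportion of values below Q3
iqr <- q[2] - q[1]                # IQR = Q3 - Q1

c(share_below_q1, share_below_q3, iqr)
```

On real data the shares are only approximately 25 % and 75 %, because ties and the sample size rarely allow an exact split.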

Let us switch to R and test its functionality for computing quantiles/quartiles. We will use the `nc.score` variable of the `students` data set to calculate quartiles and the \(IQR\). The `nc.score` variable corresponds to the Numerus Clausus score of each particular student.

First, we subset the data and plot a histogram to further inspect the variable’s distribution.

```
students <- read.csv("https://userpage.fu-berlin.de/soga/data/raw-data/students.csv")
nc_score <- students$nc.score
hist(nc_score, breaks = "sturges")
```

To calculate the quartiles for the `nc_score` variable, we apply the function `quantile()`. If you call the `help()` function on `quantile()`, you see that the default values for the argument `probs` are set to 0, 0.25, 0.5, 0.75 and 1. Thus, in order to calculate the quartiles for the `nc_score` variable we just write:

`quantile(nc_score)`

```
## 0% 25% 50% 75% 100%
## 1.00 1.46 2.04 2.78 4.00
```

which gives the same result as:

`quantile(nc_score, probs = c(0, 0.25, 0.5, 0.75, 1))`

```
## 0% 25% 50% 75% 100%
## 1.00 1.46 2.04 2.78 4.00
```

Note: Not all statisticians define quartiles in exactly the same way. For a detailed discussion of the different methods for computing quartiles, see the online article “Quartiles in Elementary Statistics” by E. Langford (2006). In addition, you may find the `help(quantile)` page and its description of the `type` argument helpful.
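
To make the effect of the `type` argument concrete, the following minimal sketch computes the first quartile of a small made-up vector (an assumption for illustration, not the `students` data) with three of the available definitions. `type = 7` is the default of `quantile()`. Types 2 and 6 are picked here only as examples of alternatives.

```r
# hypothetical toy data, chosen only for illustration
x <- c(2, 4, 6, 8, 10, 12, 14, 16)

# first quartile under three different quantile definitions
types <- c(2, 6, 7)
q1_by_type <- sapply(types, function(k) {
  quantile(x, probs = 0.25, type = k, names = FALSE)
})
names(q1_by_type) <- paste0("type", types)

q1_by_type
```

The three results differ slightly for this vector. On large data sets such differences are usually small.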

In order to calculate the \(IQR\) for the `nc_score` variable we either write…

```
nc_score_quart <- quantile(nc_score, names = FALSE)
nc_score_quart[4] - nc_score_quart[2]
```

`## [1] 1.32`

…or we apply the built-in function `IQR()`:

`IQR(nc_score)`

`## [1] 1.32`

We can visualize the partitioning of the `nc_score` variable into quartiles by plotting a histogram and by adding a couple of additional lines of code.

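```
# compute the histogram without drawing it
h <- hist(nc_score, breaks = 50, plot = FALSE)
# assign each bin break to a quartile interval;
# the leading 0 adds an interval that holds breaks at or below the minimum
cuts <- cut(h$breaks, c(0, nc_score_quart))
plot(h,
  col = c(4, 4, 3, 2, 1)[cuts],
  main = "Quartiles",
  xlab = "Numerus Clausus score"
)
# add legend
legend("topright",
  legend = c("1st", "2nd", "3rd", "4th"),
  col = c(4, 3, 2, 1),
  pch = 15
)
```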
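
The partitioning can also be checked numerically rather than visually. The following is a minimal sketch on a synthetic vector, the integers 1 to 100 (an assumption for illustration). It uses `cut()` with the quartiles as breaks and counts the observations in each group. The group labels are hypothetical names chosen to match the legend above.

```r
# synthetic data: the integers 1 to 100 (illustrative assumption)
x <- 1:100

# assign every observation to one of the four quartile groups;
# include.lowest = TRUE keeps the minimum in the first group
quartile_group <- cut(x,
  breaks = quantile(x),
  include.lowest = TRUE,
  labels = c("1st", "2nd", "3rd", "4th")
)

counts <- table(quartile_group)
counts
```

With real data the groups are only approximately equal in size, and `cut()` stops with an error if ties make two quartiles identical, since the breaks must be unique.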

**Citation**

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: *Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.*