In statistics, the **mode** represents the most common value in a data set.
In other words, the mode is the value that occurs with the highest frequency in a data set (Mann 2012).
In a plot of the frequency distribution, the mode corresponds to the peak(s) of the graph.

A major shortcoming of the mode is that a data set may have none or may have more than one mode, whereas it will have only one mean and only one median.
For instance, a data set with each value occurring only once has no mode.
A data set with only one value occurring with the highest frequency has one mode.
The data set is called **unimodal** in this case.
A data set with two values that occur most frequently has two modes. The distribution, in this case, is said to be **bimodal**. If more than two values in a data set occur most frequently, then the data set contains more than two modes and it is said to be **multimodal** (Mann 2012).
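
The distinction between no mode, one mode and several modes can be made concrete with a simple frequency count. The following is a minimal sketch: the helper name `classify_modes` and the sample lists are illustrative assumptions and not part of the original material. It follows the convention stated above that a data set in which every value occurs only once has no mode.

```python
from collections import Counter


def classify_modes(data):
    """Return (modes, label) for a list of hashable values.

    Hypothetical helper, for illustration only.
    """
    if not data:
        return [], "no mode"
    counts = Counter(data)
    top = max(counts.values())
    # Convention from the text: if no value repeats, there is no mode
    if top == 1:
        return [], "no mode"
    # Collect every value that reaches the highest frequency
    modes = [value for value, count in counts.items() if count == top]
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label


print(classify_modes([1, 2, 3, 4]))           # every value occurs once
print(classify_modes([1, 2, 2, 3]))           # one most frequent value
print(classify_modes([1, 1, 2, 3, 3]))        # two most frequent values
print(classify_modes([1, 1, 2, 2, 3, 3, 4]))  # three most frequent values
```

Counting frequencies first and then comparing against the maximum keeps the three cases separate, which a single call that returns only one value cannot do.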

In [2]:

```
# First, let's import all the needed libraries.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import random
import statistics as stats
```

In [3]:

```
# Create a unimodal, a bimodal and a multimodal distribution
def modalDistFunc(m, n, mu, sig):
    """Draw a sample from a mixture of m normal distributions.

    m   int: number of modes
    n   int: number of random values drawn per mode
    mu  sequence: mean of each mode
    sig sequence: standard deviation of each mode
    """
    # Start from an empty array so that no artificial value enters the sample
    out = np.array([])
    for idx in range(m):
        dist = np.random.normal(mu[idx], sig[idx], n)
        out = np.concatenate([out, dist])
    return out
```

In [6]:

```
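### parameters ###
# NOTE: n, mu, sig, breaks and h_uni are not defined in the cells shown here.
# The values below are illustrative assumptions, chosen so that the
# hard-coded arrow heights further down roughly match the peaks.
n = 10000  # random values drawn per mode
mu = [0, 4, 10, 16]  # mean of each mode
sig = [1, 1.3, 2.2, 2.2]  # standard deviation of each mode
breaks = 40  # number of histogram bins

### generate plot ###
fig, axs = plt.subplots(1, 3, figsize=(15, 5))
## arrow style ##
col = "red"
wd = 0.05
hln = 0.02
## unimodal
m = 1
x_uni = modalDistFunc(m, n, mu, sig)
# h_uni[0] holds the bin densities, h_uni[1] the bin edges
h_uni = np.histogram(x_uni, bins=breaks, density=True)
peak = max(h_uni[0])
axs[0].hist(x_uni, bins=breaks, density=1, color="black")
axs[0].title.set_text("unimodal")
# arrow(x, y, dx, dy): vertical arrow (dx = 0) pointing down at the peak
axs[0].arrow(
    mu[0],
    peak * 1.15,
    0,
    peak * -0.1,
    length_includes_head=True,
    width=wd,
    color=col,
    head_length=hln,
    head_width=6 * wd,
)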
axs[0].axis("off")
axs[0].text(0.7, 0.4, "one mode", style="italic", color="red")
# bimodal
m = 2
axs[1].hist(modalDistFunc(m, n, mu, sig), bins=breaks, density=1, color="black")
axs[1].title.set_text("bimodal")
# one downward arrow per mode: (x position, y start)
for x, y in [(mu[0], 0.26), (mu[1], 0.2)]:
    axs[1].arrow(
        x,
        y,
        0,
        -0.045,
        length_includes_head=True,
        width=wd,
        color=col,
        head_length=hln,
        head_width=6 * wd,
    )
axs[1].axis("off")
axs[1].text(6, 0.2, "two modes", style="italic", color="red")
# multimodal
m = 4
axs[2].hist(modalDistFunc(m, n, mu, sig), bins=breaks, density=1, color="black")
axs[2].title.set_text("multimodal")
# one downward arrow per mode: (x position, y start)
for x, y in [(mu[0], 0.15), (mu[1], 0.11), (mu[2], 0.07), (mu[3], 0.07)]:
    axs[2].arrow(
        x,
        y,
        0,
        -0.025,
        length_includes_head=True,
        width=wd,
        color=col,
        head_length=0.015,
        head_width=16 * wd,
    )
axs[2].axis("off")
axs[2].text(6, 0.1, "more than two modes", style="italic", color="red")
plt.show()
```

Unlike the mean and the median, the mode can be applied to both quantitative (numeric) and qualitative (categorical) data.
The Python library `statistics` provides the `mode` function to calculate the mode. This function takes a sequence of values as input and returns the mode as output.
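
As a brief illustration of the point that the mode works for numeric and categorical data alike, the sketch below calls `statistics.mode` on a few small made-up lists (the values are illustrative assumptions). It assumes Python 3.8 or later, where `mode` returns the first of several equally frequent values instead of raising an error; `statistics.multimode`, also available from Python 3.8, returns all of them.

```python
import statistics as stats

ages = [21, 22, 21, 23, 21, 22]                          # quantitative data
religions = ["Catholic", "Other", "Catholic", "Muslim"]  # qualitative data
tied = [1, 1, 2, 2, 3]                                   # two values tie

age_mode = stats.mode(ages)
religion_mode = stats.mode(religions)
first_of_tie = stats.mode(tied)   # only the first tied value is returned
all_tied = stats.multimode(tied)  # every tied value is returned

print(age_mode, religion_mode, first_of_tie, all_tied)
```

When a variable may be bimodal or multimodal, `multimode` is the safer choice, because `mode` alone gives no hint that a tie occurred.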

You can now test this function on the `students` data set.
Use the very handy `apply` method of a pandas data frame to apply the function `mode` to every variable of interest.
If you struggle with `apply`, you may type `help(pd.DataFrame.apply)` or look up the pandas documentation, which includes examples.

In [7]:

```
# Load the students data set
students = pd.read_csv(
    "https://userpage.fu-berlin.de/soga/200/2010_data_sets/students.csv"
)
# Variables of interest; the mode is computed column by column
vars = ["gender", "age", "religion", "nc.score", "semester", "height", "weight"]
students[vars].apply(stats.mode)
```

Out[7]:

    gender          Male
    age               21
    religion    Catholic
    nc.score        1.18
    semester         1st
    height           174
    weight          67.1
    dtype: object
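
To see what `apply` does in this call, it can help to run the same pattern on a tiny data frame. The sketch below uses a made-up three-column frame (its name `toy` and its values are assumptions, not taken from the `students` data) and assumes Python 3.8 or later. `DataFrame.apply` passes each column to `stats.mode` and collects one result per column; pandas also offers its own `DataFrame.mode` method, which keeps every tied value, one per row.

```python
import statistics as stats

import pandas as pd

# Hypothetical toy data, for illustration only
toy = pd.DataFrame(
    {
        "gender": ["Male", "Female", "Male", "Male"],
        "age": [21, 22, 21, 23],
        "semester": ["1st", "2nd", "2nd", "1st"],  # two values tie here
    }
)

# One mode per column; for ties, stats.mode keeps the first value encountered
per_column = toy.apply(stats.mode)
print(per_column)

# pandas' own method returns all tied values and pads shorter columns with NaN
all_modes = toy.mode()
print(all_modes)
```

The two approaches agree whenever a column has a single mode; they differ only in how ties are reported.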

**Citation**

The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: *Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.*