The main functions for the $F$-distribution in the scipy.stats package are f.rvs(), f.pdf(), f.cdf() and f.ppf(). The f.pdf() function gives the density, f.cdf() gives the cumulative distribution function, f.ppf() gives the quantile function (the percent point function, which is the inverse of the cdf), and f.rvs() generates random deviates.
We use f.pdf(x, dfn, dfd) to calculate the density at the value 1.2 of an $F$-curve with $dfn=10$ and $dfd=20$.
# First, let's import all the needed libraries.
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
stats.f.pdf(1.2, dfn=10, dfd=20)
0.5626124566227062
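To make clear what f.pdf() evaluates, the following minimal sketch computes the density from the closed-form expression $f(x) = \sqrt{\frac{(d_1 x)^{d_1} d_2^{d_2}}{(d_1 x + d_2)^{d_1 + d_2}}} \Big/ \left(x \, B(d_1/2, d_2/2)\right)$ and compares it with the library result. The helper name f_density is an assumption made for illustration only; it is not part of scipy.

```python
import numpy as np
from scipy import special, stats


def f_density(x, dfn, dfd):
    """Closed-form F density for x > 0 (hypothetical helper, not a scipy function)."""
    dfn, dfd = float(dfn), float(dfd)  # avoid huge integer powers
    root = np.sqrt((dfn * x) ** dfn * dfd**dfd / (dfn * x + dfd) ** (dfn + dfd))
    return root / (x * special.beta(dfn / 2, dfd / 2))


manual = f_density(1.2, dfn=10, dfd=20)
library = stats.f.pdf(1.2, dfn=10, dfd=20)

# Both values should agree up to floating-point error.
print(manual, library)
print(np.isclose(manual, library))
```

The two numbers should coincide, which suggests that f.pdf() is simply an evaluation of this formula at the given point.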
Next, we use f.cdf() to calculate the area under the curve for the interval $[0,1.5]$ and the interval $[1.5, +\infty)$ of an $F$-curve with $dfn=10$ and $dfd=20$. Further, we ask Python whether the areas of the two intervals sum up to 1:
x = 1.5
dfn = 10
dfd = 20
print(round(stats.f.cdf(x, dfn, dfd), 3)) ## lower tail of the distribution --> [0,1.5]
0.789
print(round(1 - stats.f.cdf(x, dfn, dfd), 3)) ## upper tail of the distribution --> [1.5, + infinity)
0.211
print(np.isclose((1 - stats.f.cdf(x, dfn, dfd)) + stats.f.cdf(x, dfn, dfd), 1))
True
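The relation between density and area can be checked numerically. The sketch below integrates f.pdf() over $[0, 1.5]$ with scipy.integrate.quad and compares the result with f.cdf(); it also uses the survival function f.sf() as a direct route to the upper tail. The variable names are chosen here for illustration.

```python
from scipy import integrate, stats

dfn, dfd = 10, 20

# Area under the density over [0, 1.5], obtained by numerical integration.
area_lower, _ = integrate.quad(stats.f.pdf, 0, 1.5, args=(dfn, dfd))

# The same area from the cumulative distribution function.
cdf_lower = stats.f.cdf(1.5, dfn, dfd)

# Upper tail [1.5, +infinity) from the survival function, sf = 1 - cdf.
sf_upper = stats.f.sf(1.5, dfn, dfd)

print(area_lower, cdf_lower, sf_upper)
```

For probabilities far out in the tail, f.sf() is usually preferable to computing 1 - f.cdf() by hand, because it avoids subtracting two nearly equal numbers.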
We use f.ppf() to calculate the quantile for a given area (= probability) under an $F$-curve with $dfn=10$ and $dfd=20$, for the probabilities $q = 0.25, 0.5, 0.75$ and $0.999$. Because f.ppf() expects the lower-tail probability, that is, the area over the interval $[0, x]$, we pass $q$ directly and do not take $1 - q$.
q = [0.25, 0.5, 0.75, 0.999]
dfn = 10
dfd = 20
stats.f.ppf(q, dfn, dfd) ## quantiles for all four probabilities at once
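Since the quantile function is the inverse of the distribution function, feeding the quantiles returned by f.ppf() back into f.cdf() should recover the original probabilities. A minimal sketch of this round trip, with variable names chosen for illustration:

```python
import numpy as np
from scipy import stats

probs = np.array([0.25, 0.5, 0.75, 0.999])

# Quantiles that cut off the given lower-tail probabilities.
quantiles = stats.f.ppf(probs, dfn=10, dfd=20)

# Applying the cdf to the quantiles should return the probabilities.
recovered = stats.f.cdf(quantiles, dfn=10, dfd=20)

print(quantiles)
print(np.allclose(recovered, probs))
```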
We use the f.rvs() function to generate 100,000 random values from the $F$-distribution with $v_1=10$ (dfn) and $v_2=20$ (dfd) degrees of freedom. We then plot a histogram and compare it to the probability density function of the same $F$-distribution (orange line).
rand_f_samples = stats.f.rvs(dfn=10, dfd=20, size=100000)
plt.figure(figsize=(10, 5))
plt.hist(
    rand_f_samples,
    density=True,
    color="lightgrey",
    edgecolor="darkgrey",
    bins="scott",
)
plt.plot(
    np.arange(0, 4, 0.1),
    stats.f.pdf(np.arange(0, 4, 0.1), dfn=10, dfd=20),
    "-",
    linewidth=2,
    color="orange",
)
plt.xlim(0, 4)
plt.title(
    "Histogram for an $F$-distribution with $v_1 = 10$ and $v_2 = 20$ degrees of freedom (df)"
)
plt.show()
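Besides the visual comparison, the sample can be checked against a known moment. For $v_2 > 2$ the mean of the $F$-distribution is $v_2 / (v_2 - 2)$, which is $20/18 \approx 1.11$ here. The sketch below assumes a fixed seed (random_state=42, an arbitrary choice) so that the draw is reproducible.

```python
import numpy as np
from scipy import stats

dfn, dfd = 10, 20

# Reproducible draw of 100,000 values; the seed is an arbitrary assumption.
samples = stats.f.rvs(dfn=dfn, dfd=dfd, size=100_000, random_state=42)

sample_mean = samples.mean()
theoretical_mean = stats.f.mean(dfn, dfd)  # equals dfd / (dfd - 2) for dfd > 2

print(sample_mean, theoretical_mean)
```

With a sample of this size, the two means should differ only in the second or third decimal place.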
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.