The binomial distribution is the probability distribution for the number of successes in a sequence of Bernoulli trials (Weiss 2010).
In \(n\) Bernoulli trials, the number of outcomes that contain exactly \(x\) successes equals the binomial coefficient \({n \choose x}\) (Weiss 2010).
Let \(X\) denote the total number of successes in \(n\) Bernoulli trials with success probability \(p\). Then the probability distribution of the random variable \(X\) is given by:
\[ P(X = x) = {n \choose x}p^x(1 - p)^{n-x}, \qquad x = 0, 1, 2, . . . , n\]
The random variable \(X\) is called a binomial random variable and is said to have the binomial distribution with parameters \(n\) and \(p\) (Weiss 2010).
Let us consider an example taken form the real world.
Long term statistics tell us, that the chance for passing the final statistics exam is 0.3. Yes, approximately 70% of students fail in their statistic exam! By the way, if you complete this e-learning module your chances of passing the final exam will definitely increase. :-)
Let us consider a class of 25 students. What is the chance that exactly 3 students of that particular class will pass the final statistics exam? Or written more formally \(P(X = 3)\). Once again we start by implementing a naive approach in R.
n <- 25 # number of students
p <- 0.3 # probability of success
k <- 3 # exactly 3 students will pass the exam
choose(n,k)*p^k*(1-p)^(n-k)
## [1] 0.02427999
Wow, the probability that exactly 3 out of 25 students \((P(X = 3))\) will pass the final statistics exam is quite low. What about the probability that exactly 15 out of 25 students \((P(X = 15))\) will pass the final statistics exam. We turn to R.
n <- 25 # number of students
p <- 0.3 # probability of success
k <- 15 # exactly 15 students will pass the exam
choose(n,k)*p^k*(1-p)^(n-k)
## [1] 0.001324897
The probability of \(P(X = 15)\) is approximately 0.1%. We can continue with our experiments to find out all the probabilities of exactly one outcome for \(k = 0,1,2,...,n\). Please note that for our particular example it is not very informative to know the probability to pass the exam of exactly one particular number (\(k\)) of students. For us it is of higher informational interest if we could answer the question what is the probability that \(k\) or less students \((P(X \le k))\) will pass the exam, or equally interesting that \(k\) or more students \((P(X \ge k))\) will pass the exam.
As an exercise we turn to R and determine what is the probability that 9 or less students will pass the final statistics exam \((P(X \le 9))\). So we are interested in the probability that 0 out of 25, or 1 out of 25, or 2 out of 25, …, or 9 out of 25 students will pass the exam. To calculate this probability we can extend our naive approach and calculate \(P(X = 0)+P(X = 1)+P(X = 2)+...+P(X = 9)\).
n <- 25 # number of students
p <- 0.3 # probability of success
choose(n,0)*p^0*(1-p)^(n-0)+
choose(n,1)*p^1*(1-p)^(n-1)+
choose(n,2)*p^2*(1-p)^(n-2)+
choose(n,3)*p^3*(1-p)^(n-3)+
choose(n,4)*p^4*(1-p)^(n-4)+
choose(n,5)*p^5*(1-p)^(n-5)+
choose(n,6)*p^6*(1-p)^(n-6)+
choose(n,7)*p^7*(1-p)^(n-7)+
choose(n,8)*p^8*(1-p)^(n-8)+
choose(n,9)*p^9*(1-p)^(n-9)
## [1] 0.810564
The probability that 9 or less students will pass the statistics exam \((P(X\le 9))\) is 81.1%. In turn the means that the probability that 10 or more students will pass the exam is \(P(X > 9)= 1-P(X \le 9)\) or only 18.9%.
Beside the honestly quite unpleasant insight in the statistics of passing the final exam, it is clear that the naive implementation from above is quite tedious. Therefore we introduce the in-built function dbinom()
and pbinom()
. Call the help()
on those functions for further information.
According to the help()
function the usage of dbinom()
is as follows: dbinom(x, size, prob, log = FALSE)
. We disregard the log
argument and stay with its default log = FALSE
. Thus, the dbinom()
function simplifies to dbinom(x, size, prob)
, where x
corresponds to our \(k\), size
to our \(n\) and prob
to our \(p\). In order to check if the dbinom()
function works as expected, we recall our test examples from above, which was \(P(X = 3)\) and \(P(X = 15)\).
n <- 25 # number of students
p <- 0.3 # probability of success
k <- c(3,15)
dbinom(k[1], n, p)
## [1] 0.02427999
dbinom(k[2], n, p)
## [1] 0.001324897
Compare these results with the results of our naive implementation (scroll up). The numbers should match.
The dbinom()
function becomes very handy if we consider to accumulate probabilities, such as \(P(X \le x)\) or \(P(X > x)\), because we can specify a vector for the argument x
, and sum it up afterwards. Let us consider the example from above \((P(X \le 9))\).
n <- 25 # number of students
p <- 0.3 # probability of success
k <- 1:9
dbinom(k, n, p)
## [1] 0.001436859 0.007389562 0.024279989 0.057231402 0.103016524 0.147166462
## [7] 0.171193640 0.165079581 0.133635851
sum(dbinom(k, n, p))
## [1] 0.8104299
Another in-built function that is appropriate to answer \(P(X \le 9)\) is the pbinom
function, which returns the distribution function. The pbinom
function calculates cumulative probability distribution and is thus handy, because we can omit to call the sum
function. The usage of pbinom()
is as follows: pbinom(x, size, prob, lower.tail = TRUE, log.p = FALSE)
. Again, we disregard the log
argument and stay with its default log = FALSE
. In addition to the arguments from the dbinom()
function (see above) a new argument is available: lower.tail
. The default is set to TRUE
. If lower.tail = TRUE
the probabilities are \(P(X \le x)\), otherwise, \(P(X > x)\).
To make it clear let us recalculate \(P(X \le 9)\). The result for the calculation above was 0.8104299. Now we apply the pbinom()
function to get the same result.
n <- 25 # number of students
p <- 0.3 # probability of success
k <- 9
pbinom(k,n,p, lower.tail = TRUE)
## [1] 0.810564
In order the get \(P(X > 9)\), which is the same as \(1-P(X \le 9)\), we set the argument lower.tail = FALSE
.
n <- 25 # number of students
p <- 0.3 # probability of failure
k <- 9
pbinom(k,n,p, lower.tail = FALSE)
## [1] 0.189436
Finally, in order to visualize the binomial probability distribution we apply the rbinom()
function, which generates random deviates for a binomial distribution defined by its size \(n\) and probability of success/failure \(p\).
size <- 25 # size
p <- 0.3 # probability of success
n <- 100000 # number of random samples
set.seed(3) # set seed for reproducibility
random.binom.numbers <- rbinom(n, size, p)
h <- hist(random.binom.numbers,
breaks = length(unique(random.binom.numbers)),
plot = FALSE)
plot(h,
freq = FALSE,
space = NULL,
xlim = c(0,size),
xlab = 'Students passing the final exam', # label for x-axis
ylab = "Probability", # label for y-axis
main = "Binomial Probability Distribution \nfor size=25 and p=0.3", # add title
col = "#fc9d12", # add color "#fc9d12"
xaxt = "n") # do not show ticks of the x-axis
axis(side=1, at=seq(0.5,size, by=1), labels=seq(1,size, by=1)) # set ticks at defined positions for x axis
In addition we visualize the discussed probabilities form above: \(P(X = 3)\), \(P(X = 15)\), \(P(X \le 9)\), \(P(X > 9)\) (upper panel), and the corresponding cummulative binomial probability distribution function (lower panel).
In order to conclude this section, and in order to give you some intuition of the shapes of different binomial probability distributions, three different binomial probability and its corresponding cumulative binomial provability distributions for \(p=0.1\), \(p=0.5\), \(p=0.9\) are given below.