The R function family for the hypergeometric distribution uses a practical reinterpretation of the parameters N, M, n and k.
For example, the function dhyper(x, m, n, k) (and accordingly phyper, qhyper and rhyper) expects the arguments x (= k), m (= M), n (= N - M) and k (= n), where the textbook symbol is given in parentheses: \[ p(x) = P(X=x) = \frac{\binom{m}{x}\binom{n}{k-x}}{\binom{m+n}{k}} \]
R interprets the formula in the following way: An urn contains m white balls (successes) and n black balls. k balls are drawn from the urn without replacement, and x is the number of white balls among them.
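To make this correspondence concrete, the following minimal sketch compares the formula, written out with choose(), against dhyper() for a small urn. The urn sizes (5 white balls, 7 black balls, 4 balls drawn) and the variable names p_manual and p_dhyper are arbitrary choices for illustration and are not part of the original example.

```r
# Illustrative urn (assumed values): m white balls, n black balls, k balls drawn
m <- 5
n <- 7
k <- 4
x <- 0:k   # every possible number of white balls in the sample

# Probability mass function written out with binomial coefficients
p_manual <- choose(m, x) * choose(n, k - x) / choose(m + n, k)

# The same probabilities from R's built-in density function
p_dhyper <- dhyper(x, m, n, k)

# Compare both versions (all.equal allows for floating-point tolerance)
all.equal(p_manual, p_dhyper)

# phyper() is the cumulative sum of dhyper()
all.equal(cumsum(p_dhyper), phyper(x, m, n, k))

# The probabilities over all possible x add up to 1
sum(p_dhyper)
```

Both comparisons should agree, which confirms that dhyper() implements the formula with m, n and k in the urn interpretation described above.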
For the lottery "6 out of 49" we can calculate the probability that exactly 3 numbers are guessed correctly:
x<-3
m<-6
n<-43
k<-6
result<-dhyper(x,m,n,k)
paste("The probability that exactly x=", x, "numbers are wagered right is", round(result,4)*100, "%")
## [1] "The probability that exactly x= 3 numbers are wagered right is 1.77 %"
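As a cross-check by hand, inserting x = 3, m = 6, n = 43 and k = 6 into the probability mass function gives the same value:

\[ P(X=3) = \frac{\binom{6}{3}\binom{43}{3}}{\binom{49}{6}} = \frac{20 \cdot 12341}{13983816} = \frac{246820}{13983816} \approx 0.01765 \]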
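Furthermore, we can calculate the probability of each winning outcome: 2 correct numbers plus the supplementary number (“Superzahl”), and 3 to 6 correct numbers with or without it. The Superzahl is one digit out of 10, so it is matched with probability 0.1: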
x<-3:6
m<-6
n<-43
k<-6
outcome<-c("2plusS","3","3plusS","4","4plusS","5","5plusS","6","6plusS")
# the factor 0.1 is the probability of also matching the Superzahl
success<-c(dhyper(2,m,n,k)*0.1,
           dhyper(x[1],m,n,k),dhyper(x[1],m,n,k)*0.1,
           dhyper(x[2],m,n,k),dhyper(x[2],m,n,k)*0.1,
           dhyper(x[3],m,n,k),dhyper(x[3],m,n,k)*0.1,
           dhyper(x[4],m,n,k),dhyper(x[4],m,n,k)*0.1)
paste(outcome, ": ", round(success, digits = 6) * 100, "%", sep = "")
## [1] "2plusS: 1.3238%" "3: 1.765%" "3plusS: 0.1765%" "4: 0.0969%"
## [5] "4plusS: 0.0097%" "5: 0.0018%" "5plusS: 2e-04%" "6: 0%"
## [9] "6plusS: 0%"
barplot(height = success,
        xlab = "outcome",
        ylab = "Probability of outcome",
        names.arg = outcome,   # reuse the labels defined above
        main = 'Probability for each successful outcome x=2:6 (plus "Superzahl")',
        col = '#00CCCC')
Now let us check how fair the game is: what is the expected payout?
quotes<-c(5,9.1,19,32.8,151.3,2056,6572.3,381195.5,8000000) # payouts in EUR, in the order of the outcomes above
exp_euro<-success*quotes
paste("For a stake of 1 EUR, we may expect a payout of",round(sum(exp_euro),2),"EUR")
## [1] "For a stake of 1 EUR, we may expect a payout of 0.44 EUR"
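Note that the calculation above counts a ticket with, for example, three correct numbers and the Superzahl in two outcomes at once, because the probability used for "3" also contains the cases with a matching Superzahl. As a cross-check, here is a minimal sketch that treats the nine outcomes as mutually exclusive. It assumes that "x correct" means "x correct numbers without the Superzahl" (factor 0.9); this exclusivity and the names p_S, p_hits, p_class and exp_exclusive are assumptions for illustration, while the payouts are the ones used above.

```r
# Lotto 6 out of 49: m = 6 winning numbers, n = 43 other numbers, k = 6 numbers picked
m <- 6
n <- 43
k <- 6
p_S <- 0.1                       # probability of matching the Superzahl (1 out of 10)

p_hits <- dhyper(2:6, m, n, k)   # P(X = 2), ..., P(X = 6)

# Assumption: outcomes are mutually exclusive.
# Order: 2plusS, 3, 3plusS, 4, 4plusS, 5, 5plusS, 6, 6plusS
p_class <- c(p_hits[1] * p_S,
             as.vector(rbind(p_hits[-1] * (1 - p_S),   # x correct, no Superzahl
                             p_hits[-1] * p_S)))       # x correct plus Superzahl

# Payouts in EUR, taken from the example above
quotes <- c(5, 9.1, 19, 32.8, 151.3, 2056, 6572.3, 381195.5, 8000000)

exp_exclusive <- sum(p_class * quotes)
round(exp_exclusive, 2)
```

Under this assumption the expected payout comes out slightly lower, at roughly 0.42 EUR per 1 EUR, so the conclusion about the fairness of the game does not change.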
Motivation: By presenting the results of these exercises, you can try to convince your parents and/or grandparents to stop gambling and instead transfer their gambling expenses directly to your bank account. It would definitely be a better investment!
Exercise: To prepare students for the final exam, a teacher provides a catalog of 30 exercises. For the final exam, the teacher randomly chooses 10 exercises from this catalog. A certain student has prepared 18 exercises from the catalog and memorized them in depth. Calculate the probability that at least 50% of the exam exercises have been prepared by the student!
### your code here
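m <- 18   # exercises the student has prepared
n <- 12   # exercises the student has not prepared
k <- 10   # exercises chosen for the exam
# at least 5 of 10 prepared: P(X >= 5) = P(X > 4), i.e. the upper tail at q = 4
result <- phyper(q = 4, m = m, n = n, k = k, lower.tail = FALSE)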
paste("The probability that the student can answer at least 50% of all questions is ",
round(result*100, digits = 2), "%.", sep = "")
## [1] "The probability that the student can answer at least 50% of all questions is 88.17%."
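The same result can be cross-checked in two other ways. The sketch below sums the density over x = 5, ..., 10 and, as a rough plausibility check, estimates the probability by simulation with rhyper(). The number of simulated exams (100,000), the seed and the names p_sum, p_tail and p_sim are arbitrary choices for illustration.

```r
m <- 18   # exercises the student has prepared
n <- 12   # exercises the student has not prepared
k <- 10   # exercises chosen for the exam

# P(X >= 5) as an explicit sum over the probability mass function
p_sum <- sum(dhyper(5:10, m, n, k))

# P(X >= 5) = P(X > 4) via the upper tail of the distribution function
p_tail <- phyper(4, m, n, k, lower.tail = FALSE)

# Monte Carlo estimate (assumed: 100,000 simulated exams, arbitrary seed)
set.seed(1)
p_sim <- mean(rhyper(1e5, m, n, k) >= 5)

round(c(sum = p_sum, tail = p_tail, simulated = p_sim), 4)
```

The first two values should be identical up to floating-point precision, and the simulated value should lie close to them.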
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.