The R function family for the hypergeometric distribution uses a practical interpretation of the parameters N, n, M and k.

For example, the function dhyper(x, m, n, k) (and likewise phyper, qhyper and rhyper) expects x (= k), m (= M), n (= N - M) and k (= n): \[ p(x) = P(X = x) = \frac{\binom{m}{x}\binom{n}{k-x}}{\binom{m+n}{k}} \]

R interprets the formula as follows: an urn contains m white balls (successes) and n black balls. k balls are drawn from the urn without replacement, and x is the number of white balls among them.
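The correspondence between the formula and dhyper can be checked directly with choose(). The following is a minimal sketch for a small example urn (5 white and 7 black balls, 4 draws, values chosen only for illustration); the helper name `dhyper_manual` is made up here and is not part of R.

```r
# Hypothetical helper (not part of R): the hypergeometric probability
# written out with binomial coefficients, using R's parameter names.
dhyper_manual <- function(x, m, n, k) {
  choose(m, x) * choose(n, k - x) / choose(m + n, k)
}

# Example urn: m = 5 white balls, n = 7 black balls, k = 4 balls drawn
x <- 0:4
manual  <- dhyper_manual(x, m = 5, n = 7, k = 4)
builtin <- dhyper(x, m = 5, n = 7, k = 4)

all.equal(manual, builtin)  # compares both vectors up to numerical tolerance
sum(builtin)                # probabilities over all possible x add up to 1
```

Both vectors should agree, which confirms that m counts the successes in the urn and n the failures, not the population size.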

For the lottery "6 out of 49" we can calculate the probability of guessing exactly 3 numbers correctly:

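x <- 3   # number of correct guesses
m <- 6   # winning numbers in the urn (successes)
n <- 43  # non-winning numbers in the urn
k <- 6   # numbers picked on the ticket
result <- dhyper(x, m, n, k)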
paste("The probability that exactly x=", x, "numbers are wagered right is", round(result,4)*100, "%")
## [1] "The probability that exactly x= 3 numbers are wagered right is 1.77 %"
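The same setup also answers "at least" questions through the distribution function. As a sketch, the probability of guessing at least 3 numbers correctly can be obtained from phyper with lower.tail = FALSE, or by summing dhyper over the outcomes 3 to 6; the variable names `p_upper` and `p_sum` are chosen for this illustration.

```r
m <- 6; n <- 43; k <- 6

# P(X >= 3) = P(X > 2): upper tail of the distribution function
p_upper <- phyper(2, m, n, k, lower.tail = FALSE)

# The same probability as a sum over the individual outcomes 3, ..., 6
p_sum <- sum(dhyper(3:6, m, n, k))

all.equal(p_upper, p_sum)  # both routes give the same value
round(p_upper * 100, 2)    # probability in percent
```

Note that phyper(q, ...) with lower.tail = FALSE returns P(X > q), so the threshold passed in is 2, not 3.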

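Furthermore, we can calculate the probability of each winning outcome x = 2, ..., 6, with and without the correct supplementary number (“Superzahl”). The Superzahl is one digit out of 10 and is therefore matched with probability 0.1; two correct numbers only win in combination with the Superzahl: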

x<-3:6
m<-6
n<-43
k<-6
outcome<-c("2plusS","3","3plusS","4","4plusS","5","5plusS","6","6plusS")
success<-c(dhyper(2,m,n,k)*0.1,dhyper(x[1],m,n,k),dhyper(x[1],m,n,k)*0.1,
           dhyper(x[2],m,n,k),dhyper(x[2],m,n,k)*0.1,
           dhyper(x[3],m,n,k),dhyper(x[3],m,n,k)*0.1,
           dhyper(x[4],m,n,k),dhyper(x[4],m,n,k)*0.1)
# note: P(X = 6) is about 7e-08, so it is displayed as 0% after rounding
paste(outcome, ": ", round(success, digits = 6) * 100, "%", sep = "")
## [1] "2plusS: 1.3238%" "3: 1.765%"       "3plusS: 0.1765%" "4: 0.0969%"     
## [5] "4plusS: 0.0097%" "5: 0.0018%"      "5plusS: 2e-04%"  "6: 0%"          
## [9] "6plusS: 0%"
barplot(height = success,
        xlab = "outcome",
        ylab = "P(X=x|6,43,6), 2<=x<=6",
        names.arg = outcome,
        main = 'Probability for each successful outcome x=2:6 (plus "Superzahl")',
        col = '#00CCCC')

Now let us check how fair the game is: what is the expected payout?

# payout rates ("Quoten") in EUR for each outcome, in the same order as `outcome`
quotes<-c(5,9.1,19,32.8,151.3,2056,6572.3,381195.5,8000000)
exp_euro<-success*quotes
paste("For a stake of 1 EUR, we may expect a payout of",round(sum(exp_euro),2),"EUR")
## [1] "For a stake of 1 EUR, we may expect a payout of 0.44 EUR"
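One caveat about this calculation: dhyper(3, m, n, k) is the probability of 3 correct numbers regardless of the Superzahl, so the classes "3" and "3plusS" (and likewise for 4, 5 and 6) overlap. The sketch below assumes that each class without "plusS" should only count tickets whose Superzahl does not match, and that the Superzahl is matched independently with probability 0.1. The names `p_super`, `p_match` and `success_excl` are made up for this illustration; the payout rates are the ones used above.

```r
m <- 6; n <- 43; k <- 6
p_super <- 0.1                    # assumed chance of matching the Superzahl
p_match <- dhyper(2:6, m, n, k)   # P(X = 2), ..., P(X = 6)

# Mutually exclusive classes in the same order as `outcome`:
# "2plusS", then for x = 3, ..., 6 the pair (without Superzahl, with Superzahl).
# rbind() stacks the two cases; as.vector() reads them column by column.
success_excl <- c(p_match[1] * p_super,
                  as.vector(rbind(p_match[2:5] * (1 - p_super),
                                  p_match[2:5] * p_super)))

quotes <- c(5, 9.1, 19, 32.8, 151.3, 2056, 6572.3, 381195.5, 8000000)
round(sum(success_excl * quotes), 2)  # expected payout per 1 EUR stake
```

Under these assumptions the expected payout comes out slightly lower than the 0.44 EUR computed above, so the conclusion about the game does not change.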

Motivation: By presenting the results of these exercises, you can try to convince your parents and/or grandparents to stop gambling and instead transfer their gambling expenses directly to your bank account. It would definitely be a better investment!

Exercise: To help students prepare for the final exam, a teacher provides a catalog of 30 exercises. For the final exam, the teacher randomly chooses 10 exercises from this catalog. A certain student has prepared 18 exercises from the catalog and memorized them in depth. Calculate the probability that at least 50% of the exam exercises have been prepared by the student!

### your code here
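m <- 18  # exercises the student has prepared (successes)
n <- 12  # exercises the student has not prepared
k <- 10  # exercises chosen for the exam

# P(X >= 5) = P(X > 4): upper tail of the distribution function
result <- phyper(q = 4, m = m, n = n, k = k, lower.tail = FALSE)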
paste("The probability that the student can answer at least 50% of all questions is ", 
      round(result*100, digits = 2), "%.", sep = "")
## [1] "The probability that the student can answer at least 50% of all questions is 88.17%."
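The choice of q = 4 in the solution is easy to get wrong by one, so a cross-check is useful. The sketch below computes the same probability in three ways; the names `p_tail`, `p_sum` and `p_comp` are chosen for this illustration.

```r
m <- 18; n <- 12; k <- 10

p_tail <- phyper(4, m, n, k, lower.tail = FALSE)  # P(X > 4)
p_sum  <- sum(dhyper(5:10, m, n, k))              # P(X = 5) + ... + P(X = 10)
p_comp <- 1 - phyper(4, m, n, k)                  # 1 - P(X <= 4)

all.equal(p_tail, p_sum)   # upper tail equals the sum over 5, ..., 10
all.equal(p_tail, p_comp)  # and the complement of the lower tail
```

All three expressions describe the event "at least 5 of the 10 exam exercises were prepared".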

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.