The Hypergeometric Distribution (additional exercises)

Exercise: Calculate the probability of winning in the Eurojackpot for every prize class (combinations of Numbers and Euronumbers). Also provide the expected earnings based on the dividend payout of the drawing on 20 September 2019.

Hint: You can find the relevant information at https://www.eurojackpot.de/.

### your code here

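# Prize classes: e.g. "2plus1E" means 2 correct Numbers plus 1 correct Euronumber
outcome <- c("2plus1E","1plus2E","3","3Plus1E","2Plus2E","3Plus2E","4","4plus1E","4plus2E","5","5plus1E","5plus2E")

# Numbers: 5 are drawn out of 50
x <- c(2,1,3,3,2,3,4,4,4,5,5,5) # correct Numbers in each prize class
m <- rep(5, 12)                 # winning Numbers in the urn
n <- rep(45, 12)                # non-winning Numbers in the urn
k <- rep(5, 12)                 # Numbers picked on the ticket

# Euronumbers: 2 are drawn out of 10
x_euro <- c(1,2,0,1,2,2,0,1,2,0,1,2) # correct Euronumbers in each prize class
m_euro <- rep(2, 12)                 # winning Euronumbers in the urn
n_euro <- rep(8, 12)                 # non-winning Euronumbers in the urn
k_euro <- rep(2, 12)                 # Euronumbers picked on the ticket

euro_jackpot <- data.frame(outcome, x, m, n, k, x_euro, m_euro, n_euro, k_euro)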

# The Numbers and the Euronumbers are drawn independently, so the joint
# probability is the product of two hypergeometric probabilities.
# The Euronumber factor is applied to every prize class: a class without
# Euronumbers (e.g. "3") requires exactly 0 correct Euronumbers.
euro_jackpot$probability <- with(euro_jackpot,
                                 dhyper(x = x, m = m, n = n, k = k) *
                                   dhyper(x = x_euro, m = m_euro, n = n_euro, k = k_euro))
euro_jackpot["probability_in_percent"] <- round(euro_jackpot$probability, digits = 7) * 100
# payout per prize class in Euro
euro_jackpot["quotes"] <- c(8.50, 8.50, 17.70, 20.00, 20.00, 60.10, 139.90, 318.30, 5490.20, 144943.20, 342227.10, 44648075.80)
euro_jackpot[, c("outcome", "probability", "probability_in_percent")]

##    outcome  probability probability_in_percent
## 1  2plus1E 2.381267e-02                2.38127
## 2  1plus2E 7.813532e-03                0.78135
## 3        3 2.907361e-03                0.29074
## 4  3Plus1E 1.661349e-03                0.16613
## 5  2Plus2E 1.488292e-03                0.14883
## 6  3Plus2E 1.038343e-04                0.01038
## 7        4 6.607638e-05                0.00661
## 8  4plus1E 3.775793e-05                0.00378
## 9  4plus2E 2.359871e-06                0.00024
## 10       5 2.936728e-07                0.00003
## 11 5plus1E 1.678130e-07                0.00002
## 12 5plus2E 1.048831e-08                0.00000

expected_earning <- euro_jackpot$probability * euro_jackpot$quotes
paste("Participating in Eurojackpot: For a 1.50 Euro pool, we may expect a payout of", round(sum(expected_earning), 2), "€")

## [1] "Participating in Eurojackpot: For a 1.50 Euro pool, we may expect a payout of 0.99 €"
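
As a cross-check, the full joint distribution of correct Numbers and Euronumbers can be tabulated. Since the two draws are independent, it is the outer product of the two hypergeometric distributions. The following is a minimal sketch under that assumption; the names `p_numbers`, `p_euro`, `joint`, `is_prize` and `p_win` are illustrative and not part of the original solution.

```r
# Marginal distributions: 5 Numbers out of 50 and 2 Euronumbers out of 10
p_numbers <- dhyper(x = 0:5, m = 5, n = 45, k = 5)
p_euro    <- dhyper(x = 0:2, m = 2, n = 8,  k = 2)

# Independence: joint probability = product of the marginals
joint <- outer(p_numbers, p_euro)
dimnames(joint) <- list(numbers = 0:5, euronumbers = 0:2)

# The 18 cells are mutually exclusive and exhaustive, so they sum to 1
sum(joint)

# Mark the 12 prize classes (rows: 0..5 Numbers, columns: 0..2 Euronumbers)
is_prize <- matrix(FALSE, nrow = 6, ncol = 3, dimnames = dimnames(joint))
is_prize[c("3", "4", "5"), ] <- TRUE   # 3, 4 or 5 Numbers win with any Euronumber count
is_prize["2", c("1", "2")]   <- TRUE   # 2 Numbers need at least 1 Euronumber
is_prize["1", "2"]           <- TRUE   # 1 Number needs both Euronumbers

# Probability of winning any prize with a single ticket
p_win <- sum(joint[is_prize])
p_win
```

Each cell of `joint` is the probability of exactly that combination, so the prize classes do not overlap and their probabilities can simply be added.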

Exercise: An exercise catalog for the final exam in geostatistics contains N exercises. 20% of these N exercises will be tested in the exam, and at least 50% of the exam questions have to be answered correctly to pass. A student prepares m exercises and memorizes them in depth.
a) Assume N = {100, 200, 400, 600}. How likely is it that the student passes the exam if he/she prepares N/2 exercises (50% of the catalog)? Remember: at least 50% of the questions have to be answered correctly to pass.

### your code here

N <- c(100, 200, 400, 600) # catalog sizes
m <- N / 2                 # prepared exercises
n <- N - m                 # unprepared exercises
k <- 0.2 * N               # exercises in the exam
x <- k / 2                 # correct answers needed to pass
# P(X >= x) = P(X > x - 1); phyper() is vectorized over its arguments
ans22 <- phyper(q = x - 1, m = m, n = n, k = k, lower.tail = FALSE)
paste0("The probabilities are as follows:",
       " N=", N[1], " : ", round(ans22[1] * 100, 1), "%; ",
       " N=", N[2], " : ", round(ans22[2] * 100, 1), "%; ",
       " N=", N[3], " : ", round(ans22[3] * 100, 1), "%; ",
       " N=", N[4], " : ", round(ans22[4] * 100, 1), "%")

## [1] "The probabilities are as follows: N=100 : 59.8%;  N=200 : 57%;  N=400 : 55%;  N=600 : 54.1%"
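
The argument `q = x - 1` is used because `phyper()` with `lower.tail = FALSE` returns P(X > q), so P(X >= x) has to be requested as P(X > x - 1). A minimal sketch for N = 100 that checks this against a direct sum of `dhyper()` values; the names `p_tail` and `p_sum` are illustrative:

```r
m <- 50  # prepared exercises
n <- 50  # unprepared exercises
k <- 20  # exercises in the exam
x <- 10  # correct answers needed to pass

# Upper tail via the distribution function: P(X > x - 1)
p_tail <- phyper(q = x - 1, m = m, n = n, k = k, lower.tail = FALSE)

# The same probability as an explicit sum: P(X = x) + ... + P(X = k)
p_sum <- sum(dhyper(x = x:k, m = m, n = n, k = k))

all.equal(p_tail, p_sum)
```

Using `q = x` instead would drop the term P(X = x) and understate the chance of passing.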

b) Assume N = 450. How many exercises should the student prepare to have a chance of at least 50% of passing the exam?

### your code here

N <- 450     # total number of exercises
k <- 0.2 * N # number of exercises in the exam
q <- 0.5 * k # half of k: correct answers needed to pass the exam

m <- seq(100, 450) # candidate numbers of prepared exercises

for (i in m) {
  # P(X >= q) when i exercises are prepared and N - i are not
  result <- phyper(q - 1, i, N - i, k, lower.tail = FALSE)
  if (result > 0.5) {
    print(paste("The student should prepare at least", i, "exercises to pass the exam with", round(result, 3) * 100, "% probability!"))
    break # stop the loop at the first m with result > 0.5
  }
}

## [1] "The student should prepare at least 223 exercises to pass the exam with 50.9 % probability!"
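
Because `phyper()` is vectorized over its arguments, the same search can also be written without a loop. This is a sketch under the same setting (N = 450, 90 exam questions, 45 correct answers needed); `p_pass` and `m_min` are illustrative names:

```r
N <- 450     # total number of exercises
k <- 0.2 * N # exercises in the exam
q <- 0.5 * k # correct answers needed to pass

m <- 0:N     # every possible number of prepared exercises

# Passing probability P(X >= q) for each m in one vectorized call
p_pass <- phyper(q - 1, m, N - m, k, lower.tail = FALSE)

# Smallest m whose passing probability exceeds 50%
m_min <- m[which(p_pass > 0.5)[1]]
m_min
```

The vector `p_pass` can also be plotted against `m` to see how quickly the chance of passing rises around half of the catalog.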

Exercise: During an online shopping session, 50 product links have been collected in the browser cache. Six of these 50 links were definitely of interest to the user. Unfortunately, the user forgot to clear the cache after his/her excessive session. The next day, he/she sends a request to an ad-funded news site with 8 ad spaces, which are randomly filled with links from the browser cache. Calculate the probability that at least 4 ad spaces will be filled with ads related to the six interesting sites of the previous day's shopping adventure.

### your code here

x <- 0:6 # possible numbers of interesting links among the ads
m <- 6   # interesting links in the cache
n <- 44  # other links in the cache
k <- 8   # ad spaces
p <- dhyper(x = x, m = m, n = n, k = k) # dhyper() is vectorized over x
barplot(p, names.arg = x,
        xlab = "Strikes", ylab = "P(X=x|6,44,8)",
        main = "Hypergeometric density Browsercache P(X=x|6,44,8)=p")

paste("The probability that at least 4 advertisements are placed successfully is ",
      round(phyper(3, m, n, k, lower.tail = FALSE) * 100, 2), "%.", sep = "")

## [1] "The probability that at least 4 advertisements are placed successfully is 0.39%."
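
The exact value can be checked with a small Monte Carlo simulation that fills the 8 ad spaces at random many times. This is only a sketch: the seed, the number of repetitions `n_sim`, and the convention that links 1 to 6 are the interesting ones are arbitrary assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed for reproducibility

n_sim <- 100000
# Links 1..6 stand for the interesting ones; draw 8 of 50 without replacement
hits <- replicate(n_sim, sum(sample(50, 8) <= 6))

# Share of simulated page loads with at least 4 interesting ads
p_sim   <- mean(hits >= 4)
p_exact <- phyper(3, m = 6, n = 44, k = 8, lower.tail = FALSE)
c(simulated = p_sim, exact = p_exact)
```

The simulated share fluctuates around the exact probability and approaches it as `n_sim` grows.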

How many ad spaces are necessary to place at least one interesting ad link with a probability of at least 95%?

### your code here

m <- 6      # interesting links in the cache
n <- 50 - m # other links in the cache
k <- 1:30   # candidate numbers of ad spaces
# P(X >= 1) = 1 - P(X = 0); phyper() is vectorized over k
pl <- 1 - phyper(q = 0, m = m, n = n, k = k)

barplot(height = pl, names.arg = k,
        xlab = "k ad spaces", ylab = "P(X>0|6,44,k)",
        main = "For how many ad spaces is P(at least 1 hit) > 95%?",
        ylim = c(0, 1))
abline(h = 0.95, col = "red", lwd = 2)

ans <- min(which(pl >= 0.95)) # smallest k with P(at least 1 hit) >= 95%
paste("At least", ans, "ad spaces are needed to place at least one advertisement of interest with 95% probability.")

## [1] "At least 19 ad spaces are needed to place at least one advertisement of interest with 95% probability."
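
The probability of no hit has a simple closed form: all k ads must come from the 44 uninteresting links, so P(X = 0) = choose(44, k) / choose(50, k). A minimal sketch that uses this formula directly and compares it with `dhyper()`; the names `p0` and `k_min` are illustrative:

```r
k  <- 1:44                           # with more than 44 ad spaces a hit is certain
p0 <- choose(44, k) / choose(50, k)  # P(X = 0): all k ads come from the 44 other links

# Smallest number of ad spaces with P(at least one hit) >= 95%
k_min <- k[which(1 - p0 >= 0.95)[1]]
k_min

# The closed form agrees with the hypergeometric density at x = 0
all.equal(p0, dhyper(x = 0, m = 6, n = 44, k = k))
```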

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.