I. Calculation of Factorial

The factorial of a non-negative integer \(n\) is denoted \(n!\), and \[ n! = n(n - 1)(n - 2)\cdots 1,\] where \(0!=1\).

To calculate factorials, I cretate an R function below other than using the built-in function.

r.fct <- function(n){ ## function to calculate factorials
  if(n == 0){ ## if n is 0, the formula of n = 1 is returned 
    return(n = 1)
  }
  else{ ## if n is not 0, the formula of n(n-1)(n-2)...1 is returned
    return(n * r.fct(n - 1 : 1))
  }
}

We are able to get the outcome by using the function above, for instance, the case of calculating \(0!\) is

r.fct(0)
## [1] 1

and \(5!\) is

r.fct(5)
## [1] 120


II. Calculation of Combination

This is a formula which denotes choosing k elements from n \[ {n \choose k} = \frac{n!}{k!(n-k)!}\]

By using the function to calculate factorial above, I create an R function calculating the number of possible combinations that can be obtained by taking a subset of elements from a larger set.

cmbntn <- function(n, k){ ## function to calculate combination
  com <- r.fct(n) / (r.fct(n - k) * r.fct(k))
  return(com)
}

Then, we can actually calculate combination, for instance, in the case of \({5 \choose 2}\), the result is as below:

cmbntn(5, 2)
## [1] 10

and in the case of \({20 \choose 18}\), the result is

cmbntn(20, 18)
## [1] 190


III. Calculation of Probability Mass of Binomial Distribution

The PMF of a random variable \(X\) can be written as \[f(x) = {n \choose x} p^{x} (1-p)^{(n-x)}~~~~~~~(x=0,1,2,\cdots, n)\] By denoting the binomial distribution as \(b(n, p)\), it is possible to say: \[X \sim b(n,p)\] where \(n\) is the number of independent bernoulli trials, and \(p\) is the probability of success for each trial.

By using the function I make at second question, I create a R function to calculate probability mass of binomial distributions as below.

Pmf <- function(x, n, p){ ## function to calculate PMF of binomial distribution
  ## arguments:
  ## x = number of successe for each trial
  ## n = number of bernoulli trials
  ## p = probability of succesful outocome
  binPmf <- r.fct(n) / (r.fct(n - x) * r.fct(x)) * p^x *(1 - p)^(n - x)
  return(binPmf)
}

This is an outocome as the number of success\((x)\) \(= 2\), the number of trials\((n)\) \(= 2\), and probability of succesful outcome\((p)\) \(= 0.2\).

Pmf(2, 2, 0.3)
## [1] 0.09

Moreover, in the case of \(x = 2\), \(n = 2\), and \(p = 0.6\), we get the result as below.

Pmf(2, 2, 0.6)
## [1] 0.36


IV. Displaying Graphically the PMF of Binomial Distribution

The case of p = 0.3

n = 2

To display graphically the PMF of binomial distributions for \(n = 2\), \(5\), \(10\), and \(50\), I write commands by using the function I create at third question.

First of all, we have to load the packages we will use.

library(ggplot2)
library(gridExtra)

Then, we determine the number of trials \((n)\) and probability of succesful outcome \((p)\),

n <- 2  ## setting the number of trials
p <- 0.3  ## setting the probability of succesful outcome

in addition, make a variable to save the results got from the simulations.

## making variables to save the results from the simulations
bin01 <- c(Pmf(0, n, p), rep(NA, n))  ## PMF of binomial distribution
trl <- 0:2  ## the number of trials

After that, we need preparing to use for loop to run a simulation.

for(i in 1:n){ ## loop for the simulation, i = 1, 2, ..., trials
   b <- Pmf(i, n, p)  ## assigning the results of calculating PMF to b
   bin01[i + 1] <- b  ## storing b(results) in bin01
}

Finally, we use ggplot to visualize the results.

## making a data frame to use ggplot
deta <- data.frame(bin01, trl)
## making a plot
ss50 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial") +
ggtitle("n = 2, p = 0.3")
print(ss50)

n = 5

We can use same commands written above to visualize the results, however need to change the number of trials and the probability of succesful outcome so that we get another results.

n <- 5
p <- 0.3
bin01 <- c(Pmf(0, n, p), rep(NA, n))
trl <- 0:5
for(i in 1:n){
   b <- Pmf(i, n, p)
   bin01[i + 1] <- b
}
deta <- data.frame(bin01, trl)
ss55 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial") +
ggtitle("n = 5, p = 0.3")
print(ss55)

n = 10

n <- 10
p <- 0.3
bin01 <- c(Pmf(0, n, p), rep(NA, n))
trl <- 0:10
for(i in 1:n){
   b <- Pmf(i, n, p)
   bin01[i + 1] <- b
}
deta <- data.frame(bin01, trl)
ss60 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial") +
ggtitle("n = 10, p = 0.3")
print(ss60)

n = 50

n <- 50
p <- 0.3
bin01 <- c(Pmf(0, n, p), rep(NA, n))
trl <- 0:50
for(i in 1:n){
   b <- Pmf(i, n, p)
   bin01[i + 1] <- b
}
deta <- data.frame(bin01, trl)
ss65 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial") +
ggtitle("n = 50, p = 0.3")
print(ss65)

The case of p = 0.6

n = 2

library(ggplot2)
n <- 2
p <- 0.3
bin01 <- c(Pmf(0, n, p), rep(NA, n))
trl <- 0:2
for(i in 1:n){
   b <- Pmf(i, n, p)
   bin01[i + 1] <- b
}
deta <- data.frame(bin01, trl)
ss70 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial") +
ggtitle("n = 2, p = 0.6")
print(ss70)

n = 5

n <- 5
p <- 0.6
bin01 <- c(Pmf(0, n, p), rep(NA, n))
trl <- 0:5
for(i in 1:n){
   b <- Pmf(i, n, p)
   bin01[i + 1] <- b
}
deta <- data.frame(bin01, trl)
ss75 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial") +
ggtitle("n = 5, p = 0.6")
print(ss75)

n = 10

n <- 10
p <- 0.5
bin01 <- c(Pmf(0, n, p), rep(NA, n))
trl <- 0:10
for(i in 1:n){
   b <- Pmf(i, n, p)
   bin01[i + 1] <- b
}
deta <- data.frame(bin01, trl)
ss80 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial") +
ggtitle("n = 10, p = 0.6")
print(ss80)

n = 50

n <- 50
p <- 0.6
bin01 <- c(Pmf(0, n, p), rep(NA, n))
trl <- 0:50
for(i in 1:n){
   b <- Pmf(i, n, p)
   bin01[i + 1] <- b
}
deta <- data.frame(bin01, trl)
ss85 <- ggplot(deta, aes(x = trl, y = bin01)) +
geom_bar(stat = "identity", aes(), colour = "black", fill = "white") +
ylab("PMF") +
xlab("trial")  +
ggtitle("n = 50, p = 0.6")
print(ss85)

To easily make sure the difference between each trial, we are able to combine these plots by using grid.arrange().

grid.arrange(ss50, ss55, ss60, ss65)

grid.arrange(ss70, ss75, ss80, ss85)



Back to Class Materials