Import packages.
library("ggplot2")
library("reshape2")
Suppose we have a coin landing heads-up with the probabilty \(\theta\). We flipped it 10 times and got 8 heads and 2 tails. We’d like to have a function to find the value of \(\hat{\theta}\) that is most likely to give us the data \(D=\){8 heads out of 10 flips}. That is, we wanna create a likelihood function of \(\theta\) given \(D\). The likelihoodn funcion can be writte as \[L(\theta | D) = \mathrm{Pr}(D | \theta).\] The right-hand of the equation is the PMF of the binomial distribution with \(n = 10\) trials and the prbablity of success \(\theta\). Thus, \[L(\theta |D) = \binom{10}{8} \theta^8 (1-\theta)^{10-2} = 45\theta^8 (1-\theta)^2.\] This is a function showing how likely each value of \(\theta\) is as probability that generates the data \(D\).
Let’s define this funciton in R.
lik_1 <- function(theta) {
## Argument: the probability of getting head by a coin flip
## Return: likelihood of theta given that we got 8 heads out of 10 flips
L <- 45 * theta^8 * (1 - theta)^2
return(L)
}
This funciton allows us to calculate the likelihood of a spccific value of \(\theta\) given the observe data. For instance, the likelihoods for \(\theta=0, 0.2, 0.7, 1\) are
lik_1(0); lik_1(.2); lik_1(.7); lik_1(1)
## [1] 0
## [1] 7.3728e-05
## [1] 0.2334744
## [1] 0
To understand the likelihood function better, let’s draw a graph of the function. Because \(\theta\) is probability, \(\theta \in [0,1]\).
curve(lik_1, from = 0, to = 1, lwd = 2, col = "firebrick",
xlab = expression(theta), ylab = expression(L(theta)),
main = expression(paste("Likelihood function of ", theta)))
If you’d like a little prettier one by ggplot2:
v_theta <- seq(0, 1, length = 1000)
lik_theta <- lik_1(v_theta)
df <- data.frame(theta = v_theta,
likelihood = lik_theta)
p_lik <- ggplot(df, aes(theta, likelihood)) +
geom_line(color = "tomato") +
scale_x_continuous(breaks = seq(0, 1, by = .2)) +
labs(x = expression(theta), y = expression(L(theta)),
title = expression(paste("Likelihood function of ", theta)))
print(p_lik)
Suppose we have a coin landing heads-up with the probabilty \(\theta\). We flipped it 10 times and obtained the result \[D = \{D_1, D_2, \dots, D_{10}\} = \{H, H, T, H, H, H, H, H, H, T \}.\] With \(H = 1\) and \(T = 0\), \[D = \{1,1,0,1,1,1,1,1,1,0\}.\] Assuming that Bernoulli trials are identically and independently distributed (iid), the likelihood function of \(\theta\) for the observed data \(D\) is \[L(\theta | D) = \mathrm{Pr}(D | \theta) = \prod_{i=1}^{10} Pr(D_i | \theta) = \prod_{i=1}^{10} L_i(\theta | D_i).\]
\(D_i\) is either 0 or 1. For \(D_i=1\), \[Pr(D_i =1 | \theta) = \theta,\] and for \(D_i = 0\), \[Pr(D_i =0 | \theta) = 1-\theta.\]
Therefore, \[L(\theta | D) = \theta^8 (1-\theta)^2.\]
Let’s draw a graph of this likelihood function.
lik_2 <- function(theta) {
## Argument: theta = probability of head
## Return: L = likelihood of theta given D
L <- theta^8 * (1 - theta)^2
return(L)
}
curve(lik_2, from = 0, to = 1, lwd = 2, col = "darkblue",
xlab = expression(theta), ylab = expression(L(theta)),
main = expression(paste("Likelihood function of ", theta)))
Let’s visualize the likelihood and log-likelihood functions.
v_theta <- seq(0.01, 0.99, length = 100)
L <- lik_2(v_theta)
lL <- log(L)
df <- data.frame(theta = v_theta,
Likelihood = L,
log_Likelihood = lL)
df <- melt(df, id = "theta")
p_lik_2 <- ggplot(df, aes(theta, value)) +
geom_line(color = "royalblue") +
scale_x_continuous(breaks = seq(0, 1, by = .2)) +
facet_grid(variable ~ ., scales = "free_y") +
geom_vline(xintercept = 0.8, color = "tomato") +
labs(x = expression(theta), title = "Likelihood and Log-Likelihood")
print(p_lik_2)