Bayesian Learning#

Exercise 4.1 (Reservation Wage)#

Prove that

\[\begin{align*} \int_{w_r}^{\infty}(w-w_r)dH(w)=\int_{w_r}^{\infty}[1-H(w)]dw. \end{align*}\]
Solution

We start by simplifying the left-hand side. Let \(F(w) = w - w_r\). The integration by parts formula then implies that

\[\begin{align*} \int_{w_r}^{\infty} (w - w_r) dH(w) &= \int_{w_r}^{\infty} F(w) dH(w)\\ &=\left. F(w) H(w) \right|_{w_r}^{\infty} - \int_{w_r}^{\infty} H(w) dF(w)\\ &=\left. (w - w_r)H(w) \right|_{w_r}^{\infty} - \int_{w_r}^{\infty} H(w) dw. \end{align*}\]

The boundary term vanishes at the lower limit, since \((w_r - w_r)H(w_r)=0\). At the upper limit, write \((b - w_r)H(b) = \int_{w_r}^{b} dw - (b - w_r)[1-H(b)]\) and note that \((b - w_r)[1-H(b)] \to 0\) as \(b \to \infty\) whenever \(H\) has a finite mean. Hence \(\left. (w - w_r)H(w) \right|_{w_r}^{\infty}=\int_{w_r}^{\infty} dw\), and the expression above is indeed equal to \(\int_{w_r}^{\infty} [1-H(w)] dw\).
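
As a quick numerical sanity check (not part of the original solution), the sketch below evaluates both sides of the identity for an illustrative example; the lognormal offer distribution and the value of \(w_r\) are assumptions made only for this check.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Illustrative assumptions: a lognormal offer distribution H and an
# arbitrary reservation wage w_r (neither is specified in the exercise).
H = stats.lognorm(s=0.5, scale=1.0)
w_r = 1.0

# Left-hand side: integrate (w - w_r) against the density h(w) on [w_r, inf).
lhs, _ = quad(lambda w: (w - w_r) * H.pdf(w), w_r, np.inf)

# Right-hand side: integrate the survival function 1 - H(w) on [w_r, inf).
rhs, _ = quad(lambda w: H.sf(w), w_r, np.inf)

print(lhs, rhs)  # the two values agree up to quadrature error
```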

Exercise 4.2 (Normal-Normal conjugacy)#

Let \(y\) be a random variable that is drawn from a normal distribution with mean \(\mu\) and standard deviation \(\sigma\).

  1. Consider a dataset \(Y=(y_1,y_2,...,y_n)\) containing \(n\) realizations of \(y\). Write down the likelihood of \(Y\) as a function of \(\mu\) and \(\sigma\).

  2. Assume that \(\mu\) itself is normally distributed around some mean \(\theta\) with standard deviation \(\tau\). Show that upon observing \(Y\), the posterior belief about \(\mu\) remains normally distributed. Derive the posterior mean and standard deviation of \(\mu\). What do you notice about the standard deviation? Comment on your finding.

Solution
  1. The likelihood of \(Y\) can be written as

\[\begin{align*} L\left(\mu, \sigma | Y \right) &= \prod_{i = 1}^{n} f\left(y_{i}|\mu, \sigma\right) \\ &= \prod_{i = 1}^{n} \frac{1}{\sqrt{2\pi \sigma^{2}}}\text{exp}\left(-\frac{\left(y_{i}-\mu\right)^{2}}{2\sigma^{2}}\right). \end{align*}\]
  2. By Bayes’ rule, we find the posterior distribution of \(\mu\) given the data \(Y\):

\[\begin{align*} p\left(\mu|Y,\theta,\tau,\sigma\right) &\propto p\left(Y|\mu,\sigma\right)p\left(\mu|\theta,\tau\right)\\ & \propto \text{exp}\left(-\frac{1}{2}\left[\frac{1}{\sigma^{2}}\sum_{i=1}^{n}\left(y_{i}-\mu\right)^{2} + \frac{\left(\mu-\theta\right)^{2}}{\tau^{2}}\right]\right). \end{align*}\]

Let \(\overline{y} = \frac{1}{n}\sum_{i=1}^{n}y_{i}\). Adding and subtracting \(\overline{y}\) inside each squared term and noting that the cross term sums to zero, we have

\[\begin{equation*} \sum_{i=1}^{n}\left(y_{i}-\mu\right)^{2} = \sum_{i=1}^{n}\left(y_{i} - \overline{y}\right)^{2} + n\left(\overline{y}-\mu\right)^{2}. \end{equation*}\]

Thus the posterior simplifies to:

\[\begin{equation*} p\left(\mu|Y,\theta,\tau,\sigma\right) \propto \text{exp}\left(-\frac{1}{2}\left[\frac{n\left(\overline{y}-\mu\right)^{2}}{\sigma^{2}} + \frac{\left(\mu-\theta\right)^{2}}{\tau^2}\right]\right). \end{equation*}\]

Completing the square in \(\mu\), the term in brackets can be written, up to an additive constant that does not depend on \(\mu\), as

\[\begin{equation*} \left(\frac{n}{\sigma^{2}} + \frac{1}{\tau^{2}}\right)\left(\mu - \frac{\frac{n\overline{y}}{\sigma^{2}} + \frac{\theta}{\tau^{2}}}{\frac{n}{\sigma^{2}} + \frac{1}{\tau^{2}}}\right)^{2}. \end{equation*}\]

From this, we can conclude that the posterior mean \(\mu_{post}\) and the posterior variance \(\sigma^{2}_{post}\) read

\[\begin{align*} & \mu_{post} = \frac{\frac{n\overline{y}}{\sigma^{2}} + \frac{\theta}{\tau^{2}}}{\frac{n}{\sigma^{2}} + \frac{1}{\tau^{2}}}, \\ & \sigma^{2}_{post} = \frac{1}{\frac{n}{\sigma^{2}} + \frac{1}{\tau^{2}}}. \end{align*}\]

Note that the posterior precision \(\frac{n}{\sigma^{2}} + \frac{1}{\tau^{2}}\) is the sum of the prior precision and the precision of the sample, so the posterior variance is smaller than the prior variance \(\tau^{2}\): after observing the data, we have more precise information about \(\mu\), and the posterior variance shrinks toward zero as \(n\) grows.
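
To double-check the algebra, here is a short numerical sketch; the parameter values (\(\theta\), \(\tau\), \(\sigma\), \(n\) and the true mean) are illustrative assumptions, not part of the exercise. It compares the closed-form \(\mu_{post}\) and \(\sigma^{2}_{post}\) with the mean and variance of a brute-force grid approximation of the posterior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not given in the exercise).
theta, tau = 0.0, 2.0     # prior mean and prior standard deviation of mu
sigma, n = 1.5, 25        # known noise standard deviation and sample size
y = rng.normal(1.0, sigma, size=n)
ybar = y.mean()

# Closed-form posterior derived above.
post_prec = n / sigma**2 + 1 / tau**2
mu_post = (n * ybar / sigma**2 + theta / tau**2) / post_prec
sigma2_post = 1 / post_prec

# Brute-force check: unnormalized posterior evaluated on a fine grid of mu.
grid = np.linspace(mu_post - 6, mu_post + 6, 20001)
log_post = -0.5 * (((y[:, None] - grid) ** 2).sum(axis=0) / sigma**2
                   + (grid - theta) ** 2 / tau**2)
w = np.exp(log_post - log_post.max())
w /= w.sum()
grid_mean = (w * grid).sum()
grid_var = (w * (grid - grid_mean) ** 2).sum()

print(mu_post, grid_mean)      # should agree closely
print(sigma2_post, grid_var)   # should agree closely
```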

Exercise 4.3 (Bernoulli)#

Let \(y\) be a random variable that is drawn from a Bernoulli distribution with probability of success \(p\).

  1. Consider a dataset \(Y=(y_1,y_2,...,y_n)\) containing \(n\) realizations of \(y\). Write down the likelihood of \(Y\) as a function of \(p\).

  2. Assume that \(p\) itself is drawn from the following distribution

\[\begin{equation*} p=\left\{ \begin{array}{l} p_{h}\text{ with probability } \mu _{0}, \\ p_{l}\text{ with probability } 1-\mu _{0}. \end{array}% \right. \end{equation*}\]

The variable \(\mu_0\) denotes the prior in period \(0\). Use Bayes’ rule to express the agent’s posterior \(\mu_1(0)\) at the beginning of period \(2\) after having observed a failure in period \(1\), i.e. \(y_1=0\). Iterate the computation to derive the posterior after \(n\) failures in a row.

Solution
  1. The likelihood of \(Y\) reads

\[\begin{equation*} L\left(p | Y \right) = \prod_{i = 1}^{n} p^{y_{i}}(1-p)^{1-y_{i}}. \end{equation*}\]
  2. By Bayes’ rule, we can express \(\mu_1(0)\) as:

\[\begin{align*} \mu_1(0) = p(p_{h}|y_{1}=0) & = \frac{p(y_{1}=0|p_{h})\mu_{0}}{p(y_{1}=0|p_{h})\mu_{0} + p(y_{1}=0|p_{l})(1-\mu_{0})}\\ & = \frac{(1-p_{h})\mu_{0}}{(1-p_{h})\mu_{0} + (1-p_{l})(1-\mu_{0})}. \end{align*}\]

The posterior after \(n\) failures in a row can be written as:

\[\begin{align*} \mu_n(0^n) = p(p_{h}|y_{1}=0, y_{2}=0,...,y_{n}=0) & = \frac{p(y_{1}=0, y_{2}=0,...,y_{n}=0|p_{h})\mu_{0}}{p(y_{1}=0, y_{2}=0,...,y_{n}=0|p_{h})\mu_{0} + p(y_{1}=0, y_{2}=0,...,y_{n}=0|p_{l})(1-\mu_{0})}\\ & = \frac{(1-p_{h})^{n}\mu_{0}}{(1-p_{h})^{n}\mu_{0} + (1-p_{l})^{n}(1-\mu_{0})}. \end{align*}\]

Note that \(\mu_n\) can also be computed recursively: applying Bayes’ rule to a single additional failure, starting from the belief \(\mu_{n-1}\),

\[\begin{align*} \mu_n = \frac{(1-p_{h})\mu_{n-1}}{(1-p_{h})\mu_{n-1} + (1-p_{l})(1-\mu_{n-1})}. \end{align*}\]
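
As a final check (the values of \(p_h\), \(p_l\) and \(\mu_0\) below are illustrative assumptions, not part of the exercise), the sketch confirms numerically that the one-step recursion reproduces the closed-form posterior after \(n\) failures in a row.

```python
# Illustrative assumptions for the two success probabilities and the prior.
p_h, p_l, mu0 = 0.7, 0.3, 0.5

def posterior_after_failures(n):
    """Closed-form posterior P(p = p_h | n failures in a row)."""
    num = (1 - p_h) ** n * mu0
    return num / (num + (1 - p_l) ** n * (1 - mu0))

def recursive_update(mu_prev):
    """One-step Bayes update of the belief after observing a single failure."""
    num = (1 - p_h) * mu_prev
    return num / (num + (1 - p_l) * (1 - mu_prev))

# The recursion and the closed form give the same belief for every n.
mu = mu0
for n in range(1, 6):
    mu = recursive_update(mu)
    print(n, mu, posterior_after_failures(n))
```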