```
%matplotlib inline
import triangle
import emcee
import matplotlib.pyplot as plt
import numpy as np
import seaborn
plt.rcParams['axes.labelsize'] = 22
```

# The relation between priors and the evidence¶

I wanted to understand a bit more about the effect of priors on so-called evidence. To be specific let's imagine we have a single data point $d=5$ and take a fixed likelihood function of a normal:

$$\mathcal{L}(d=5| \mu, \sigma) = N(5; \mu, \sigma) $$Now the *evidence* is given by marginalising the likelihood and prior ($\pi$) over the parameters $\mu$ and $\sigma$:

To simplify things, we will fix $\sigma=1$. Then all that remains is to specify the *prior* distribution on $\mu$.

## Uniform prior¶

The typical case is to set $\pi(\mu)$ to be uniform over some range $[a, b]$ e.g.

$$\pi(\mu) = \left\{ \begin{array}{cc} 1/(b-a) & \textrm{if } a < \mu < b \\ 0 & \textrm{otherwise} \end{array}\right. $$Inserting a uniform prior along with our normal likelihood (with $\sigma=1$) then calculating the evidence consists of evaluating:

$$ Z = \frac{1}{b-a}\int_{a}^{b} N(5; \mu, 1) d\mu $$Now provided the region $[a, b]$ contains most of the probability (e.g. the normal likelihood has decayed to near-zero at $a$ and $b$), then the integral is approximately unity. This leaves

$$ Z = \frac{1}{b-a} $$## Checking the result¶

Now we introduce a helper function `run`

with which we can calculate the posterior and evidence. We will be using emcee for the MCMC simulations and in particular the `ptsampler`

so that we can use thermodynamic integration to calculate the evidence. The function `run`

can take both uniform and normal priors which we will discuss later. For now ignore the following cell...

```
def run(prior_type, x1, x2, plot=True, betamin=-3, ntemps=30):
x = np.array([5.0])
if prior_type == "norm":
def logp(mu):
return -(mu - x1)**2 / (2*x2**2) - np.log(x2*np.sqrt(2*np.pi))
elif prior_type == "unif":
def logp(mu):
if x1 < mu and mu < x2:
return -np.log(x2 - x1)
else:
return -np.inf
def logl(params):
sigma = 1
mu = params
single = -(x-mu)**2 / (2*sigma**2) - np.log(sigma*np.sqrt(2*np.pi))
return np.sum(single)
nwalkers = 50
ndim = 1
betas = np.logspace(0, betamin, ntemps)
sampler = emcee.PTSampler(ntemps, nwalkers, ndim, logl=logl, logp=logp,
betas=betas)
p0 = np.random.uniform(4, 6, size=(ntemps, nwalkers, ndim))
out = sampler.run_mcmc(p0, 200)
ln_evidence, ln_error = sampler.thermodynamic_integration_log_evidence()
evidence = np.exp(ln_evidence)
error = np.exp(ln_evidence) * ln_error
if plot:
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(15, 6))
ax1.plot(sampler.chain[0, :, :, 0].T, color="k", lw=0.1)
ax1.set_ylabel("Walker positions for $\mu$")
ax1.set_xlabel("Simulation step")
nburn=100
samples = sampler.chain[0, :, nburn:, :].reshape((-1, ndim))
xplot = np.linspace(0, 10, 100)
prior = [np.exp(logp(xi)) for xi in xplot]
ax2.plot(xplot, prior, label="prior")
ax2.hist(samples, bins=50, histtype="step", normed=True,
label="posterior", color="k", linewidth=2)
ax2.axvline(x[0], label="data point", color="r")
ax2.legend(frameon=False, loc="best")
ax2.set_xlabel("$\mu$ posterior")
mean_lnlikes = np.mean(np.mean(sampler.lnlikelihood, axis=1), axis=1)
betas = sampler.betas
ax3.semilogx(betas, mean_lnlikes, "-o")
ax3.set_xlabel(r"$\beta$")
ax3.set_ylabel("Mean likelihood")
plt.tight_layout()
plt.show()
print "Z = {:1.4f} +/- {:1.4f}".format(
evidence, error)
else:
return evidence, error
```

### Example¶

Okay now let's fix our prior volume as $[a, b] = [0, 10]$ so we should find $Z = 0.1$. We will run the simulation with such a uniform prior and let's have a look at the results:

```
run('unif', 0, 10)
```

### A little explanation:¶

`run`

produces 3 plots: the first is the direct results of the simulations, the second is the estimated posterior along with the data point and prior, finaly there is a plot of the mean likelihood used to estimate the evidence.

In essence you can ignore the first and last plots and focus on the posterior and the evidence quoted at the bottom.

### Back to buisness¶

Okay so as expected we get $Z = 0.1$ and a posterior centered on the data point. In addition the posterior is zero at 0 and 10 so we should expect agreement with the analytic result. Excellent!

### Breaking the result¶

Just for fun, let's set $[a, b]$ = [4, 6] so, according to the analytic result, we should get $Z = 1/2$. But, because the likelihood is still finite at the edges of the prior volume this is not the case.

```
run('unif', 4, 6)
```

## Gaussian Prior¶

Okay now that we are confident that `run`

does what it should (computes Z), let's use it with a prior that we can't easily do analytically. Note that the last two arguments to `run`

are the `loc`

and `scale`

of the normal in that order:

### Informative Gaussian prior¶

Let's try an informative prior: a Gaussian centered on 5 with a small scale parameter

```
run('norm', 5, 1)
```

### Non-informative Gaussian prior¶

A typical way to make Gaussians non-informative is to set a large scale parameter (where large is in relation to the size of the mean).

```
run('norm', 0, 50, betamin=-7)
```

### Mis-informative Gaussian prior¶

Now let's consider a prior which is informative, but does not support the data. To be specific we will use a normal prior for $\mu$ with mean at 2.5 and a relatively small scale factor:

```
run('norm', 2.5, 1.0)
```

The posterior sits half way between the data point and the prior. The evidence is substantially weakened (by an order of magnitude) when compared to the informative Gaussian.

### Plotting the evidence as a function of the mean¶

Let's fix the prior variance on $\mu$ at 1 and vary the mean then look at how this changes the evidence

```
mu_mean = np.linspace(-5, 15, 50)
Z = []
Zerr = []
for mm in mu_mean:
evi, err = run('norm', mm, 1.0, plot=False)
Z.append(evi)
Zerr.append(err)
```

```
from scipy.stats import norm
fig, ax = plt.subplots()
ax.plot(mu_mean, Z, "-")
plt.show()
```