Bayesian Inference with a Continuous Prior Distribution
I would like to take a few shortcuts relative to Romke's excellent solution while making use of the convenient features of Mathematica's statistical framework.
Prior
$\theta|I \sim Beta(6,14)$
We can interpret this (conjugate) prior as having seen (or believing in) 18 Bernoulli trials with 5 observed successes and 13 failures before running the actual experiment (part of our background information $I$), and code it as:
priorDist = With[ { a = 6, b = 14 }, BetaDistribution[a, b] ];
prior = Function[ θ, PDF[ priorDist ] @ θ ];
Plot[ prior[θ], {θ, 0, 1}, Filling -> Axis, Axes-> {True,False} ]
Likelihood
$y|\theta, I \sim B(n, \theta)$
Given a series of $n = 1830$ Bernoulli trials with $y = 420$ successes, the binomial PDF (viewed as a function of $\theta$) serves as the likelihood function:
likelihood = With[
  { n = 1830, y = 420 },
  Function[ θ, Likelihood[ BinomialDistribution[n, θ], {y} ] ]
];
Plot[likelihood[θ], {θ, 0, 1}, Filling -> Axis, Axes -> {True, False}, PlotRange -> All]
As Romke pointed out, this is a function of $\theta$ but not a probability density in $\theta$, so it does not integrate to 1:
NIntegrate[ likelihood[θ], {θ, 0, 1} ]
0.00054615
Posterior
$p(\theta|y,I) \propto p(y|\theta,I) \cdot p(\theta|I)$
Bayes' rule tells us that the PDF of the posterior distribution equals the product of the likelihood function and the prior PDF up to a normalizing constant (often called the evidence), which Romke used in his explicit answer:
$p(y|I) = \int_0^1 p(y|\theta,I)\, p(\theta|I)\, d\theta$
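For reference, this constant can be evaluated numerically from the prior and likelihood defined above (a quick sketch; its value is not needed in what follows):
(* numerical value of the evidence p(y|I) under the Beta(6,14) prior *)
NIntegrate[ likelihood[θ] prior[θ], {θ, 0, 1} ]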
Being a bit lazier, we can make use of the fact that ProbabilityDistribution has the option Method -> "Normalize", which takes care of scaling the PDF so that it meets the requirements of a probability distribution:
posteriorDistribution = ProbabilityDistribution[
prior[θ] likelihood[θ],
{θ, 0, 1},
Method -> "Normalize"
];
Plot[ PDF[ posteriorDistribution ][θ], {θ,0,1},
Filling->Axis, Axes->{True,False}, PlotRange->All]
With so much additional data, the posterior is (of course) very close to the normalized likelihood. We can immediately obtain some summary statistics:
Through[{Mean, StandardDeviation}@posteriorDistribution] // N
{0.23027, 0.00978554}
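Since posteriorDistribution is a genuine distribution object, posterior probability statements also work directly. A small sketch (the interval bounds are purely illustrative):
(* posterior probability that θ falls in an illustrative interval *)
NProbability[ 0.21 < θ < 0.25, θ \[Distributed] posteriorDistribution ]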
RandomVariate also works properly, so we can sample from the posterior:
RandomVariate[ posteriorDistribution, 10 ]
{0.222989, 0.227517, 0.218535, 0.228404, 0.226061, 0.235919, 0.234529, 0.230528, 0.22929, 0.247735}
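As a quick usage sketch with the same posteriorDistribution, a larger sample can be compared against the exact posterior density:
(* histogram of posterior draws overlaid with the exact PDF *)
Show[
  Histogram[ RandomVariate[posteriorDistribution, 10^4], Automatic, "PDF" ],
  Plot[ PDF[posteriorDistribution][θ], {θ, 0.19, 0.27} ]
]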
Since the prior and the posterior distributions here have the same functional form (conjugate prior), we could have immediately used the fact that:
$\theta|y,I \sim Beta(a + y, b + n - y)$
Equal[
PDF[ BetaDistribution[6 + 420, 14 + 1830 - 420], θ ],
PDF[posteriorDistribution, θ]
] // Reduce
True
You could also look at likelihood[θ] prior[θ] // Simplify to see that the posterior indeed has the form:
$\text{constant} \times \theta^{\alpha} (1 - \theta)^{\beta}$
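A short check along these lines, using the prior and likelihood defined above: dividing the unnormalized product by the conjugate Beta PDF should leave an expression that is free of θ, i.e. just the normalizing constant.
(* should return True: the θ-powers cancel, leaving only a constant factor *)
FreeQ[
  Simplify[ likelihood[θ] prior[θ] / PDF[ BetaDistribution[426, 1424], θ ],
    Assumptions -> 0 < θ < 1 ],
  θ
]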
Update
Note that in general the normalizing constant (i.e. the evidence) can be obtained as an expectation:
$p(y|I) = \int p(y,\theta|I)d\theta = \int p(y|\theta,I) p(\theta|I)d\theta = \mathbb{E}[ p(y|\theta,I) ] = \mathbb{E}[\mathcal{L}(\theta)]$
posteriorPDF = With[
{ evidence = Expectation[ likelihood[θ], θ \[Distributed] priorDist ] },
Function[
θ,
prior[θ] likelihood[θ] / evidence
]
];
posteriorPDF[θ] == PDF[BetaDistribution[6 + 420, 14 + 1830 - 420], θ] // Reduce
True
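As a final sanity check (a sketch using posteriorPDF as defined above), the evidence-normalized density should now integrate to 1:
(* the normalized posterior PDF integrates to 1 (up to numerical error) *)
NIntegrate[ posteriorPDF[θ], {θ, 0, 1} ]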
The posterior distribution for p is a Beta(426, 1424) distribution. You can graph it as you did yourself.
Here is how you can compute it in Mathematica.
In[1]:= bin = Binomial[n, x]*p^x*(1 - p)^(n - x)
Out[1]= (1 - p)^(n - x) p^x Binomial[n, x]
In[2]:= prior = p^(a - 1)*(1 - p)^(b - 1)/Beta[a, b]
Out[2]= ((1 - p)^(-1 + b) p^(-1 + a))/Beta[a, b]
In[3]:= den =
Integrate[bin*prior, {p, 0, 1},
Assumptions -> {a > 0, b > 0, x >= 0, n >= x}]
Out[3]= (Binomial[n, x] Gamma[b + n - x] Gamma[a + x])/(
Beta[a, b] Gamma[a + b + n])
In[4]:= bin*prior/den // FullSimplify
Out[4]= ((1 - p)^(-1 + b + n - x) p^(-1 + a + x) Gamma[a + b + n])/(
Gamma[b + n - x] Gamma[a + x])
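As a follow-up sketch, the expression in Out[4] can be recognized as the PDF of BetaDistribution[a + x, b + n - x]; FullSimplify should be able to confirm this under the same assumptions (plus 0 < p < 1):
(* hedged check: the posterior kernel is the Beta(a + x, b + n - x) density *)
FullSimplify[
 bin*prior/den == PDF[BetaDistribution[a + x, b + n - x], p],
 Assumptions -> {a > 0, b > 0, x >= 0, n >= x, 0 < p < 1}]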
I would like to extend asim's solution, which is a function, to a probability distribution.
ClearAll[evidence, likelihood, prior, posterior];
likelihood = p^x*(1 - p)^(n - x) (* the likelihood is always unnormalized *)
prior = p^(a - 1)*(1 - p)^(b - 1)/Beta[a, b] (* the prior is always normalized *)
evidence = Integrate[likelihood*prior, {p, 0, 1},
 Assumptions -> {a > 0, b > 0, x >= 0, n >= x}] (* normalizing constant of the posterior *)
posterior = prior*likelihood/evidence // FullSimplify
(* Out[]= the four outputs: likelihood, prior, evidence, and posterior, in turn *)
(1 - p)^(n - x) p^x
((1 - p)^(-1 + b) p^(-1 + a))/Beta[a, b]
(Gamma[b + n - x] Gamma[a + x])/(Beta[a, b] Gamma[a + b + n])
((1 - p)^(-1 + b + n - x) p^(-1 + a + x) Gamma[a + b + n])/(Gamma[b + n - x] Gamma[a + x])
Nothing new so far. Now for the probability distribution:
ClearAll[posteriorDistribution];
posteriorDistribution[a_, b_, x_, n_] = ProbabilityDistribution[posterior, {p, 0, 1},
Assumptions -> {a > 0, b > 0, x >= 0, n >= x}];
PDF[posteriorDistribution[6, 14, 420, 1830], 0.21] (* test at p=0.21 *)
4.6369
Check this against asim's solution in BetaDistribution[] form:
PDF[posteriorDistribution[6, 14, 420, 1830], 0.21] == PDF[BetaDistribution[426, 1424], 0.21]
True
This probability distribution can be plotted by:
Plot[PDF[posteriorDistribution[6, 14, 420, 1830], p], {p, 0, 1}, PlotRange -> All]
Several statistical properties work fine:
Mean[posteriorDistribution[6, 14, 420, 1830]] // N
StandardDeviation[posteriorDistribution[6, 14, 420, 1830]] // N
0.23027
0.00978554
But RandomVariate[] does not:
RandomVariate[posteriorDistribution[6, 14, 420, 1830], 5]
RandomVariate[
ProbabilityDistribution[
25100175092366680639705706981674366907407305357564535864428789165880\
7774460815201505720742948687064397263210935672027225834571883138571552\
9435073323698076733502611129508985779654636172308803925192878804616791\
4163296382700546574839309987790038929990089862577986787654011552037119\
0043901814728663569787527433974843693348538658624431557074602596292870\
1098499374967102449446582880605618133093761170547891966862458477624898\
28252384878666048 (1 - \[FormalX])^1423 \[FormalX]^425, {\[FormalX],
0, 1}, Assumptions -> {True, True, True, True}], 5]
Does anyone here know what is going wrong?
Of course, the BetaDistribution[] works fine:
RandomVariate[BetaDistribution[426, 1424], 5]
{0.224794, 0.246947, 0.225743, 0.241723, 0.219218}
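A possible workaround sketch (an assumption on my part, not a verified fix): insert the numeric values before constructing the distribution and let ProbabilityDistribution normalize the kernel itself, as in the Method -> "Normalize" approach of the first answer above, where RandomVariate did produce samples:
(* workaround sketch: numeric, unnormalized Beta kernel with a - 1 + x = 425 and b - 1 + n - x = 1423 *)
posteriorNumeric = ProbabilityDistribution[
   p^(6 - 1 + 420) (1 - p)^(14 - 1 + 1830 - 420),
   {p, 0, 1},
   Method -> "Normalize"
  ];
RandomVariate[posteriorNumeric, 5]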