Bayesian Inference with a Continuous Prior Distribution

I would like to take some slight shortcuts relative to Romke's excellent solution, making use of the convenient features of Mathematica's statistical framework.

Prior

$\theta|I \sim Beta(6,14)$

We can interpret this (conjugate) prior as having seen (or believing in) 18 Bernoulli trials with 5 successes and 13 failures before running the actual experiment (part of our background information $I$), and code it as:

priorDist = With[ { a = 6, b = 14 }, BetaDistribution[a, b] ];
prior = Function[ θ, PDF[ priorDist ] @ θ  ];

Plot[ prior[θ], {θ, 0, 1}, Filling -> Axis, Axes-> {True,False} ]

(plot of the prior PDF)

Likelihood

$y|\theta, I \sim B(n, \theta)$

Given a series of $n = 1830$ Bernoulli trials with $y = 420$ successes, we take the binomial PMF as the likelihood function:

likelihood = With[
   {
     n = 1830,
     y = 420
   },
   Function[θ, Likelihood[ BinomialDistribution[n, θ], {y} ] ]
];

Plot[likelihood[θ], {θ, 0, 1}, Filling -> Axis, Axes -> {True, False}, PlotRange -> All]

(plot of the likelihood function)

As Romke pointed out, this is a function of $\theta$ but not a probability density in $\theta$, so it does not integrate to 1:

NIntegrate[ likelihood[θ], {θ, 0, 1} ]

0.00054615

Posterior

$p(\theta|y,I) \propto p(y|\theta,I) \cdot p(\theta|I)$

Bayes' rule tells us that the PDF of the posterior distribution equals the product of the likelihood function and the prior PDF, up to a normalizing constant (often called the evidence), which Romke used in his explicit answer:

$p(y|I) = \int_0^1 p(y|\theta,I)\, p(\theta|I)\, d\theta$
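
As a quick sanity check (a minimal sketch reusing the prior and likelihood defined above; output omitted), the evidence can be approximated numerically:

NIntegrate[ likelihood[θ] prior[θ], {θ, 0, 1} ]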

Being a bit lazier, we can make use of the fact that ProbabilityDistribution has an option Method -> "Normalize", which takes care of rescaling the PDF so that it meets the requirements of a probability distribution:

posteriorDistribution = ProbabilityDistribution[
    prior[θ] likelihood[θ],
    {θ, 0, 1},
    Method -> "Normalize"
];

Plot[ PDF[ posteriorDistribution ][θ], {θ,0,1},
  Filling->Axis, Axes->{True,False}, PlotRange->All]

(plot of the posterior PDF)

With this much additional data the posterior is, of course, dominated by the likelihood. We can immediately get some summary statistics:

Through[{Mean, StandardDeviation}@posteriorDistribution] // N

{0.23027, 0.00978554}
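
Other summaries work the same way. For example (a small sketch, output omitted; inverting the CDF numerically may take a moment), an equal-tailed 95% credible interval:

Quantile[ posteriorDistribution, {0.025, 0.975} ]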

RandomVariate will also work properly, so that we can sample from the posterior:

RandomVariate[ posteriorDistribution, 10 ]

{0.222989, 0.227517, 0.218535, 0.228404, 0.226061, 0.235919, 0.234529, 0.230528, 0.22929, 0.247735}

Since the prior and the posterior distributions here have the same functional form (conjugate prior), we may have immediately used the fact that:

$\theta|y,I \sim Beta(a + y, b + n - y)$

Equal[
  PDF[ BetaDistribution[6 + 420, 14 + 1830 - 420], θ ],   
  PDF[posteriorDistribution, θ]
] // Reduce

True

You could also look at likelihood[θ] prior[θ]//Simplify to see that the posterior does have the form:

$\text{constant} \times \theta^{\alpha} (1 - \theta)^{\beta}$
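
For example (a minimal sketch; with the numbers used here the product should come out proportional to $\theta^{425} (1 - \theta)^{1423}$):

likelihood[θ] prior[θ] // Simplify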

Update

Note that in general the normalizing constant (i.e. the evidence) can be obtained as an expectation with respect to the prior:

$p(y|I) = \int p(y,\theta|I)d\theta = \int p(y|\theta,I) p(\theta|I)d\theta = \mathbb{E}[ p(y|\theta,I) ] = \mathbb{E}[\mathcal{L}(\theta)]$

posteriorPDF = With[
   { evidence = Expectation[ likelihood[θ], θ \[Distributed] priorDist ] },
   Function[
     θ,
     prior[θ] likelihood[θ] / evidence
   ]
];

posteriorPDF[θ] == PDF[BetaDistribution[6 + 420, 14 + 1830 - 420], θ] // Reduce

True


The posterior distribution for p is a Beta(426, 1424). You can graph it as you did yourself.
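
For example (just one possible way to draw it):

Plot[ PDF[BetaDistribution[426, 1424], p], {p, 0, 1}, PlotRange -> All ]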

Here is how you can compute it in Mathematica.

In[1]:= bin = Binomial[n, x]*p^x*(1 - p)^(n - x)

Out[1]= (1 - p)^(n - x) p^x Binomial[n, x]

In[2]:= prior = p^(a - 1)*(1 - p)^(b - 1)/Beta[a, b]

Out[2]= ((1 - p)^(-1 + b) p^(-1 + a))/Beta[a, b]

In[3]:= den = 
 Integrate[bin*prior, {p, 0, 1}, 
  Assumptions -> {a > 0, b > 0, x >= 0, n >= x}]

Out[3]= (Binomial[n, x] Gamma[b + n - x] Gamma[a + x])/(
Beta[a, b] Gamma[a + b + n])

In[4]:= bin*prior/den // FullSimplify

Out[4]= ((1 - p)^(-1 + b + n - x) p^(-1 + a + x) Gamma[a + b + n])/(
Gamma[b + n - x] Gamma[a + x])

I would like to extend asim's solution, which is a function, to a probability distribution.

ClearAll[evidence, likelihood, prior, posterior];
likelihood = p^x*(1 - p)^(n - x) (* always unnormalized *)
prior = p^(a - 1)*(1 - p)^(b - 1)/Beta[a, b] (* always normalized *)
evidence = Integrate[likelihood*prior, {p, 0, 1},
   Assumptions -> {a > 0, b > 0, x >= 0, n >= x}] (* normalizing constant of the posterior *)
posterior = prior*likelihood/evidence // FullSimplify

(* Out[]= *)
(1 - p)^(n - x) p^x
((1 - p)^(-1 + b) p^(-1 + a))/Beta[a, b]
(Gamma[b + n - x] Gamma[a + x])/(Beta[a, b] Gamma[a + b + n])
((1 - p)^(-1 + b + n - x) p^(-1 + a + x) Gamma[a + b + n])/(Gamma[b + n - x] Gamma[a + x])
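
As an optional sanity check (a small sketch under the same parameter assumptions), the symbolic posterior can be compared with the corresponding Beta PDF; since the two expressions are mathematically identical, this should reduce to True:

FullSimplify[
 posterior == PDF[BetaDistribution[a + x, b + n - x], p],
 Assumptions -> {a > 0, b > 0, x >= 0, n >= x, 0 < p < 1}]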

Nothing new so far. Now for the probability distribution:

ClearAll[posteriorDistribution];
posteriorDistribution[a_, b_, x_, n_] = ProbabilityDistribution[posterior, {p, 0, 1}, 
Assumptions -> {a > 0, b > 0, x >= 0, n >= x}];
PDF[posteriorDistribution[6, 14, 420, 1830], 0.21] (* test at p=0.21 *)

4.6369

Check this against asim's solution in its BetaDistribution[] form:

PDF[posteriorDistribution[6, 14, 420, 1830], 0.21] == PDF[BetaDistribution[426, 1424], 0.21]

True

This probability distribution can be plotted by:

Plot[PDF[posteriorDistribution[6, 14, 420, 1830], p], {p, 0, 1}, PlotRange -> All]

Several statistical properties work fine:

Mean[posteriorDistribution[6, 14, 420, 1830]] // N
StandardDeviation[posteriorDistribution[6, 14, 420, 1830]] // N

0.23027
0.00978554

But RandomVariate[] does not:

RandomVariate[posteriorDistribution[6, 14, 420, 1830], 5]

RandomVariate[
 ProbabilityDistribution[
 25100175092366680639705706981674366907407305357564535864428789165880\
7774460815201505720742948687064397263210935672027225834571883138571552\
9435073323698076733502611129508985779654636172308803925192878804616791\
4163296382700546574839309987790038929990089862577986787654011552037119\
0043901814728663569787527433974843693348538658624431557074602596292870\
1098499374967102449446582880605618133093761170547891966862458477624898\
28252384878666048 (1 - \[FormalX])^1423 \[FormalX]^425, {\[FormalX], 
0, 1}, Assumptions -> {True, True, True, True}], 5]

Does anyone here know what is going wrong?

Of course, the BetaDistribution[] works fine:

RandomVariate[BetaDistribution[426, 1424], 5]

{0.224794, 0.246947, 0.225743, 0.241723, 0.219218}