How to compute conditional probabilities?
Here's a brute-force way: use the definition of the conditional density for continuous distributions:
dist12 = BinormalDistribution[{μ1, μ2}, {σ1, σ2}, ρ]
dist2 = MarginalDistribution[dist12, 2]
conditionalDensity = PDF[dist12, {x1, x2}]/PDF[dist2, x2]
$$\frac{\exp \left(\frac{(x_2-\mu_2)^2}{2 \sigma_2^2}-\frac{\frac{(x_1-\mu_1)^2}{\sigma_1^2}-\frac{2 \rho (x_1-\mu_1) (x_2-\mu_2)}{\sigma_1 \sigma_2}+\frac{(x_2-\mu_2)^2}{\sigma_2^2}}{2 \left(1-\rho^2\right)}\right)}{\sqrt{2 \pi }\, \sqrt{1-\rho^2}\, \sigma_1}$$
Now, in this case, will Mathematica automatically recognize this as a normal distribution? Maybe not:
d1 = ProbabilityDistribution[conditionalDensity, {x1, -Infinity, Infinity}]
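In fact, the conditional distribution of a bivariate normal is itself normal, with mean $\mu_1 + \rho\,\frac{\sigma_1}{\sigma_2}(x_2-\mu_2)$ and standard deviation $\sigma_1\sqrt{1-\rho^2}$. Here is a sketch (assuming conditionalDensity from above is still defined) to check the brute-force density against this closed form:

```mathematica
(* known closed form: X1 | X2 == x2 is normally distributed *)
knownDist = NormalDistribution[μ1 + ρ σ1/σ2 (x2 - μ2), σ1 Sqrt[1 - ρ^2]];

(* the difference should simplify to 0 under suitable assumptions *)
FullSimplify[conditionalDensity - PDF[knownDist, x1],
 Assumptions -> {σ1 > 0, σ2 > 0, -1 < ρ < 1}]
```

So d1 really is just a normal distribution, even if Mathematica does not report it as one.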
Update: An alternative approach
Here's another way to get the conditional density. First determine the conditional distribution function and then differentiate to get the conditional probability density function:
cdf = Probability[y1 <= x1 \[Conditioned] y2 == x2,
{y1, y2} \[Distributed] BinormalDistribution[{μ1, μ2}, {σ1, σ2}, ρ]]
pdf = D[cdf, x1]
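As a hypothetical sanity check (the parameter values here are chosen purely for illustration): with standard marginals the conditional mean at x2 == 0 is 0, so the conditional CDF at x1 == 0 should be 1/2 for any ρ:

```mathematica
(* P(X1 <= 0 | X2 == 0) with μ1 = μ2 = 0, σ1 = σ2 = 1; should reduce to 1/2, independent of ρ *)
Simplify[cdf /. {μ1 -> 0, μ2 -> 0, σ1 -> 1, σ2 -> 1, x1 -> 0, x2 -> 0},
 Assumptions -> -1 < ρ < 1]
```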
Unfortunately, David G. Stork has deleted his answer; I would nevertheless pick it up, since in my opinion it can be "cured" rather easily.
I would believe that the following holds:
\begin{align} p(x_1 \mid x_2) = \frac{p(x_1, x_2)}{p(x_2)} \propto p(x_1, x_2), \end{align}
since the denominator $p(x_2)$ does not depend on $x_1$. This would mean that, as Dr. Stork wrote, the joint probability density $p(x_1, x_2)$ (here the bivariate normal density), evaluated as a function of $x_1$ for a fixed value of $x_2$, equals the conditional probability density $p(x_1 \mid x_2)$ up to a normalizing constant.
We can see this by adding the option Method -> "Normalize" to Dr. Stork's former answer:
condDist = ProbabilityDistribution[
  PDF[BinormalDistribution[{μ1, μ2}, {σ1, σ2}, ρ], {x1, x2}],
  {x1, -∞, +∞},
  Method -> "Normalize"
  ]
(Note that this generates some warnings, which likely call for additional assumptions via the option Assumptions, but this need not worry us here too much.)
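Here is a sketch of how one might silence those warnings; the particular assumptions below are my guess at what ProbabilityDistribution asks for:

```mathematica
condDist = ProbabilityDistribution[
  PDF[BinormalDistribution[{μ1, μ2}, {σ1, σ2}, ρ], {x1, x2}],
  {x1, -∞, +∞},
  Method -> "Normalize",
  Assumptions -> {σ1 > 0, σ2 > 0, -1 < ρ < 1, Element[x2, Reals]}
  ]
```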
Comparing this expression with the answer obtained by @JimBaldwin shows that it is completely equivalent up to a factor that contains neither $x_1$ nor $x_2$ and is therefore a mere constant.
We can use this proportionality, for example, when factoring a complicated joint probability distribution (for example a Bayesian network) into a product of conditional and marginal probabilities. There we may use the joint probability densities in the way indicated, without the costly normalization, since we can normalize just once at the end of our inference.
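One way to verify the claimed equivalence directly (a sketch, assuming condDist and the conditionalDensity from @JimBaldwin's answer are both defined as above):

```mathematica
(* the ratio of the two conditional densities should contain neither x1 nor x2;
   since both are normalized in x1, it should in fact simplify to 1 *)
FullSimplify[PDF[condDist, x1]/conditionalDensity,
 Assumptions -> {σ1 > 0, σ2 > 0, -1 < ρ < 1, Element[x1 | x2, Reals]}]
```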