Applying law of total probability to conditional probability

Can someone generalize it, to make my understanding clearer? Say, for $n$ events?

If $(B_k)_{k=1}^n$ is a sequence of $n$ events that partition the sample space (or if at least $(B_k\cap A_1)_{k=1}^n$ partitions $A_1$), then $\mathsf P(A_2\mid A_1) = \sum_{k=1}^n \mathsf P(A_2\mid A_1\cap B_k)\,\mathsf P(B_k\mid A_1)$
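As a sanity check, the $n$-event formula can be verified exactly on a small finite sample space. The sketch below is only an illustration: the two-dice setup and the particular choices of `A1`, `A2` and the partition `B` are assumptions made here, not part of the question.

```python
from fractions import Fraction

# Hypothetical example: two fair six-sided dice, all 36 outcomes equally likely.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    """P(event) under the uniform measure on omega."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond(event, given):
    """P(event | given); assumes P(given) > 0."""
    return prob(lambda w: event(w) and given(w)) / prob(given)

A1 = lambda w: w[0] + w[1] >= 7   # the sum is at least 7
A2 = lambda w: w[1] % 2 == 0      # the second die is even
# B_k: the first die shows k, for k = 1..6; these events partition omega.
B = [lambda w, k=k: w[0] == k for k in range(1, 7)]

lhs = cond(A2, A1)
rhs = sum(
    cond(A2, lambda w, Bk=Bk: A1(w) and Bk(w)) * cond(Bk, A1)
    for Bk in B
    # Skip cells with P(A1 and B_k) = 0, where the conditional is undefined.
    if prob(lambda w, Bk=Bk: A1(w) and Bk(w)) > 0
)
assert lhs == rhs  # both sides are exact fractions
```

Using `Fraction` keeps the comparison exact, so the equality is not blurred by floating-point rounding.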

Also, in $P(A_2|A_1)=P(A_2|\color{red}{AA_1})P(\color{red}{A|A_1})+P(A_2|\color{magenta}{A^cA_1})P(\color{magenta}{A^c|A_1})$, I feel the red-colored parts should be the same and the pink-colored parts should be the same, as in the simple form of the law of total probability.

They are not the same in the case of the simple form. So why should they be?

If $\Omega$ is the entire sample space, then:

$$\begin{align}\mathsf P(A_2) ~ & = \mathsf P(A_2\mid \Omega) \\[1ex] & = \mathsf P(A_2\mid \color{red}{A}, \Omega)\,\mathsf P(\color{red}{A}\mid \Omega)+\mathsf P(A_2\mid \color{magenta}{A^c}, \Omega)\,\mathsf P(\color{magenta}{A^c}\mid \Omega) \\[1ex] & = \mathsf P(A_2\mid \color{red}{A})\,\mathsf P(\color{red}{A})+\mathsf P(A_2\mid \color{magenta}{A^c})\,\mathsf P(\color{magenta}{A^c}) \end{align}$$

I felt it should be $P(A_2|\color{red}{(A_1|A)})P(\color{red}{A_1|A})+P(A_2|\color{magenta}{(A_1|A^c)})P(\color{magenta}{A_1|A^c})$. Am I absolutely stupid here?

:) Well, I would not say absolutely. But seriously, it is a rather common misunderstanding.

The conditioning bar is not a set operation. It separates the event from the condition that the probability function is being measured over. There can be only one bar inside any probability function; they do not nest.
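One way to see why nesting is never needed (a sketch, where the name $\mathsf Q$ is introduced here purely for illustration): conditioning on $A_1$ gives a new probability function $\mathsf Q(\cdot)=\mathsf P(\cdot\mid A_1)$. Conditioning $\mathsf Q$ on a further event $A$, assuming $\mathsf P(A\cap A_1)>0$, simply intersects the two conditions:

$$\mathsf Q(A_2\mid A) = \dfrac{\mathsf Q(A_2\cap A)}{\mathsf Q(A)} = \dfrac{\mathsf P(A_2\cap A\cap A_1)\,/\,\mathsf P(A_1)}{\mathsf P(A\cap A_1)\,/\,\mathsf P(A_1)} = \mathsf P(A_2\mid A\cap A_1)$$

So "conditioning on $A$, given $A_1$" is written with a single bar and the joint condition $A\cap A_1$, not as $(A_1\mid A)$.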

For a moment I felt it is related to: $P(E_1E_2E_3...E_n)=P(E_1)P(E_2|E_1)P(E_3|E_1E_2)...P(E_n|E_1...E_{n-1})$. Is it so?

Yes, this is so. Specifically:

$$\begin{align}\mathsf P(A_2,A,A_1) & = \mathsf P(A_2\mid A,A_1)\,\mathsf P(A\mid A_1)\,\mathsf P(A_1) \\[1ex] \mathsf P(A_2,A^\mathsf c,A_1) & = \mathsf P(A_2\mid A^\mathsf c,A_1)\,\mathsf P(A^\mathsf c\mid A_1)\,\mathsf P(A_1) \end{align}$$
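These two chain-rule factorizations can be checked numerically, and adding them and dividing by $\mathsf P(A_1)$ recovers $\mathsf P(A_2\mid A_1)$. The following is a minimal sketch under an assumed setup (three fair coin flips, with `A1`, `A` and `A2` chosen here only for illustration):

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: three fair coin flips, 8 equally likely outcomes.
omega = list(product("HT", repeat=3))

def prob(event):
    """P(event) under the uniform measure on omega."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond(event, given):
    """P(event | given); assumes P(given) > 0."""
    return prob(lambda w: event(w) and given(w)) / prob(given)

A1 = lambda w: w.count("H") >= 1   # at least one head
A = lambda w: w[0] == "H"          # the first flip is a head
Ac = lambda w: not A(w)            # complement of A
A2 = lambda w: w.count("H") >= 2   # at least two heads

# Joint probabilities computed directly by counting outcomes.
joint = prob(lambda w: A2(w) and A(w) and A1(w))
joint_c = prob(lambda w: A2(w) and Ac(w) and A1(w))

# The same quantities via the chain rule.
chain = cond(A2, lambda w: A(w) and A1(w)) * cond(A, A1) * prob(A1)
chain_c = cond(A2, lambda w: Ac(w) and A1(w)) * cond(Ac, A1) * prob(A1)

assert joint == chain and joint_c == chain_c
# Summing the two pieces and dividing by P(A1) gives P(A2 | A1).
assert (chain + chain_c) / prob(A1) == cond(A2, A1)
```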

$$\begin{align}\mathsf P(A_2\mid A_1) ~ & = \mathsf P((A\cup A^\mathsf c){\cap} A_2\mid A_1) && \text{Union of Complements} \\[1ex] & = \mathsf P((A{\cap}A_2)\cup(A^\mathsf c{\cap}A_2)\mid A_1) && \text{Distributive Law} \\[1ex] & = \mathsf P(A{\cap}A_2\mid A_1) + \mathsf P(A^\mathsf c{\cap}A_2\mid A_1) && \text{Additive Rule for Union of Exclusive Events} \\[1ex] & = \dfrac{\mathsf P(A{\cap}A_1{\cap}A_2)+\mathsf P(A^\mathsf c{\cap}A_1{\cap}A_2)}{\mathsf P(A_1)} && \text{by Definition} \\[1ex] & = \dfrac{\mathsf P(A_2\mid A{\cap}A_1)\,\mathsf P(A{\cap}A_1)+\mathsf P(A_2\mid A^\mathsf c{\cap}A_1)\,\mathsf P(A^\mathsf c{\cap}A_1)}{\mathsf P(A_1)} && \text{by Definition} \\[1ex] & = {\mathsf P(A_2\mid A{\cap}A_1)\,\mathsf P(A\mid A_1)+\mathsf P(A_2\mid A^\mathsf c{\cap}A_1)\,\mathsf P(A^\mathsf c\mid A_1)} && \text{by Definition of Conditional Probability} \end{align}$$

