On thinking about independent events

Suppose that two events $A$ and $B$ are independent. Assume, for instance, that $A$ occurs once in every four occasions, whereas $B$ occurs once in every nine occasions. How often, then, do they both occur? In every $36\ (=4\times9)$ occasions, $A$ occurs about $9$ times and $B$ occurs about $4$ times. If these occurrences are spread more or less uniformly, it is to be expected that $A$ and $B$ occur simultaneously only once. That is, $A$ and $B$ occur simultaneously once in every $36$ occasions.
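To see this frequency heuristic play out numerically, here is a minimal Monte Carlo sketch in Python (the probabilities $1/4$ and $1/9$ come from the paragraph above; the simulation setup, seed and all, is just an illustration):

```python
import random

random.seed(0)
trials = 1_000_000
both = 0
for _ in range(trials):
    a = random.random() < 1 / 4  # A occurs on this occasion (prob 1/4)
    b = random.random() < 1 / 9  # B occurs, drawn independently (prob 1/9)
    if a and b:
        both += 1

print(both / trials)  # about 0.0278, i.e. roughly 1/36
```

Because the two `random.random()` draws are independent by construction, the observed frequency settles near $\frac{1}{4}\cdot\frac{1}{9}=\frac{1}{36}$.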


The clearest way of understanding the notion of independence is certainly in terms of the conditional: $$ P(A\mid B) = P(A). $$ That is, knowledge of $B$ doesn't affect the probability of $A$.

But not all hope is lost if one uses the definition $P(A\cap B)=P(A)P(B)$. This means that when trying to find the probability that both $A$ and $B$ occur, we need only find the probabilities of $A$ and $B$ "independently" and take their product. For instance, say you flip five coins and you want to find the probability that they are all heads. It's intuitive that, in order to do this, we find the probability that each is heads and take their product. This is because the probability that the second coin is heads (presumably) doesn't depend on whether the first was heads, the third doesn't depend on the first two, and so forth.
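A quick sanity check of the five-coin example (a sketch, with each coin modelled as an independent fair Bernoulli draw; the simulation setup is assumed, not part of the argument):

```python
import random

random.seed(1)
trials = 1_000_000
all_heads = sum(
    all(random.random() < 0.5 for _ in range(5))  # five independent fair flips
    for _ in range(trials)
)
print(all_heads / trials)  # about 0.03125
print((1 / 2) ** 5)        # the product of the five probabilities: 0.03125
```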


If you think of a probability as the fraction of the time something happens, it is conceptually clear that the probability that two things happen must be no greater than the probability that either one of them happens individually. So $\Pr[A\cap B]\le \Pr[A]$ and $\Pr[A\cap B]\le\Pr[B]$.

The question is, in what way does $\Pr[A\cap B]$ get reduced relative to $\Pr[A]$ or $\Pr[B]$? Let's say $\Pr[A]=\frac{1}{3}$ and $\Pr[B]=\frac{2}{5}$, so that $A$ happens $\frac{1}{3}$ of the time and $B$ happens $\frac{2}{5}$ of the time. If one didn't think about it carefully, one might think that $B$ happens on $\frac{2}{5}$ of those occasions when $A$ happens and therefore that $A\cap B$ happens $\frac{1}{3}\cdot\frac{2}{5}=\frac{2}{15}$ of the time, in just the same way as $\frac{2}{5}$ of $\frac{1}{3}$ of a cake is $\frac{2}{15}$ of a cake. But that is, in fact, only true when $A$ and $B$ are independent. It could well be that $B$ happens whenever $A$ happens, so that $\Pr[A\cap B]=\Pr[A]\cdot1=\Pr[A]$. Or it could be that $B$ never happens when $A$ happens so that $\Pr[A\cap B]=\Pr[A]\cdot0=0$. In fact, $\Pr[A\cap B]$ in this example can take any value between $0$ and $\Pr[A]$ depending on whether the occurrence of $A$ makes $B$ less likely, leaves the probability of $B$ unchanged, or makes $B$ more likely.
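One way to make the "any value between $0$ and $\Pr[A]$" claim concrete is to picture the occasions as points of $[0,1)$, with $A$ the fixed interval $[0,\frac13)$ and $B$ an interval of length $\frac25$ that we slide along; the overlap length is then exactly $\Pr[A\cap B]$. A small sketch of that picture (this interval model is one illustrative choice among many):

```python
pa, pb = 1 / 3, 2 / 5  # the probabilities from the example above

def overlap(b_start):
    """Length of [0, pa) intersected with [b_start, b_start + pb)."""
    return max(0.0, min(pa, b_start + pb) - max(0.0, b_start))

for b_start in (0.0, 0.2, 1 / 3, 0.6):
    print(f"B starts at {b_start:.2f}: Pr[A and B] = {overlap(b_start):.4f}")
```

Sliding $B$ to the right sweeps $\Pr[A\cap B]$ continuously from $\Pr[A]=\frac13$ (when $A\subseteq B$) down to $0$ (once the intervals separate); the independent value $\frac{2}{15}\approx0.1333$ is just one point along the way, hit here when $B$ starts at $0.2$.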

Of course this discussion really hasn't got away from the conditional definition. The general formula for $\Pr[A\cap B]$ is $$ \Pr[A\cap B]=\Pr[A]\cdot\Pr[B\mid A]. $$ Independence is precisely the condition that the occurrence of $A$ makes $B$ neither more nor less likely, $\Pr[B\mid A]=\Pr[B]$, so that the formula becomes $\Pr[A\cap B]=\Pr[A]\cdot\Pr[B]$. Nevertheless, I hope that this way of thinking about it makes the multiplicative definition of independence just as intuitive as the conditional definition.
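For a concrete dependent instance of this formula (an example of my own choosing, not one from the discussion above): draw two cards without replacement, and let $A$ be "the first card is an ace" and $B$ "the second card is an ace".

```python
from fractions import Fraction

p_a = Fraction(4, 52)          # Pr[A]: 4 aces among 52 cards
p_b_given_a = Fraction(3, 51)  # Pr[B | A]: one ace gone, 51 cards remain
print(p_a * p_b_given_a)       # Pr[A and B] = 1/221

p_b = Fraction(4, 52)          # Pr[B] = 1/13 by symmetry
print(p_b, p_b_given_a)        # 1/13 vs 1/17: A's occurrence makes B less likely
```

Here $\Pr[B\mid A]=\frac{1}{17}<\frac{1}{13}=\Pr[B]$, so the events are dependent, and it is the chain rule rather than the bare product that gives $\Pr[A\cap B]$.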

Added note: I have changed the numbers in the example above because the lower bound on $\Pr[A\cap B]$ was false with the original numbers. In general, the truth is slightly more complicated: if $\Pr[B]\ge\Pr[A]$, then $\Pr[A\cap B]$ can take any value between $\max(0,\Pr[A]+\Pr[B]-1)$ and $\Pr[A]$. My original numbers were $\Pr[A]=\frac{3}{5}$ and $\Pr[B]=\frac{2}{3}$. Since the sum of these probabilities is greater than $1$, it is not possible for $A$ and $B$ to be disjoint, i.e. for $\Pr[A\cap B]$ to be $0$. In this example the intersection is smallest when $A\cup B$ is the whole sample space, so that $\Pr[A\cap B]=\Pr[A]+\Pr[B]-\Pr[A\cup B]=\Pr[A]+\Pr[B]-1$, which is where the lower bound in this case comes from. So for these numbers, $\Pr[A\cap B]$ must lie between $\frac{3}{5}+\frac{2}{3}-1=\frac{4}{15}$ and $\Pr[A]=\frac{3}{5}$.
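The attainable range described in this note is easy to package up; here is a small sketch computing $[\max(0,\Pr[A]+\Pr[B]-1),\,\min(\Pr[A],\Pr[B])]$, where the upper bound reduces to $\Pr[A]$ whenever $\Pr[B]\ge\Pr[A]$ (the function name is mine):

```python
from fractions import Fraction

def intersection_bounds(pa, pb):
    """Attainable range of Pr[A and B] given only the two marginals."""
    lower = max(Fraction(0), pa + pb - 1)  # forced overlap when pa + pb > 1
    upper = min(pa, pb)                    # the intersection fits inside both
    return lower, upper

lo, hi = intersection_bounds(Fraction(1, 3), Fraction(2, 5))
print(lo, hi)  # 0 1/3  (the numbers used in the rewritten example)

lo, hi = intersection_bounds(Fraction(3, 5), Fraction(2, 3))
print(lo, hi)  # 4/15 3/5  (the original numbers from this note)
```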