What is the difference between $p(a,b)$ and $p(a|b)$?

I'm going to rephrase a little bit: $p(a,b)$ is the probability that both a and b happen. $p(a|b)$ is the probability that a happens, given that we know b happens.

I think the best way to understand these is to work through an example.

Suppose we throw two 6-sided dice. Let condition 'A' be that the numbers on the top faces of the two dice sum to 7, and 'B' be that die number 2 shows a 1.

Okay, now what is $p(a,b)$? Well, there is only 1 way in which this can happen: die 2 must show a 1, and the other a 6. As there are 36 possibilities, all of which we assume to be equally likely, $p(a,b) = 1/36$.

What is $p(a|b)$? So we know that die 2 is a 1. So the only way for the sum to be 7 is for die 1 to be a 6. As there are 6 possibilities for die 1, $p(a|b) = 1/6$.
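
If you'd rather check the counting by brute force, here is a minimal sketch that enumerates all 36 outcomes. The helper names (`is_a`, `is_b`) are just illustrative, and it assumes fair dice, so every outcome is equally likely:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (die 1, die 2) of two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def is_a(outcome):
    """Event A: the two top faces sum to 7."""
    return outcome[0] + outcome[1] == 7

def is_b(outcome):
    """Event B: die number 2 shows a 1."""
    return outcome[1] == 1

both = [o for o in outcomes if is_a(o) and is_b(o)]
b_happens = [o for o in outcomes if is_b(o)]

# Joint probability: count outcomes where both hold, out of all 36.
p_joint = Fraction(len(both), len(outcomes))

# Conditional probability: same count, but only out of the outcomes where B holds.
p_cond = Fraction(len(both), len(b_happens))

print(p_joint, p_cond)
```

The only difference between the two numbers is the denominator: all outcomes for $p(a,b)$, versus only the outcomes where b holds for $p(a|b)$.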

Does that make sense?

Now, sometimes $p(a) = p(a|b)$; when this holds, we say that events a and b are statistically independent.
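
The dice example happens to be one of these cases. Six of the 36 outcomes sum to 7, and six of the 36 have die 2 showing a 1, so using the standard definition of conditional probability:

$$p(a) = \frac{6}{36} = \frac{1}{6}, \qquad p(b) = \frac{6}{36} = \frac{1}{6}, \qquad p(a|b) = \frac{p(a,b)}{p(b)} = \frac{1/36}{1/6} = \frac{1}{6} = p(a)$$

So learning that die 2 shows a 1 tells you nothing about whether the sum is 7.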


$p(a|b)$ is the probability that event a happens given that event b happens. The difference in wording is critical. Neither notation carries the sense of causation that "due to" implies. If b is unlikely, but a happens whenever b does, $p(a|b)$ can be quite high. If a is "I will be a millionaire tomorrow" and b is "I will win the lottery tonight", $p(a,b)$ is very low, but $p(a|b)$ is 1.
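
In symbols, the lottery example is just the product rule with $p(a|b) = 1$:

$$p(a,b) = p(a|b)\,p(b) = 1 \cdot p(b) = p(b)$$

so the joint probability is exactly as small as the chance of winning the lottery, even though the conditional probability is as large as it can be.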