Is conditional probability also probability?

As the other respondents have already noted, probability is already defined as a ratio. But here is another thing to think about.

Normally, when you take the ratio of two quantities, the result carries the ratio of their units. For example, if you travel 10 meters in 5 seconds, your average velocity is 10/5 = 2 meters per second.

But because probability is defined as a ratio of two quantities that share the same unit, it is dimensionless. The probability of tossing a head (on one toss of a fair coin) is one half. It is not one half of a toss, or one half of a prob (or 500 milliprobs, for that matter). It is just one half.

(Think of it as the long-run ratio: in $n$ tosses you expect about $n/2$ heads, so the proportion of heads is $n/2$ tosses per $n$ tosses, and the unit cancels. Or, if you prefer a more subjective approach, the fair price for a ticket that will win one dollar if the coin comes up heads is half a dollar. The probability is thus defined as half a dollar per dollar, which again is just one half.)
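To make the long-run reading concrete, here is a minimal simulation sketch. The function name `heads_proportion`, the seed, and the number of tosses are illustrative assumptions, not part of the argument above. The point is only that the result is a count of tosses divided by a count of tosses, so it comes out as a bare number.

```python
import random

def heads_proportion(n_tosses, seed=0):
    """Simulate n_tosses fair coin tosses and return heads / tosses.

    Numerator and denominator are both counts of tosses,
    so the returned value is a plain, unitless number.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# Hypothetical run size; larger values settle closer to one half.
proportion = heads_proportion(100_000)
print(proportion)
```

With a large number of tosses the printed value should sit close to 0.5, though any single run will be off by a small random amount.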

It follows automatically that a ratio of probabilities is also dimensionless. So a ratio of probabilities has the same unit (i.e., none) as a single probability. This is in stark contrast to, say, distance, where the ratio of two distances is dimensionless, and so evidently a very different thing from a single distance.


The formal definition of probability helps to clear up this confusion. Think of it as a measure on the subsets of a set, scaled so that the whole set has measure one. More formally, these are the axioms of probability:

Given a sample space $S$, for every subset $A$ of $S$ we have:

1. $P(A)\ge 0$

2. $P(S) = 1$

3. For a finite or countably infinite sequence of pairwise disjoint subsets $A_i$: $P(\bigcup_i A_i)=\sum_i P(A_i)$

So you can understand conditional probability as simply shrinking the sample space $S$ to a smaller one. We create a new probability distribution that is proportional to the old one, but whose sample space is now $B$ (assuming $P(B) > 0$). How should that new distribution be defined? Only the part of $A$ that lies inside $B$ can still occur, and dividing by $P(B)$ rescales things so that $B$ itself has probability one. So the natural answer is:

$$P(A|B) = \frac{P(A\cap B)}{P(B)}$$
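As a small sanity check of this definition, the sketch below uses one roll of a fair six-sided die. The die, the events `A` and `B`, and the helper names `prob` and `cond_prob` are all illustrative assumptions rather than anything from the answer above. It computes $P(A|B)$ from the formula and compares it with simply counting inside the smaller sample space $B$.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
S = frozenset(range(1, 7))

def prob(event, space=S):
    """Probability of `event` under the uniform measure on `space`."""
    return Fraction(len(event & space), len(space))

def cond_prob(a, b):
    """P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / p_b

A = frozenset({2, 4, 6})  # the roll is even
B = frozenset({4, 5, 6})  # the roll is greater than 3

p_a_given_b = cond_prob(A, B)

# The same number comes from treating B as the whole sample space.
assert p_a_given_b == prob(A, space=B)

# The rescaled measure satisfies the axioms on B:
# the whole space has probability one, and disjoint pieces add up.
assert cond_prob(B, B) == 1
assert cond_prob(frozenset({4}), B) + cond_prob(frozenset({5, 6}), B) == 1
```

Using `Fraction` keeps the arithmetic exact, so the equalities can be checked without floating-point tolerance. In this sketch the result is still just a number between 0 and 1, which is the sense in which a conditional probability is itself a probability.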


This is one of those fun areas where the logic of mathematics crosses back into the arena of philosophy from which it spawned. We can look at this many ways, and here are a couple of the more common ways that come to mind:


A part of a part of an apple is a part of an apple.

A probability, as commonly calculated, is the logical part of an observed set that is seen to be in common with itself and not in common with others in the observed set. We represent the probability mathematically as a proportion, which strips it of all units. The probability instead becomes a knife with which we separate one part from another part. Any multiplication or division applied to the probability is another separation done by the knife, and changes neither the nature of the knife nor that of the whole it cuts. The conditional probability, then, is merely two separations represented in a single statement, and the only change is in the size and shape of the resulting part of the whole.


A portion of a hole is a complete hole.

A probability, as commonly applied, becomes binary. Choices are weighed, and the likelihood of specific outcomes is considered. Regardless of probabilities theoretically adding up to a whole, each choice is assigned a binary condition of likelihood, either as likely or as unlikely. The probability itself contains nothing and means nothing when in isolation from context, and so a conditional probability contains an equal portion of nothing and means an equal amount of nothing.


I hope this isn't going to lead to the "p-value" conversation next.