A problem on Pólya's urn scheme

The result is easy to understand in the special case where $b=r$, since the whole setup is then symmetric between the two colors. Individual stages of the process are asymmetric, for example the state of the urn after you draw a blue ball on the first trial and add $c$ more blue balls. Obviously, blue is then favored. But there is an equally likely scenario that favors red to the same degree, so the process as a whole is still symmetric.

Let me now try to explain a similar intuition in the non-symmetric case where $b\neq r$. Imagine that the $b+r$ balls initially in the urn are not only colored with two colors but also numbered, with $b+r$ distinct numbers, say numbers $1$ to $b$ for the blue balls and $b+1$ to $b+r$ for the red balls. Also, imagine that whenever a ball is drawn in a trial and is put back into the urn, the additional $c$ balls have not only the same color but also the same number as the ball that was drawn. (So, although the initial situation had different numbers on all the balls, later situations will have many balls with the same number.) Notice that, throughout the process, all balls numbered $1$ to $b$ will be blue and all balls numbered $b+1$ to $b+r$ will be red.

Now temporarily forget about colors and concentrate on numbers. The initial situation is symmetric between all $b+r$ numbers. Just as in the first paragraph above, the overall situation remains symmetric throughout the process, because the rule for adding balls treats every number alike. So all $b+r$ numbers have the same chance of being drawn in the $n$th trial.

As before, there are asymmetries in conditional probabilities, for example the second ball drawn is more likely to have the same number as the first than to have any other particular number. But, as before, the asymmetries are equally likely to favor any of the numbers. So the overall probabilities are equal.

But (taking the colors into consideration again) this means that the probability of drawing a blue ball on the $n$th trial is $b/(b+r)$ because $b$ of the $b+r$ equally likely numbers correspond to blue balls.
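The numbered-ball argument can be checked empirically. Below is a minimal simulation sketch, not part of the original argument: the function name `label_frequencies` and the parameters `runs` and `seed` are my own illustrative choices, and I assume each trial draws one ball uniformly at random and returns it together with $c$ extra balls bearing the same number.

```python
import random

def label_frequencies(b, r, c, n, runs=20000, seed=0):
    """Estimate how often each number is drawn on the n-th trial (n >= 1).

    Numbers 1..b are the blue balls and b+1..b+r the red ones. The names
    `runs` and `seed` are illustrative assumptions, not from the original.
    """
    rng = random.Random(seed)
    labels = list(range(1, b + r + 1))
    hits = [0] * (b + r)
    for _run in range(runs):
        # counts[i] = how many balls currently carry the number i + 1
        counts = [1] * (b + r)
        for _trial in range(n):
            # Draw a ball uniformly at random: a number is chosen with
            # probability proportional to how many balls carry it.
            label = rng.choices(labels, weights=counts)[0]
            # Return the ball together with c more bearing the same number.
            counts[label - 1] += c
        # Record the number drawn on the n-th (last) trial of this run.
        hits[label - 1] += 1
    return [h / runs for h in hits]
```

For instance, `label_frequencies(2, 3, 4, 5)` should return five values that are all close to $1/5$, and the first two of them (the blue numbers) should sum to roughly $2/5 = b/(b+r)$, up to sampling noise.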


It's roughly like the following.

  • Easy case: Suppose there were equal numbers of each colour. Then how could the probability (assuming you don't know what's come before) ever be anything other than a half? For every sequence of events that ends with you picking a blue ball, there's an exactly complementary one in which the colours are all reversed. The contents of the bag will typically be skewed, but the probability of it being skewed either way is the same.

  • General case: This is essentially a restatement of a proof by induction. The first turn is easy. What about the second turn? Think about it like this: either you draw a ball which was in the bag at the start, or you draw a ball which was added after the first turn. In the first case, the probability of blue is just the original one, $b/(b+r)$. On the other hand, if you drew a new ball, it is necessarily of the colour drawn in the previous (first) turn. But the probability that this is blue is just the probability that the first ball drawn was blue, which is again $b/(b+r)$!

In fact, this is a conceptual proof: the colour of the ball drawn on the $n$th turn has a distribution which we can break down according to when that ball entered the bag. A ball that entered after turn $k$ has the colour drawn on turn $k$, and by induction the colour distribution on every previous turn is the original one, so nothing can ever change!
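Written out as a formula, the induction looks roughly as follows. The notation $B_k$ for the event "the ball drawn on turn $k$ is blue" is mine, and I assume that exactly $c$ balls are added after each turn, so that the bag holds $b+r+(n-1)c$ balls just before turn $n$: the $b+r$ original ones plus one batch of $c$ for each earlier turn. Splitting according to which batch the drawn ball belongs to gives

$$P(B_n) = \frac{b+r}{b+r+(n-1)c}\cdot\frac{b}{b+r} + \sum_{k=1}^{n-1}\frac{c}{b+r+(n-1)c}\,P(B_k).$$

If $P(B_k) = b/(b+r)$ for every $k < n$, the right-hand side becomes

$$\frac{b}{b+r}\cdot\frac{(b+r)+(n-1)c}{b+r+(n-1)c} = \frac{b}{b+r}.$$

For $n=2$ this is just the two cases described in the bullet above:

$$P(B_2) = \frac{b+r}{b+r+c}\cdot\frac{b}{b+r} + \frac{c}{b+r+c}\cdot\frac{b}{b+r} = \frac{b}{b+r}.$$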