Green balls and Red balls, probability problem

All of the boxes contain $N - 1$ balls. This is just a slightly involved conditional probability problem. Let's look at a single box with $r$ red balls and $g$ green balls. What is the probability of getting green on the second draw? It depends on whether you draw a red or a green first. If you draw a red first, then $\left.p(\text{second green } \right| \text{ first red}) = \frac{g}{r + g - 1}$. However, if you draw a green ball first, then you have one fewer green to choose from, giving: $\left.p(\text{second green } \right| \text{ first green}) = \frac{g - 1}{g + r - 1}$. So what are the chances of each condition happening? $p(\text{first red}) = \frac{r}{g + r}$ and $p(\text{first green}) = \frac{g}{r + g}$. Therefore we can finally write:

\begin{align} p(\text{second green}) =& \left.p(\text{second green } \right| \text{ first red})p(\text{first red}) + \left.p(\text{second green } \right| \text{ first green})p(\text{first green})\\ =& \frac{r}{r + g}\frac{g}{r+g-1} + \frac{g}{r + g}\frac{g-1}{r+g-1} = \frac{g(r + g - 1)}{(r + g)(r + g - 1)} = \frac{g}{r + g} \end{align}

Not surprisingly, the second ball drawn is just as likely to be green as the first.
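As a quick sanity check of the single-box result, here is a minimal sketch that evaluates the total-probability formula above with exact fractions. The function name `p_second_green` is just an illustrative choice, and it assumes the box holds at least two balls so that two draws are possible.

```python
from fractions import Fraction


def p_second_green(r: int, g: int) -> Fraction:
    """Exact probability that the second ball is green for a box with
    r red and g green balls, drawing twice without replacement."""
    total = r + g
    if total < 2:
        raise ValueError("need at least two balls to draw twice")
    p_first_red = Fraction(r, total)
    p_first_green = Fraction(g, total)
    # Conditional probabilities for the second draw, as in the text.
    p_green_after_red = Fraction(g, total - 1)
    p_green_after_green = Fraction(max(g - 1, 0), total - 1)
    return (p_green_after_red * p_first_red
            + p_green_after_green * p_first_green)


# The result should collapse to g / (r + g) for any box.
for r in range(0, 8):
    for g in range(0, 8):
        if r + g >= 2:
            assert p_second_green(r, g) == Fraction(g, r + g)
```

Using `Fraction` instead of floats keeps the comparison exact, so the check does not depend on rounding.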

Therefore, for each of the $N$ boxes you need to compute $p(\text{second green})$ (which is just the probability of drawing a green on the first try). Now condition on choosing box $r$ (from here on $r$ is the index of the box, which holds $r - 1$ red and $N - r$ green balls), which has $p(\text{second green}) = p(\text{first green}) = \frac{N - r}{N - 1}$. The probability of choosing box $r$ among $N$ boxes is just $\frac{1}{N}$, which gives:

$$ p(\text{second green}) = \sum_{r=1}^N \frac{1}{N}\frac{N - r}{N - 1} = \frac{1}{N(N - 1)}\sum_{r=1}^N (N - r) = \frac{1}{N(N - 1)}\left(\sum_{r=1}^N N - \sum_{r=1}^N r\right) $$

The first sum is very easy (you're just summing the same number, $N$, $N$ times): $\sum_{r=1}^N N = N\cdot N = N^2$. The second sum is easy if you remember that the sum of the first $n$ consecutive integers is $\sum_{i=1}^n i = \frac{n(n + 1)}{2}$. So this gives:

$$ p(\text{green}) = \frac{N^2 - \frac{N(N + 1)}{2}}{N(N - 1)} = \frac{2N^2 - N^2 - N}{2N(N - 1)} = \frac{N^2 - N}{2\left(N^2 - N\right)} = \frac{1}{2} $$
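The sum over boxes can also be checked numerically. This is a small sketch, assuming (as stated above) that box $r$ is picked with probability $\frac{1}{N}$ and yields green with probability $\frac{N - r}{N - 1}$; the function name `p_green_over_boxes` is hypothetical.

```python
from fractions import Fraction


def p_green_over_boxes(n: int) -> Fraction:
    """Average the per-box probability (n - r) / (n - 1) over boxes r = 1..n,
    each chosen with probability 1 / n."""
    if n < 2:
        raise ValueError("need n >= 2 so that each box holds at least one ball")
    return sum(
        (Fraction(1, n) * Fraction(n - r, n - 1) for r in range(1, n + 1)),
        Fraction(0),
    )


# The closed form says this is 1/2 regardless of n.
for n in range(2, 30):
    assert p_green_over_boxes(n) == Fraction(1, 2)
```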

For part $2$), the per-box conditional probability was already computed above: $\left.p(\text{second green }\right|\text{ first green}) = \frac{g - 1}{g + r - 1}$, which for box $r$ is $\frac{N - r - 1}{N - 2}$. But now you need to sum over the boxes, and once a green ball has been seen the boxes are no longer equally likely: a box with more green balls is more likely to have produced it, and the last box, box $N$, has $0$ green balls, so seeing green first means it definitely wasn't this box. By Bayes' rule, using $p(\text{first green}) = \frac{1}{2}$ from part $1$):

$$ \left.p(\text{box } r \right| \text{ first green}) = \frac{\frac{1}{N}\frac{N - r}{N - 1}}{\frac{1}{2}} = \frac{2(N - r)}{N(N - 1)} $$

Summing over the first $N - 1$ boxes with these weights, and writing $k = N - r$:

\begin{align} \left.p(\text{second green }\right|\text{ first green}) =& \sum_{r=1}^{N - 1} \frac{2(N - r)}{N(N - 1)}\frac{N - r - 1}{N - 2} \\ =& \frac{2}{N(N - 1)(N - 2)}\sum_{k=1}^{N - 1} k(k - 1) \\ =& \frac{2}{N(N - 1)(N - 2)}\cdot\frac{N(N - 1)(N - 2)}{3} \\ =& \frac{2}{3} \end{align}

This is only valid for $N > 2$ (since if $N = 1$ there are no balls in each box and if $N = 2$ there is only one ball in each box). Since $\frac{2}{3} \neq \frac{1}{2}$, the two draws are not independent: a green first ball makes a green-rich box more likely, which raises the chance of a green second ball.


1) The number of green balls in total equals the number of red balls. Picking a box at random, taking out $2$ balls and then looking at the second is actually 'the same' as picking one ball out of a 'big' box that contains all the balls. The procedure followed has no influence at all on a ball's chance of being picked. Each of the balls has the same probability of showing up as the 'elected one' here. So the probability that this ball is green is $\frac{1}{2}$. Likewise, the probability that the first ball is green is also $\frac{1}{2}$. If $G_{i}$ denotes the event that the $i$-th ball taken out is green, then $P\left(G_{1}\right)=P\left(G_{2}\right)=\frac{1}{2}$.

2) This is more complicated. To be found is $P\left(G_{2}\mid G_{1}\right)$ and based on 1) we can start with: $P\left(G_{2}\mid G_{1}\right)=\frac{P\left(G_{2}\cap G_{1}\right)}{P\left(G_{1}\right)}=2P\left(G_{2}\cap G_{1}\right)$. So now it comes to calculating $P\left(G_{2}\cap G_{1}\right)$.

For notational convenience we will calculate $P\left(R_{2}\mid R_{1}\right)$ instead of $P\left(G_{2}\mid G_{1}\right)$, with the understanding that $P\left(R_{2}\mid R_{1}\right)=P\left(G_{2}\mid G_{1}\right)$ because of symmetry. Here $R_i$ denotes the event that the $i$-th ball drawn is red.

Let $R$ denote the index of the box that is chosen at random. Then $$P\left(R_{2}\cap R_{1}\right)=\sum_{r=1}^{N}P\left(R_{2}\cap R_{1}\mid R=r\right)P\left(R=r\right)=\frac{1}{N}\sum_{r=1}^{N}P\left(R_{2}\cap R_{1}\mid R=r\right)$$

Here $P\left(R_{2}\cap R_{1}\mid R=r\right)=\frac{r-1}{N-1}\frac{r-2}{N-2}$, which vanishes for $r=1$ and $r=2$, so that $P\left(R_{2}\cap R_{1}\right)=\frac{1}{N\left(N-1\right)\left(N-2\right)}\sum_{r=3}^{N}\left(r-1\right)\left(r-2\right)$. By induction it can be shown that $\sum_{r=3}^{N}\left(r-1\right)\left(r-2\right)=\frac{1}{3}N\left(N-1\right)\left(N-2\right)$, leading to $P\left(R_{2}\cap R_{1}\right)=\frac{1}{3}$ and finally $P\left(R_{2}\mid R_{1}\right)=\frac{2}{3}$.
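The summation identity and the resulting probabilities can be confirmed for small $N$ with exact arithmetic. The sketch below assumes the setup used in this answer (box $r$ holds $r-1$ red balls out of $N-1$, each box chosen with probability $\frac{1}{N}$, and $N \ge 3$); the function names are illustrative only.

```python
from fractions import Fraction


def p_both_red(n: int) -> Fraction:
    """P(R2 and R1): average over boxes r = 1..n of
    (r - 1)/(n - 1) * (r - 2)/(n - 2)."""
    if n < 3:
        raise ValueError("need n >= 3 so that every box holds two balls")
    total = Fraction(0)
    for r in range(1, n + 1):
        # The product is zero for r = 1 and r = 2 (fewer than two red balls).
        total += Fraction(1, n) * Fraction(r - 1, n - 1) * Fraction(r - 2, n - 2)
    return total


def p_red_given_red(n: int) -> Fraction:
    """P(R2 | R1) = P(R2 and R1) / P(R1), with P(R1) = 1/2."""
    return p_both_red(n) / Fraction(1, 2)


for n in range(3, 30):
    # The identity that the text proves by induction.
    assert 3 * sum((r - 1) * (r - 2) for r in range(3, n + 1)) == n * (n - 1) * (n - 2)
    assert p_both_red(n) == Fraction(1, 3)
    assert p_red_given_red(n) == Fraction(2, 3)
```

This is only a check for a finite range of $N$, not a replacement for the induction argument.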

This result triggers the expectation that there is a 'shortcut' for this route, as there is in case 1).


To proceed with my last remark about a shortcut for case 2); here is some effort in that direction.

Think of boxes that contain ordered pairs of coloured balls. Every box contains $\left(N-1\right)\left(N-2\right)$ of these pairs. In box $r\in\left\{ 1,\dots,N\right\} $ there are $\left(r-1\right)\left(r-2\right)$ of the sort red-red, $\left(r-1\right)\left(N-r\right)$ of the sort red-green, $\left(N-r\right)\left(r-1\right)$ of the sort green-red and $\left(N-r\right)\left(N-r-1\right)$ of the sort green-green. In total there are $N\left(N-1\right)\left(N-2\right)$ pairs. It can be calculated that $\frac{1}{3}N\left(N-1\right)\left(N-2\right)$ of them are of the sort green-green, and of course $\frac{1}{2}N\left(N-1\right)\left(N-2\right)$ of the pairs have a green as first ball. As in 1), it now comes down to picking an ordered pair out of a big box containing all ordered pairs of balls, because each ordered pair has equal probability of being chosen. The procedure followed to get this far does not disturb that. That gives $$P\left\{ \text{second green}\mid\text{first green}\right\} =\frac{P\left\{ \text{second green and first green}\right\} }{P\left\{ \text{first green}\right\} }=\frac{\frac{1}{3}}{\frac{1}{2}}=\frac{2}{3}$$
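The pair-counting argument lends itself to a brute-force check. The following sketch builds each box explicitly and enumerates its ordered pairs of distinct balls with `itertools.permutations`. The helper name `count_ordered_pairs` and the `"R"`/`"G"` labels are illustrative assumptions, not part of the original argument.

```python
from itertools import permutations


def count_ordered_pairs(n: int) -> tuple[int, int, int]:
    """Return (all ordered pairs, pairs with green first, green-green pairs)
    summed over boxes r = 1..n, where box r has r - 1 red and n - r green balls."""
    total = green_first = green_green = 0
    for r in range(1, n + 1):
        box = ["R"] * (r - 1) + ["G"] * (n - r)
        # permutations works by position, so equal-coloured balls stay distinct.
        for first, second in permutations(box, 2):
            total += 1
            if first == "G":
                green_first += 1
                if second == "G":
                    green_green += 1
    return total, green_first, green_green


for n in range(3, 12):
    total, green_first, green_green = count_ordered_pairs(n)
    assert total == n * (n - 1) * (n - 2)
    assert 2 * green_first == total   # half of the pairs start with green
    assert 3 * green_green == total   # a third of the pairs are green-green
```

For example, `count_ordered_pairs(4)` returns `(24, 12, 8)`, and $\frac{8}{12} = \frac{2}{3}$ matches the conditional probability above.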