Expected value of the number of flips until the first head
Let $X$ be a discrete random variable with possible outcomes: $x_1, x_2, x_3,\dots, x_i,\dots$ with associated probabilities $p_1,p_2,p_3,\dots,p_i,\dots$
The expected value of $f(X)$ is given as:
$E[f(X)] = \sum\limits_{i\in\Delta} f(x_i)p_i$, where $\Delta$ indexes the possible outcomes.
In your specific example, $X$ takes one of the values $1,2,3,\dots ,i,\dots$ with corresponding probabilities $\frac{1}{2},\frac{1}{4},\frac{1}{8},\dots,\frac{1}{2^i},\dots$ (easily seen from the first few branches of a tree diagram).
So, the expected value of $X$ is:
$\sum\limits_{i=1}^\infty i(\frac{1}{2})^i$
This is a well-known infinite sum of the form $\sum\limits_{i=1}^\infty i p (1-p)^{i-1}$, here with $p=1-p=\frac{1}{2}$. You will likely be expected to simply memorize the result, which appears in most formula lists: $$\sum\limits_{i=1}^\infty i p (1-p)^{i-1}=\frac{1}{p}~~~~~~~(\dagger)$$
Using this result without proof, we get that the expected number of flips is $\frac{1}{0.5}=2$.
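Before invoking $(\dagger)$, it can be reassuring to watch the series itself converge to $2$. Here is a minimal partial-sum check in Python; it is purely illustrative and not part of the derivation:

```python
# Partial sums of sum_{i>=1} i * (1/2)**i should approach 2.
partial = 0.0
for i in range(1, 60):
    partial += i * 0.5 ** i
print(partial)  # ~2.0 (up to floating-point error)
```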
The proof of $(\dagger)$:
$$\begin{align*}\sum\limits_{i=1}^\infty i p (1-p)^{i-1} &= p\sum\limits_{i=1}^\infty i (1-p)^{i-1}\\ &= p\left(\sum\limits_{i=1}^\infty (1-p)^{i-1} + \sum\limits_{i=2}^\infty (1-p)^{i-1} + \sum\limits_{i=3}^\infty (1-p)^{i-1} + \dots\right)\\ &= p\left[\frac{1}{p}+\frac{1-p}{p}+\frac{(1-p)^2}{p}+\dots\right]\\ &= 1 + (1-p)+(1-p)^2+\dots\\ &=\frac{1}{p}\end{align*}$$
The second line simply regroups the sum: each term $(1-p)^{i-1}$ carries the coefficient $i$, i.e. it appears once in each of the first $i$ tail sums. The third line evaluates each tail sum as a geometric series.
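To sanity-check $(\dagger)$ for values of $p$ other than $\frac{1}{2}$, one can compare a long truncation of the series against $1/p$. A quick sketch follows; the helper name `lhs` and the cutoff of 10,000 terms are my own choices for illustration:

```python
# Compare a truncated version of the series in (dagger) with 1/p.
def lhs(p, terms=10_000):
    return sum(i * p * (1 - p) ** (i - 1) for i in range(1, terms + 1))

for p in (0.1, 0.25, 0.5, 0.9):
    print(p, lhs(p), 1 / p)  # the two values agree to many decimal places
```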
Let $X$ be the number of flips we make until we first see heads, and let $p$ be the probability of getting tails on any one flip. Note three things that, individually, aren't too hard to see:
- Coin flipping is a memoryless process.
- We must flip the coin at least once in order to get heads (or tails, or anything).
- The probability of not getting heads on that first flip is $p$.
Using these three ideas, we can model $E[X]$, the expected number of flips until we see heads, as:
$$E[X] = 1 + pE[X].$$
That is, we flip the coin once (hence the $1$), and there is a $p$ chance that we get tails and have to continue flipping the coin (hence the $+\,pE[X]$). Because coin flipping is memoryless, the expected value of $X$ after flipping a tails is the same as before flipping that tails.
Now it turns out that modeling $E[X]$ in this way is useful because we can easily solve for $E[X]$ and answer our question:
$$\begin{align*}E[X] &= 1 + pE[X]\\ \implies E[X] - pE[X] &= 1\\ \implies (1 - p)E[X] &= 1\\ \implies E[X] &= \frac{1}{1 - p}\end{align*}$$
Since $p = \frac{1}{2}$, we get $E[X] = 2$.
I learned of this clever approach here: http://online.stanford.edu/course/algorithms-design-and-analysis-part-1
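A quick Monte Carlo run makes the recurrence's answer easy to believe. This is only an illustrative sketch; the function name `flips_until_heads` and the trial count are my own choices, not part of the course linked above:

```python
import random

def flips_until_heads(p_tails=0.5):
    flips = 1  # we always make at least one flip
    while random.random() < p_tails:  # with probability p we get tails and flip again
        flips += 1
    return flips

trials = 200_000
print(sum(flips_until_heads() for _ in range(trials)) / trials)  # ~2.0
```

With a fair coin ($p = \frac{1}{2}$), the printed average settles near $2$, matching $E[X] = \frac{1}{1-p}$.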
Let $x$ be the expected number of flips. After the first flip, half the time we stop and the other half the time we continue. That gives $x = 1 + \frac{1}{2}(0) + \frac{1}{2}x$, and so $x = 2$ or $x$ is infinite. The latter is impossible (see the note below), and hence $x = 2$.
Note
Given any iterative process in which each iteration is identical to and independent of the previous ones, has a nonzero probability of stopping, and contributes a value that is bounded above and below, and in which the desired quantity is the cumulative sum of the values contributed by the iterations, that quantity has finite expectation, because a series similar to the infinite sum in the other answers converges. The point is that we only have to prove this fact once; after that, we can reuse the above analysis to handle far more complicated processes without any infinite summation at all.
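If you want to check the fixed-point algebra mechanically, a computer algebra system will do it in one line. This is just a sketch of the algebra above and assumes, per the note, that the expectation is finite:

```python
import sympy as sp

# Solve x = 1 + (1/2)*0 + (1/2)*x for x (assumes x is finite, as argued in the note).
x = sp.symbols('x')
print(sp.solve(sp.Eq(x, 1 + sp.Rational(1, 2) * x), x))  # [2]
```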