Probability of $n$ successes in a row at the $k$-th Bernoulli trial... geometric?

An analysis of the $n=2$ case is below. Let me explain why the general $n$ case is going to be ugly, bordering on impossible.

Let $P_k$ denote the probability that $n$ consecutive successes occur at trial $k$ for the first time. Since seeing a failure "restarts" the process, $P_k$ must satisfy the following recurrence relation for $k > n$:

$$P_k = q P_{k-1} + pq P_{k-2} + p^2 q P_{k-3}+ \cdots + p^{n-1} q P_{k-n}.$$ This is an $n$th order linear difference equation with constant coefficients. Thus its solution is of the form $P_k = \sum_{i=1}^n c_i r_i^k$, where the $r_i$s are the roots of the equation $$x^n - q x^{n-1} - pq x^{n-2} - \cdots - p^{n-1} q = 0.$$ So an explicit form for $P_k$ would require us to solve this equation for general $p$ and $q$. That is not likely to be possible in closed form. In fact, as we see below, the case when $n=2$ already gets quite ugly. The recurrence relation with the initial conditions $P_0 = P_1 = \cdots = P_{n-1} = 0, P_n = p^n$, might be the best we can do.


(Original answer.) For the $n = 2$ case, let $X$ denote the trial in which we see the second consecutive success for the first time. Let $P_k$ denote $P(X=k)$. I'll give

  1. An exact expression for $P_k$ via the solution to a recurrence for $P_k$,
  2. A binomial sum expression for $P_k$ via a combinatorial argument,
  3. The probability generating function $G(z)$ for $X$,
  4. $E[X]$, and
  5. $Var[X]$.

The method used to find the exact expression for $P_k$ and the method for finding the probability generating function can be extended to the general $n$ case, but this will get uglier and uglier as $n$ increases.


Expression 1. Since, except for SS, any sequence counted under $P_k$ must either start with F or with SF, we have the recurrence relation $P_k = q P_{k-1} + pq P_{k-2}$, for $k \geq 3$, with initial conditions $P_0 = P_1 = 0$, $P_2 = p^2$. This is a second-order linear difference equation with constant coefficients, and thus the solution is $P_k = A r_1^k + B r_2^k,$ where $r_1$ and $r_2$ are the roots of the auxiliary equation $x^2 - q x - pq = 0$, and $A$ and $B$ are found via the initial conditions. Crunching through the algebra, we get, for $k \geq 1$, $$P_k = \frac{p^2\left(\left(q + \sqrt{q^2+4pq}\right)^{k-1} - \left(q - \sqrt{q^2+4pq}\right)^{k-1}\right)}{2^{k-1}\sqrt{q^2+4pq}}.$$

Note that if $p = q = \frac{1}{2}$ the recurrence reduces to $P_k = \frac{1}{2} P_{k-1} + \frac{1}{4} P_{k-2}$. Thus, for $k \geq 1$, $P_k = \frac{F_{k-1}}{2^k}$, where $F_k$ is the $k$th Fibonacci number.


Expression 2.

We know that the last two trials have to be successes. Suppose, then, that there are $j$ successes in the first $k-2$ trials. Since every one of these $j$ successes must be followed by a failure, we need the number of ways to place $j$ times in $k-2$ locations a $p$ followed by a $q$. Since $pq$ must be treated as a pair, though, this is equivalent to the number of ways to place $j$ individual items into $k-2-j$ locations. The number of ways to do this is $\binom{k-2-j}{j}$, and so the probability that there are $j$ successes in the first $k-2$ trials followed by successes in the last two trials is $p^2 \binom{k-2-j}{j} (pq)^j q^{k-2-2j} = p^2 \binom{k-2-j}{j} p^j q^{k-2-j}$. Summing over all possible values of $j$ gives us $$P_k = p^2 \sum_{j = 0}^{k-2} \binom{k-2-j}{j} p^j q^{k-2-j}.$$

If $p = q = \frac{1}{2}$, we get, once again, $P_k = \frac{1}{2^k} \sum_{j = 0}^{k-2} \binom{k-2-j}{j} = \frac{F_{k-1}}{2^k}$, where the last step follows from the "sum of the shallow diagonals" property of Pascal's triangle. (See MathWorld's entry on Pascal's triangle, Eq. 16.)


The probability generating function, expected value, and variance.

We can also find the probability generating function. This is $$G(z) = \frac{p^2 z^2}{1-qz-pqz^2}.$$

A basic property of probability generating functions says that $P_k$ can be calculated via $P_k = G^{(k)}(0)/k!$.

If $p = q = \frac{1}{2}$, then we have $G(z) = \frac{(z/2)^2}{1-(z/2)-(z/2)^2} = z/2 F(z/2)$, where $F(z) = \frac{z}{1-z-z^2}$, the generating function for the Fibonacci numbers. Thus, once again, we see that $P_k = \frac{F_{k-1}}{2^k}$ in this case.

To derive the generating function, consider the "infinite sum" of possible ways to obtain two successes in a row in the last two trials: $$SS + FSS + FFSS + SFSS + FFFSS + FSFSS + SFFSS + \cdots.$$

Since, until the last two trials, an F can be followed by an S or an F, while an S must be followed by an F, every term in this sum can be decomposed uniquely into the product of terms chosen from {F, SF} followed by SS. Thus we can (formally) express this infinite sum as $$\sum_{k=0}^{\infty} (F + SF)^k SS = \frac{SS}{1 - F - SF}.$$ Taking $S = pz$ and $F = qz$ (i.e., count one trial for each success and one for each failure) gives the generating function $G(z)$.

Finally, we can use the probability generating function to calculate moments. For example, since the expected value is $G'(1^-)$ and the variance is $G''(1^-) + G'(1^-) - [G'(1^-)]^2$, a little calculus and a lot of simplifying shows that $$E[X] = \frac{1}{p^2} + \frac{1}{p}$$ and $$Var[X] = \frac{1}{p^4} + \frac{2}{p^3} - \frac{2}{p^2} - \frac{1}{p}.$$