Distribution of maximum proportion of heads in an infinite sequence of flips

The second question, and in fact a version for general Bernoulli variables (i.e., also for biased coins), is completely solved, with a somewhat explicit formula of the distribution of $Y$, in the paper Wolfgang Stadje, The Maximum Average Gain in a Sequence of Bernoulli Games, The American Mathematical Monthly, Vol. 115, No. 10 (Dec., 2008), pp. 902-910.

The main result is for a sequence of independent variables $X_i'$ with $\mathbb{P}(X_i' = 1) = p \in (0,1)$ and $\mathbb{P}(X_i' = -1) = q = 1-p$, and corresponding $S_n' = \sum_{i=1}^n X_i'$ and $Y' = \sup \frac{S_n'}{n}$, which are obvious affine transformations of $X_i$, $S_n$, and $Y$ as defined in the question.

Results:

  1. $\mathbb{P}(Y' \in (p-q,1] \cap \mathbb{Q}) = 1$;
  2. $\mathbb{P}(Y' = x) > 0$ for every $x \in (p-q,1]$;
  3. $\mathbb{P}(Y' = 1) = p$;
  4. $\mathbb{P}(Y' = 0) = 0$ if $p \ge 1/2$;
  5. $\mathbb{P}(Y' = 0) = p(1-p/q)$ if $p<1/2$;

The description of the general distribution is a little more complicated, here it is: If $r/s \in (p-q,1)$ is a rational number in reduced form, with positive denominator, let $f(z) = pz^{2s} - z^{s+r} + q$. This is a polynomial of degree $2s$, so it has $2s$ complex zeros $z_0, \ldots, z_{2s-1}$. The paper shows that $s+r$ of the roots are in the closed unit disk, among those always $z=1$, and in the case that $s+r$ is even also $z = -1$, so we can assume $z_0=1$, $|z_i| \le 1$ for $1 \le i \le s+r-1$, and $|z_i|>1$ for $s+r \le i \le 2s-1$. If $s+r$ is even, we additionally assume that $z_1 = -1$. With this setup, the results are

  1. $\mathbb{P}(Y' \le r/s) = \prod\limits_{i=r+s}^{2s-1} (1-z_i^{-1})$;
  2. $\mathbb{P}(Y' = r/s) = \prod\limits_{i=r+s}^{2s-1} (1-z_i^{-1}) + p \prod\limits_{i=r+s}^{2s-1} (1-z_i)$

From (6) with $r=1$ and $s$ large, one should be able to get something about your third question, but I have not tried to see whether this is easy or hard.


I'm not sure all the terms you might want to search under, but I would include "random walk", since $P(Y\ge \frac{1}{2}+\epsilon)$ is just the probability of falling off the proverbial bridge if it is infinitely wide to the left and getting wider to the right along a diagonal with slope controlled by $\epsilon$.


Here is a picture of the distribution of $Y$ on $(\frac 12; 1)$ (I am not showing the top half which is just a vertical segment at $Y=1$)

enter image description here

This plot was made with all the fractions of denominator up to $200$. For $r = p/q$, there are $n=q-p$ roots to find, which you can apparently get with Newton's method starting at the $\exp((-1+2ik \pi)/n)$ for $k=0 \ldots n-1$. That everything lines up properly likes this should be evidence that this heuristic works.

Using the data from all those points gives $0.77 \le E[Y] \le 0.85$ (the gap corresponds to the missing probability mass, I realize now I could be a lot more precise here).

I think we can prove, looking at $r = 1- \frac 1n$ and the behaviour of the only small root $x_n$ of $1-2x+x^n$, that there is a horizontal tangent at $(1,\frac 12)$ (in fact it should converge exponentially to $\frac 12$).

To study the behaviour at $Y \approx \frac 12$, one would need to study the roots of $1 - 2x^n + x^{2n+1}$, which seems a bit harder .