Probability that n sequences agree in a certain percentage of them
We are going to imagine that the set of all our rolls forms a grid, with one sequence per row.
So if I rolled HTTH, TTTH, HHHH, the grid would be:
HTTH
TTTH
HHHH
Now when we are looking for a match, we can just look at each column of our grid individually. If every element in the column is the same, we have a match. In the grid above only the last column (H, H, H) matches.
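As a quick illustration of the column check, here is a minimal Python sketch. The function name `count_matching_columns` is my own, and it assumes every row is a string of the same length.

```python
def count_matching_columns(rows):
    """Count the columns in which every row has the same symbol."""
    # zip(*rows) transposes the grid, so each tuple is one column.
    # A column matches when it contains exactly one distinct symbol.
    return sum(1 for column in zip(*rows) if len(set(column)) == 1)


grid = ["HTTH", "TTTH", "HHHH"]
print(count_matching_columns(grid))  # prints 1
```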
Now we can rephrase our problem in terms of our grid.
We want to calculate the probability of getting:
- $m$ matching columns
- In a grid with $k$ columns and $n$ rows
- With $p$ being the probability any element is a heads
Assuming the flips are independent, the columns are independent too, so we can now view our problem as a simple case of binomial probability with each of our columns being a trial.
There are $\binom{k}{m}$ ways we can select the $m$ columns we want as our matching columns.
The probability of getting a matching column is $p^n+(1-p)^n$ since we can have all heads or all tails in our column.
And of course the probability of not having a match in a column is $1-(p^n+(1-p)^n)$.
This leaves us with
$$\binom{k}{m}(p^n+(1-p)^n)^m(1-p^n-(1-p)^n)^{k-m}$$
as our final probability.
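To make the formula concrete, here is a small Python sketch that evaluates it directly. The function name `match_probability` is just my choice for illustration; the parameters follow the $m$, $k$, $n$, $p$ used above.

```python
from math import comb


def match_probability(m, k, n, p):
    """Probability of exactly m matching columns in a grid of k columns and n rows."""
    # q is the chance that a single column is all heads or all tails.
    q = p**n + (1 - p) ** n
    # Binomial probability of m successes in k independent column trials.
    return comb(k, m) * q**m * (1 - q) ** (k - m)


# The 3-row, 4-column grid from the example, with a fair coin.
print(match_probability(1, 4, 3, 0.5))  # prints 0.421875
```

As a sanity check, summing `match_probability(m, k, n, p)` over $m = 0, \dots, k$ should give 1, since those cases cover every possible grid.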
I also wrote a program in Java to verify this, and the formula seems to be correct. You can look at the code yourself here.
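If you would rather check it yourself, a rough Monte Carlo sketch in Python could look like the following. This is not the Java program mentioned above, only an illustration of the same idea; the function name, the default number of trials, and the fixed seed are all my own assumptions.

```python
import random


def estimate_match_probability(m, k, n, p, trials=20000, seed=0):
    """Estimate the chance of exactly m matching columns by simulation."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        matches = 0
        for _ in range(k):
            # Flip n coins for this column and count the heads.
            heads = sum(rng.random() < p for _ in range(n))
            # The column matches if it is all tails or all heads.
            if heads == 0 or heads == n:
                matches += 1
        if matches == m:
            hits += 1
    return hits / trials


print(estimate_match_probability(1, 4, 3, 0.5))
```

For $m=1$, $k=4$, $n=3$, $p=0.5$ the formula gives $0.421875$, so the printed estimate should land close to that value.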