How can I shorten the calculation time effectively?

This should have the same effect and runs in about 7.6 seconds.

AbsoluteTiming[

 m = 10^7;
 l = 4;
 n = RandomSample[1 ;; m, 2 10^5];
 u = Normal[SparseArray[Partition[n, 1] -> 1, {m}]];
 sum = Total@ParallelTable[
    Total[u[[RandomChoice[1 ;; m, 10^l]]]],
    {10^(9 - l)},
    Method -> "CoarsestGrained"
    ]

 ]

The important points are:

  • Using RandomSample[1 ;; m, ...] and RandomChoice[1 ;; m, ...]. This allows us to handle also rather large m.

  • Conversion of the very expensive Count operation (which goes through the whole list n $10^9$ times!) into a simple read operation from the (packed!) array u.

  • Chopping the vector r into small pieces so that it never has to be stored in RAM; if I computed correctly, r would need more than 7 GB of RAM. Since the pseudorandom numbers in the array returned from RandomChoice are pairwise independent, we may just call RandomChoice[1 ;; m, 10^l] multiple times, do the counts and sum then up in the end with Total.


Instead of building the long list r of random numbers, you could sample directly from the distribution of results: replace the third line of your code with

RandomVariate[BinomialDistribution[Length[r], Length[n]/Length[all]]]

to get a result instantaneously. Or is this cheating?