How to construct a list of lengths efficiently
You can use the usual UnitStep
+ Total
tricks:
r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming
r1 === r2
{0.435358, Null}
{41.4357, Null}
True
Update
As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences
. Here is a version that uses Nearest
instead:
mincounts[s_] := With[
{
unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
},
With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
]
]
Comparison:
SeedRandom[1];
s=RandomInteger[{1,100000},10000]//Sort;
(* my first answer *)
r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
(* J42161217's answer *)
r2 = Flatten[
Join[
{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
]
][[;;10000]]; // AbsoluteTiming
(* using Nearest *)
r3 = mincounts[s]; //AbsoluteTiming
r1 === r2 === r3
{0.432897, Null}
{0.122198, Null}
{0.025923, Null}
True
BinCounts
and Accumulate
combination is faster than all the methods posted so far:
r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First
0.00069
versus Henrik Schumacher's mySparseArray
, Carl Woll's mincounts
and J42161217's Differences
-based method:
r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
1 ;; Length[s]]]
]; // RepeatedTiming // First
0.00081
r3 = mincounts[s]; // RepeatedTiming // First
0.016
r2 = Flatten[Join[{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
RepeatedTiming // First
0.149
r2 == r3 == r4 == r5
True
I think this is at least x3 faster than Mr. Carl Woll's answer
Can anybody compare my timing?
r3 = Flatten[Join[{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming
{0.157123, Null}
Using MapThread the same code is way faster
r6 = Flatten[
Join[{Table[0, s[[1]] - 1]},
MapThread[
Table, {Range[t = Length[Select[s, # <= 10000 &]]],
Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming
r6===r3
{0.008387, Null}
True