Speed of a simple operation over a long list
SeedRandom[1234];
list = RandomReal[1, 10000000];
Length[Select[list, # < 0.5 &]] // RepeatedTiming
Total[UnitStep[0.5 - list]] // RepeatedTiming
{4.77, 4998698}
{0.0698, 4998698}
Skipping the multiplication of list by -1, these are even a bit faster:
Length[list] - Total[UnitStep[list - 0.5]] // RepeatedTiming
Total[UnitStep[Subtract[0.5, list]]] // RepeatedTiming
{0.048, 4998698}
{0.050, 4998698}
Building on a compiled version by @ecoxlinux from this post (linked by Kuba), we can also define
countLessThan = Compile[{{vector, _Real, 1}, {bound, _Real}},
  Block[{counter = 0},
    Do[counter += Boole[Compile`GetElement[vector, i] < bound], {i, 1, Length[vector]}];
    counter],
  CompilationTarget -> "C", "RuntimeOptions" -> "Speed"]
and use
countLessThan[list, 0.5] // RepeatedTiming
{0.0042, 4998698}
If the list is shorter than 10^16 elements, the MachinePrecision numbers one gets from Clip won't be too imprecise:
Total[(Sign[list - 0.5] + 1)/2] // AbsoluteTiming
Round[Clip[list, {0.5, 0.5}, {0, 1}] // Total] // AbsoluteTiming
{4.8582906, 5002895}
{0.13461055, 5002895}
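A quick sanity check of what the two expressions above actually count can help here. The sketch below uses a made-up five-element list (the name sample is introduced only for illustration) with no element equal to 0.5. It suggests that both forms count the elements greater than 0.5, so the number of elements below 0.5 would be obtained by subtracting from Length. An element exactly equal to 0.5 would contribute 1/2 in both forms.

```mathematica
(* Hypothetical test data; no element is exactly 0.5 *)
sample = {0.1, 0.7, 0.3, 0.9, 0.6};

(* Sign gives -1 below 0.5 and +1 above, so (Sign + 1)/2 is 0 or 1 *)
aboveSign = Total[(Sign[sample - 0.5] + 1)/2];

(* Clip maps values below 0.5 to 0 and values above 0.5 to 1 *)
aboveClip = Round[Total[Clip[sample, {0.5, 0.5}, {0, 1}]]];

(* Both count elements greater than 0.5; the complement is the count below 0.5 *)
belowClip = Length[sample] - aboveClip;

{aboveSign, aboveClip, belowClip}
```

For this list the three values should agree with Count[sample, x_ /; x > 0.5] and Count[sample, x_ /; x < 0.5] respectively.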
The fastest non-compiled approach is the one shown by @Henrik Schumacher, but I would like to point out a few subtleties about it.
You asked how many elements are less than 0.5. One of the proposed answers was Total@UnitStep@Subtract[0.5, list]. However, this counts how many elements are less than or equal to 0.5. Note that UnitStep[0] is 1. The other proposed solution, Length[list] - Total@UnitStep@Subtract[list, 0.5], is the correct one. Or one that is somewhat more amenable to generalization: Total[1 - UnitStep@Subtract[list, 0.5]].
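The difference only shows up when the boundary value itself occurs in the data, which is unlikely for random reals but easy to provoke by hand. A minimal sketch, using a made-up list (small is a name chosen here for illustration) that contains 0.5 twice:

```mathematica
(* Hypothetical test vector that includes the boundary value 0.5 *)
small = {0.1, 0.5, 0.5, 0.7, 0.3};

(* UnitStep[0] is 1, so this counts elements <= 0.5 *)
countLessEqual = Total@UnitStep@Subtract[0.5, small]   (* 4 *)

(* Complement of "x >= 0.5", so this counts elements strictly < 0.5 *)
countLess = Length[small] - Total@UnitStep@Subtract[small, 0.5]   (* 2 *)

(* Slow but obviously correct reference *)
countSelect = Length@Select[small, # < 0.5 &]   (* 2 *)
```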
The biggest problem with this approach is that it is hard to write and hard to read. As the answers above demonstrate, it is very easy to make small mistakes. Consider a more complex situation, such as how many elements are within the interval [a, b) ∪ (c, d] or similar. It quickly becomes painful to write these expressions.
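To give a sense of how this looks when written by hand, here is one possible UnitStep formulation of the [a, b) ∪ (c, d] count. It is only a sketch: the endpoint values and the list pts are invented for illustration, and it assumes the two intervals are disjoint so that their indicator vectors can simply be added.

```mathematica
(* Hypothetical endpoints and data; every endpoint occurs in the list *)
{a, b, c, d} = {0.1, 0.3, 0.6, 0.9};
pts = {0.1, 0.2, 0.3, 0.6, 0.75, 0.9, 0.95};

(* 1 where a <= x < b: "x >= a" times "not x >= b" *)
inFirst = UnitStep[pts - a] (1 - UnitStep[pts - b]);

(* 1 where c < x <= d: "not x <= c" times "x <= d" *)
inSecond = (1 - UnitStep[c - pts]) UnitStep[d - pts];

(* The intervals are assumed disjoint, so the indicators can be added *)
countArith = Total[inFirst + inSecond]

(* Readable but slow reference *)
countRef = Count[pts, x_ /; (a <= x < b) || (c < x <= d)]
```

Each of the four endpoints needs its own choice of argument order and complement, which is exactly where the off-by-one-boundary mistakes creep in.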
I have a small package called BoolEval that helps with this: all it does is translate relational and Boolean operations to the kinds of arithmetic expressions Henrik has shown.
<<BoolEval`
BoolCount[list < 0.5] // RepeatedTiming
(* {0.067, 4998698} *)
BoolCount[list <= 0.5] // RepeatedTiming
(* {0.047, 4998698} *)
Total@UnitStep@Subtract[0.5, list] // RepeatedTiming
(* {0.048, 4998698} *)