Most efficient numerical selectBetween
Varying samples
(Updated to include internal function)
For varying samples, I think the best non-compiled method is to use Pick, as you did. You can build the selector list slightly faster by using Subtract, and you can reduce the number of arithmetic operations by one as follows:
sb2[samp_, {min_, max_}] := Pick[
samp,
UnitStep[Subtract[samp, min] Subtract[samp, max]],
0
]
In comparison to your answer, sb2 has 2 subtractions, 1 multiplication and 1 UnitStep call, while selectBetween has 2 subtractions, 1 addition and 2 UnitStep calls.
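To see why a single UnitStep call is enough, here is a small worked example on a four-element list. The names lo, hi, prod and sel are illustrative only and are not part of the original code. The product of the two differences is negative only for values strictly between the bounds, so UnitStep maps exactly those positions to 0:

```wolfram
samp = {0.5, 1.5, 2.5, 3.5};
{lo, hi} = {1, 3};

(* product of the two differences: negative only strictly inside (lo, hi) *)
prod = Subtract[samp, lo] Subtract[samp, hi]
(* {1.25, -0.75, -0.75, 1.25} *)

(* UnitStep gives 0 for negative arguments and 1 otherwise *)
sel = UnitStep[prod]
(* {1, 0, 0, 1} *)

(* keep the elements whose selector entry is 0 *)
Pick[samp, sel, 0]
(* {1.5, 2.5} *)
```

A sample exactly equal to lo or hi gives a product of 0, and UnitStep[0] is 1, so with this selector the endpoints are excluded.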
Comparison:
samp = RandomReal[10, 10^6];
r1 = selectBetween[samp, {.001, .01}]; //RepeatedTiming
r2 = sb2[samp, {.001, .01}]; //RepeatedTiming
r1===r2
{0.00690, Null}
{0.0059, Null}
True
Slightly faster.
Varying sample addendum 1
Here is a different selector that is slightly faster:
sb3[samp_, {min_, max_}] := Pick[
samp,
Unitize @ Clip[samp, {min, max}, {0,0}],
1
]
Comparison:
samp = RandomReal[10, 10^6];
r1 = selectBetween[samp, {.001, .01}]; //RepeatedTiming
r2 = sb3[samp, {.001, .01}]; //RepeatedTiming
r1===r2
{0.0069, Null}
{0.0055, Null}
True
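Note that the selector would need to be modified if the interval could contain 0: Clip maps every out-of-range value to 0, so Unitize cannot distinguish a sample that is exactly 0 from one that was clipped.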
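One possible modification for intervals that may contain 0 is sketched below. The name sb3z and the sentinel out are hypothetical and not part of the original answer. The idea is to send out-of-range values to a sentinel just below the interval instead of to 0, then subtract the sentinel so that only clipped values become 0. This assumes min - 1 is distinguishable from min in machine precision, and the extra subtraction will likely cost some of sb3's speed advantage (not timed here).

```wolfram
(* hypothetical zero-safe variant of sb3 *)
sb3z[samp_, {min_, max_}] := With[{out = min - 1},
  Pick[
    samp,
    (* clipped values equal out, so they become 0 after the subtraction;
       in-range values end up >= 1 and are mapped to 1 by Unitize *)
    Unitize @ Subtract[Clip[samp, {min, max}, {out, out}], out],
    1
  ]
]

(* small check on an interval that contains 0 *)
test = {-2., -0.5, 0., 0.5, 2.};
sb3z[test, {-1, 1}] === Select[test, -1 <= # <= 1 &]
```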
Varying sample addendum 2
If you don't mind using an undocumented internal function, you could try:
samp = RandomReal[10, 10^6];
r1 = selectBetween[samp, {.001, .01}]; //RepeatedTiming
r2 = Random`Utilities`SelectWithinRange[samp, {.001, .01}]; //RepeatedTiming
r1 === r2
{0.00688, Null}
{0.0019, Null}
True
Fixed sample
If the sample stays the same, and you are selecting different ranges of data, then the following will be much faster:
nf = Nearest[samp]; //AbsoluteTiming
r3 = nf[(.01+.001)/2, {All, (.01-.001)/2}];//RepeatedTiming
Sort @ r1 === Sort @ r3
{0.153673, Null}
{0.000071, Null}
True
Creating the NearestFunction is slow, but it only needs to be done once, and then using the NearestFunction will be extremely fast.
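The conversion from an interval to the midpoint and radius that the NearestFunction takes can be wrapped in a small helper. This is only a sketch, and the name selectRange is hypothetical:

```wolfram
(* hypothetical helper: turn {min, max} into the midpoint/radius query used above *)
selectRange[nf_NearestFunction, {min_, max_}] :=
  nf[(min + max)/2, {All, (max - min)/2}]

samp = RandomReal[10, 10^6];
nf = Nearest[samp];                (* slow, but built only once *)

a = selectRange[nf, {.001, .01}];  (* each later query reuses nf *)
b = selectRange[nf, {5., 5.01}];
```

The elements come back ordered by distance from the midpoint rather than in their original order, which is why the comparison above sorts both lists first. Because the midpoint and radius are computed in floating point, values extremely close to an endpoint may be classified differently than by a direct comparison against min and max.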
BoolEval was made exactly for these kinds of problems. Suppose you want to select the elements of array that lie between lo and hi. Then use:
Needs["BoolEval`"]
BoolPick[array, lo < array < hi]
Simple, concise, and as fast as it gets for an unsorted list (with documented functions). Using <= instead of < is also possible, and it is handled correctly.