conditions in Select
You could for example use
res1 = Select[
data,
Function[{list}, AllTrue[Drop[list, {8}], list[[8]] < # &]]
]
but this is not the fastest way to do it in Mathematica. This might perform better:
{{eighth}, rest} = TakeDrop[Transpose[data], {8}];
sel = Total /@ Transpose@UnitStep[ConstantArray[eighth, Length[rest]] - rest];
res2 = Pick[data, sel, 0];
The result is the same:
res1 == res2
True
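To see why picking the rows whose mask is 0 works, here is a small sketch on a made-up 3×4 matrix, with column 2 standing in for column 8 (the names m, col, others and mask are only for illustration):

```mathematica
(* hypothetical toy data; column 2 plays the role of column 8 *)
m = {{5, 1, 7, 3}, {2, 2, 9, 4}, {6, 4, 3, 8}};
{{col}, others} = TakeDrop[Transpose[m], {2}];

(* UnitStep gives 1 wherever col >= the other entry, i.e. wherever the strict test fails *)
mask = Total /@ Transpose@UnitStep[ConstantArray[col, Length[others]] - others]
(* {0, 1, 1} *)

(* keep only the rows with no failures *)
Pick[m, mask, 0]
(* {{5, 1, 7, 3}} *)
```

Only the first row survives: 1 is strictly below 5, 7 and 3, while the second row has a tie (2 against 2) and the third has 4 above 3.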
A straightforward solution (and correction of J. M.'s comment code):
SeedRandom[0]
data = RandomInteger[9, {100, 10}];
Select[data, #[[8]] < Min @ Drop[#, {8}] &]
{{3, 9, 4, 7, 2, 1, 2, 0, 7, 8}, {9, 5, 9, 9, 3, 6, 6, 1, 3, 3}, {3, 5, 2, 2, 6, 8, 9, 1, 3, 7}, {9, 6, 7, 8, 8, 7, 9, 1, 7, 6}}
This is twice as fast as C. E.'s AllTrue code:
data = RandomInteger[9, {10000, 15}];
Select[data, Function[{list}, AllTrue[Drop[list, {8}], list[[8]] < # &]]] //
Length // RepeatedTiming
Select[data, #[[8]] < Min @ Drop[#, {8}] &] // Length // RepeatedTiming
{0.045, 273}
{0.021, 273}
It is still an order of magnitude behind his Pick method, however. Here is a tuned version of that code that can be more than twice as fast.
Now faster and cleaner after reading LLlAMnYP's answer and recognizing a simplification.
select[data_, n_] := (
Subtract[data[[All, n]], data]
// UnitStep
// Total[#, {2}] &
// Pick[data, #, 1] &
)
SeedRandom[0]
data = RandomInteger[9, {1*^6, 15}];
select[data, 8] // Length // RepeatedTiming
(* his code *) // Length // RepeatedTiming
{0.141, 28205}
{0.302, 28205}
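As a rough illustration of what the steps inside select compute, here they are one at a time on a made-up 3×4 matrix with n = 2 (the names m, diff and counts are hypothetical):

```mathematica
(* hypothetical toy data; column 2 is the one being tested *)
m = {{5, 1, 7, 3}, {2, 2, 9, 4}, {6, 4, 3, 8}};

(* element i of the column is subtracted against the whole of row i *)
diff = Subtract[m[[All, 2]], m]
(* {{-4, 0, -6, -2}, {0, 0, -7, -2}, {-2, 0, 1, -4}} *)

(* count, per row, the entries that the tested column is >= to *)
counts = Total[UnitStep[diff], {2}]
(* {1, 2, 2} *)

(* the column always matches itself, so a count of exactly 1 means it is strictly smallest *)
Pick[m, counts, 1]
(* {{5, 1, 7, 3}} *)
```

This is why the Pick pattern is 1 here rather than 0: the tested column is not dropped, so it contributes one guaranteed hit per row.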
The method uses an explicit Subtract; for the reason, see:
- Why are numeric division and subtraction not handled better in Mathematica?
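A minimal sketch of the distinction, using made-up toy data (col and m are hypothetical names). As I understand the linked question, the infix form is stored as a Plus with a negated term, which can mean an extra pass over a large array, while Subtract does the work directly:

```mathematica
(* infix minus is stored as Plus with a Times[-1, ...] term *)
FullForm[Hold[a - b]]
(* Hold[Plus[a, Times[-1, b]]] *)

(* hypothetical toy data: both spellings give the same values *)
col = {1, 2, 4};
m = {{5, 1, 7, 3}, {2, 2, 9, 4}, {6, 4, 3, 8}};
Subtract[col, m] === col - m
(* True *)
```

The results are identical; any difference is in speed only, and is worth timing on your own data and version.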
Even though there is already an accepted answer, the problem at hand lends itself well to a compiled approach for performance gains.
compiledSelect =
Compile[{{a, _Integer, 2}},
Total[Transpose[UnitStep[-a + a[[1 ;; -1, 8]]]]],
CompilationTarget -> "C", Parallelization -> True, "RuntimeOptions" -> "Speed"]
selectLL[data_] := Pick[data, compiledSelect[data], 1]
Comparing this to Mr.Wizard's best solution:
data = RandomInteger[9, {1*^6, 15}];
selectLL[data] == select[data, 8]
selectLL[data] // Length // RepeatedTiming
select[data, 8] // Length // RepeatedTiming
True
{0.212, 27950}
{0.220, 27950}
It has a marginal edge of a few percent in speed. In essence, this is a refactoring of Mr.Wizard's code that minimizes the necessary manipulations, but only Compile lets it be faster.
EDIT
After carefully considering Mr.Wizard's reference I included an explicit Subtract as well:
compiledSelect2 =
Compile[{{a, _Integer, 2}},
Total[Transpose[UnitStep[Subtract[a[[1 ;; -1, 8]], a]]]]
, CompilationTarget -> "C", Parallelization -> True,
"RuntimeOptions" -> "Speed"]
selectLL2[data_] := Pick[data, compiledSelect2[data], 1]
Now including performance tests for Mr.Wizard's simplified function (which I call select2 here). I also leave the original function (see his edit history) for comparison purposes.
data = RandomInteger[9, {1*^6, 15}];
selectLL[data] == select[data, 8] == select2[data, 8] == selectLL2[data]
True
Benchmarking with repeatedly new data:
(Table[data = RandomInteger[9, {1*^6, 15}];
{selectLL[data] // Length // RepeatedTiming // First,
selectLL2[data] // Length // RepeatedTiming // First,
select[data, 8] // Length // RepeatedTiming // First,
select2[data, 8] // Length // RepeatedTiming // First}, {10}]
// Transpose
// Map[Append[#, Mean@#] &]
// Prepend[#, Range[10]~Join~{"Avg."}] &
// Transpose
// Join[{{"N.", "selectLL", "selectLL2", "select", "select2"}}, #] &
// Grid)
All the functions used in our routines are certainly implemented in low-level code where Compile can hardly give much of an edge. As we see, a compiled //Transpose//Total loses out to the uncompiled Total[..., {2}].
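For reference, the row-sum spellings being compared give the same result; a small check on a made-up 0/1 matrix (x is a hypothetical name):

```mathematica
(* hypothetical UnitStep-style output: 2 rows, 3 columns *)
x = {{1, 0, 1}, {0, 0, 1}};

Total[x, {2}]         (* sums within each row: {2, 1} *)
Total[Transpose[x]]   (* same sums via a transposed copy: {2, 1} *)
Total /@ x            (* same sums, mapping over the rows: {2, 1} *)
```

The level specification {2} lets Total sum along each row without building the transposed copy first, which is presumably where the uncompiled version saves time.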
A quick shot at "improving" (maybe in performance, certainly not in readability) Mr.W's code by removing all explicit Functions:
select3[data_, n_] :=
Pick[data, Total[Subtract[data[[All, n]], data] // UnitStep, {2}], 1]
Table[data = RandomInteger[9, {1*^6, 15}];
select3[data, 8] // Length // RepeatedTiming // First, {10}]
{0.195, 0.194, 0.194, 0.195, 0.194, 0.194, 0.194, 0.195, 0.195, 0.194}
Very marginally better, probably not statistically significant.
TODO:
Were the input transposed, could the compiled function be more efficient?
After some tests, it doesn't look that way.
EDIT:
I managed to find a fully compiled version that performs on par with the other solutions. Still not as fast as select2, though.
compiledSelect3 = Compile[{{a, _Integer, 2}},
a[[
Flatten@
Position[
Total[
Transpose[
UnitStep[Subtract[a[[All, 8]], a]]
]
],
1
]
]]
, CompilationTarget -> "C", Parallelization -> True,
"RuntimeOptions" -> "Speed"]
Head-to-head with select2:
Table[data = RandomInteger[9, {1*^6, 15}];
(compiledSelect3[data] // RepeatedTiming // First) -
(select2[data, 8] // RepeatedTiming // First), {10}]
Mean@%
{0.003, 0.004, 0.008, 0.006, 0.006, 0.006, 0.005, 0.006, 0.006, 0.*10^-3}
0.006
3% slower. Close, but no cigar.