How to build a list of elements satisfying a certain property without building a list of all elements first?
You can use Do
together with Reap
-Sow
:
Reap[Do[If[EvenQ[i], Sow[i]], {i, 100}]][[2, 1]] // Short
(*{2, 4, 6, 8, 10, <<40>>, 92, 94, 96, 98, 100}*)
EDIT
It is interesting to compare the Reap
-Sow
approach with the linked lists approach.
Reap
-Sow
approach:
ClearAll[ReapSow];
SetAttributes[ReapSow, HoldAll];
ReapSow[expr_, cond_, spec_] := Reap[
Do[If[cond, Sow[expr]], spec]
][[2, 1]];
Linked lists approach
ClearAll[LinkedLists];
SetAttributes[LinkedLists, HoldAll];
LinkedLists[expr_, cond_, spec_] := CompoundExpression[
$temp = f[],
Do[
If[cond, $temp = f[$temp, expr]],
spec
],
List @@ Flatten[$temp]
];
here I intentionally do not introduce local variables not to fool RepeatedTiming
by, e.g. Module
overhead.
Internal`Bag
cheat:
ClearAll[ReapSowCompiled];
SetAttributes[ReapSowCompiled, HoldAll];
ReapSowCompiled[expr_, cond_, spec_] := Compile[
{},
Module[
{bag = Internal`Bag[{0}]},
Do[If[cond, Internal`StuffBag[bag, expr]], spec];
Internal`BagPart[bag, All]
],
CompilationTarget -> "C"
];
cf = ReapSowCompiled[i, EvenQ[i], {i, 10^4}];
Tests in a fresh kernel:
(*A fresh kernel*)
mem1 = MemoryInUse[];
a = ReapSow[i, EvenQ[i], {i, 10^4}]; // RepeatedTiming // First
MemoryInUse[] - mem1
(*0.01*)
(*58992*)
(*A fresh kernel*)
mem1 = MemoryInUse[];
b = LinkedLists[i, EvenQ[i], {i, 10^4}]; // RepeatedTiming // First
MemoryInUse[] - mem1
(*0.01*)
(*410696*)
(*A fresh kernel*)
mem1 = MemoryInUse[];
c = cf[] // Rest; // AbsoluteTiming// First
MemoryInUse[] - mem1
(*0.000326986*)
(*47184*)
So, a complex tree structure of pointers seems to involve a bigger memory overhead then an optimized Reap
-Sow
or Compiled
cheat.
As compiled variant of Sow
and Reap
, you can use the undocumented Internal`Bag
:
cf = Compile[{{n, _Integer}},
Module[{bag = Internal`Bag[{0}]},
Do[If[EvenQ[i], Internal`StuffBag[bag, i]], {i, n}];
Internal`BagPart[bag, All]
],
CompilationTarget -> "C"
];
n = 10000000;
a = Reap[Do[If[EvenQ[i], Sow[i]], {i, n}]][[2, 1]]; // AbsoluteTiming // First
b = Rest[cf[n]]; // AbsoluteTiming // First
a == b
5.39257
0.214701
True