Perl 6 list comprehension
How can I overcome it?
By choosing a better algorithm:
Let my \N = 1_000_000_000
. You are interested in the value
[+] grep * %% 5, 1..N
which is the same as
[+] map * * 5, 1..N/5
which is in turn
5 * [+] 1..N/5
Rakudo is smart enough to sum a range in constant time and you'll get your result (almost) instantly.
It's slow, because you're trying to reify a gajilion of items. As an alternative, you could construct a sequence: (5, 10 … 1000000000).sum
, but that's still reifying and keeping around a ton of elements, so it's still slow. You could make a C-style loop and add to the sum for each increment, but that's no fun (and still slowish for large enough numbers).
You can solve this with maths: the numbers divisible by N are multiples of it, and and if we factor out N out of that sequence, we find that the sum of all the numbers you're looking for is N * (1 + 2 + ... + floor 1000000000/N)
. Since that's a consecutive range up in there, we can use its Range.sum
(which knows how to do it fast) to calculate that part. Thus we get:
sub sum-of-stuff (\n, \d) { d * sum 1..n div d; }
say sum-of-stuff 1000000000, 5 # OUTPUT: 100000000500000000
So that's the fastest and sanest way to calculate your problem, but that's not the most fun!
You mentioned concurrency, so let's give that approach a whirl. Our problem is reifying stuff, so we need to figure out a way to chunk up our original range of numbers by the number of cores we have available, and then shoot up the work of reifying and finding multipliers to each individual core. We'll then sum up the stuff in each of the chunk in each of the core, and then come back to the main thread and sum up the sums of the chunks to get the final answers. In code, that'd look something like this:
constant N = 10000;
constant DIV = 5;
constant CORES = 32;
constant batch = N div CORES;
(0, batch … N, N).rotor(2 => -1).flat.map({$^a^..$^b})
.race(:batch :degree(CORES)).map(*.grep(* %% DIV).sum).sum.say;
The batch
is the size of the chunk each core needs to process and
here's the explanation of the one-liner that does all the work broken down for each bit:
We use the sequence operator to create a sequence of 0, batch, 2*batch, 3*batch
and so on, up to N
. Since we want N
to be part of it, we include it the second time:
(0, batch … N, N)
What we want now is to use that sequence to create a bunch of Range
objects, we want to re-use elements in the sequence, so we use .rotor
with batch of 2 and backstep of 1, then flatten the sublists and use .map
to create the Range objects (.map: *^..*
looks so much nicer, but alas, the Whatever Stars don't want to curry in that arrangement):
.rotor(2 => -1).flat.map({$^a^..$^b})
Now is the fun bit, we use the .race
method to create a HyperSeq
so it's evaluated using all of our cores. Its :batch
named argument lets you specify how many elements to process per batch and its :degree
specifies how many threads to use. We already batched up our data, so for :batch
we use 1
. And for :degree
we use the number of our cores. Why didn't we tell it to batch up our stuff? Because reification is our enemy and we want to reify in separate threads. Telling it to batch up would reify everything in one thread instead:
.race(:batch :degree(CORES))
So now we got our HyperSeq in hand, we can map over it. Each item is our Range
object sized for a batch, recall. So we'll call .grep
on it, looking for numbers divisible by the divisor we want, and then we'll call .sum
:
.map(*.grep(* %% DIV).sum)
The last cherry on the top, we want to sum up the sums of each chunk and print the result, so we call
.sum.say;
Tada!
You can also re-write the fun bit this way, and use Promises instead of .race
:
say sum await (0, batch … N, N).rotor(2 => -1).flat.map:
{ start sum grep * %% DIV, $^a^..$^b }
Similar and a bit shorter. The map that used to make Ranges for us now also fires up a Promise (using start
keyword) that greps and sums the chunk. At the start of the line, we added await
to wait for the result of all the promises and then sum up the results of that.
It's still slow as hell and won't help your original problem, hence why you should use a good algo instead ;)
Cheers.
How long does it take with a lower value than 1_000_000_000, say 1_000_000? On my machine that takes about 3 seconds. Extrapolating from that, 1000x as many values to check, would mean it would take something like 3000 seconds.
To make things faster, you could use the %% (is divisible by) operator, instead of % (modulo):
($_ if $_ %% 5 for 1..1000000000).sum
FWIW, I expected this to be faster. I will look into making %% faster, but I'm afraid you will still look at hour-scale times for your algorithm.