Does Java 8 provide a good way to repeat a value or function?
For this specific example, you could do:
IntStream.rangeClosed(1, 8)
.forEach(System.out::println);
If you need a step different from 1, you can use a mapping function, for example, for a step of 2:
IntStream.rangeClosed(1, 8)
.map(i -> 2 * i - 1)
.forEach(System.out::println);
Or build a custom iteration and limit the size of the iteration:
IntStream.iterate(1, i -> i + 2)
.limit(8)
.forEach(System.out::println);
Here's another technique I ran across the other day:
Collections.nCopies(8, 1)
.stream()
.forEach(i -> System.out.println(i));
The Collections.nCopies
call creates a List
containing n
copies of whatever value you provide. In this case it's the boxed Integer
value 1. Of course it doesn't actually create a list with n
elements; it creates a "virtualized" list that contains only the value and the length, and any call to get
within range just returns the value. The nCopies
method has been around since the Collections Framework was introduced way back in JDK 1.2. Of course, the ability to create a stream from its result was added in Java SE 8.
Big deal, another way to do the same thing in about the same number of lines.
However, this technique is faster than the IntStream.generate
and IntStream.iterate
approaches, and surprisingly, it's also faster than the IntStream.range
approach.
For iterate
and generate
the result is perhaps not too surprising. The streams framework (really, the Spliterators for these streams) is built on the assumption that the lambdas will potentially generate different values each time, and that they will generate an unbounded number of results. This makes parallel splitting particularly difficult. The iterate
method is also problematic for this case because each call requires the result of the previous one. So the streams using generate
and iterate
don't do very well for generating repeated constants.
The relatively poor performance of range
is surprising. This too is virtualized, so the elements don't actually all exist in memory, and the size is known up front. This should make for a fast and easily parallelizable spliterator. But it surprisingly didn't do very well. Perhaps the reason is that range
has to compute a value for each element of the range and then call a function on it. But this function just ignores its input and returns a constant, so I'm surprised this isn't inlined and killed.
The Collections.nCopies
technique has to do boxing/unboxing in order to handle the values, since there are no primitive specializations of List
. Since the value is the same every time, it's basically boxed once and that box is shared by all n
copies. I suspect boxing/unboxing is highly optimized, even intrinsified, and it can be inlined well.
Here's the code:
public static final int LIMIT = 500_000_000;
public static final long VALUE = 3L;
public long range() {
return
LongStream.range(0, LIMIT)
.parallel()
.map(i -> VALUE)
.map(i -> i % 73 % 13)
.sum();
}
public long ncopies() {
return
Collections.nCopies(LIMIT, VALUE)
.parallelStream()
.mapToLong(i -> i)
.map(i -> i % 73 % 13)
.sum();
}
And here are the JMH results: (2.8GHz Core2Duo)
Benchmark Mode Samples Mean Mean error Units
c.s.q.SO18532488.ncopies thrpt 5 7.547 2.904 ops/s
c.s.q.SO18532488.range thrpt 5 0.317 0.064 ops/s
There is a fair amount of variance in the ncopies version, but overall it seems comfortably 20x faster than the range version. (I'd be quite willing to believe that I've done something wrong, though.)
I'm surprised at how well the nCopies
technique works. Internally it doesn't do very much special, with the stream of the virtualized list simply being implemented using IntStream.range
! I had expected that it would be necessary to create a specialized spliterator to get this to go fast, but it already seems to be pretty good.