Does Java 8 provide a good way to repeat a value or function?

For this specific example, you could do:

IntStream.rangeClosed(1, 8)
         .forEach(System.out::println);

If you need a step different from 1, you can use a mapping function, for example, for a step of 2:

IntStream.rangeClosed(1, 8)
         .map(i -> 2 * i - 1)
         .forEach(System.out::println);

Or build a custom iteration and limit the size of the iteration:

IntStream.iterate(1, i -> i + 2)
         .limit(8)
         .forEach(System.out::println);

Here's another technique I ran across the other day:

Collections.nCopies(8, 1)
           .stream()
           .forEach(i -> System.out.println(i));

The Collections.nCopies call creates a List containing n copies of whatever value you provide. In this case it's the boxed Integer value 1. Of course it doesn't actually create a list with n elements; it creates a "virtualized" list that contains only the value and the length, and any call to get within range just returns the value. The nCopies method has been around since the Collections Framework was introduced way back in JDK 1.2. Of course, the ability to create a stream from its result was added in Java SE 8.

Big deal, another way to do the same thing in about the same number of lines.

However, this technique is faster than the IntStream.generate and IntStream.iterate approaches, and surprisingly, it's also faster than the IntStream.range approach.

For iterate and generate the result is perhaps not too surprising. The streams framework (really, the Spliterators for these streams) is built on the assumption that the lambdas will potentially generate different values each time, and that they will generate an unbounded number of results. This makes parallel splitting particularly difficult. The iterate method is also problematic for this case because each call requires the result of the previous one. So the streams using generate and iterate don't do very well for generating repeated constants.

The relatively poor performance of range is surprising. This too is virtualized, so the elements don't actually all exist in memory, and the size is known up front. This should make for a fast and easily parallelizable spliterator. But it surprisingly didn't do very well. Perhaps the reason is that range has to compute a value for each element of the range and then call a function on it. But this function just ignores its input and returns a constant, so I'm surprised this isn't inlined and killed.

The Collections.nCopies technique has to do boxing/unboxing in order to handle the values, since there are no primitive specializations of List. Since the value is the same every time, it's basically boxed once and that box is shared by all n copies. I suspect boxing/unboxing is highly optimized, even intrinsified, and it can be inlined well.

Here's the code:

    public static final int LIMIT = 500_000_000;
    public static final long VALUE = 3L;

    public long range() {
        return
            LongStream.range(0, LIMIT)
                .parallel()
                .map(i -> VALUE)
                .map(i -> i % 73 % 13)
                .sum();
}

    public long ncopies() {
        return
            Collections.nCopies(LIMIT, VALUE)
                .parallelStream()
                .mapToLong(i -> i)
                .map(i -> i % 73 % 13)
                .sum();
}

And here are the JMH results: (2.8GHz Core2Duo)

Benchmark                    Mode   Samples         Mean   Mean error    Units
c.s.q.SO18532488.ncopies    thrpt         5        7.547        2.904    ops/s
c.s.q.SO18532488.range      thrpt         5        0.317        0.064    ops/s

There is a fair amount of variance in the ncopies version, but overall it seems comfortably 20x faster than the range version. (I'd be quite willing to believe that I've done something wrong, though.)

I'm surprised at how well the nCopies technique works. Internally it doesn't do very much special, with the stream of the virtualized list simply being implemented using IntStream.range! I had expected that it would be necessary to create a specialized spliterator to get this to go fast, but it already seems to be pretty good.

Tags:

Java

Java 8