Difference between Iterator and Spliterator in Java8

The names are pretty much self-explanatory, to me. Spliterator == Splittable Iterator: it can split some source, and it can iterate over it too. It has roughly the same functionality as an Iterator, with the extra ability to split itself into multiple pieces: this is what trySplit is for. Splitting is needed for parallel processing.

An Iterator has no notion of size: you can traverse elements only via hasNext/next. A Spliterator can report its size (which also lets other operations be optimized internally), either an exact one via getExactSizeIfKnown or an approximate one via estimateSize.
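To make the size point concrete, here is a minimal sketch. The class and method names (SizeDemo, sizeOfList, sizeOfWrappedIterator) are made up for illustration; only the Spliterator and Spliterators calls come from the JDK.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;

class SizeDemo {
    // A list-backed spliterator is SIZED, so it can report an exact size.
    static long sizeOfList(List<String> list) {
        return list.spliterator().getExactSizeIfKnown();
    }

    // A plain Iterator wrapped without size information cannot:
    // getExactSizeIfKnown() returns -1 when the size is not known.
    static long sizeOfWrappedIterator(Iterator<String> it) {
        Spliterator<String> s = Spliterators.spliteratorUnknownSize(it, 0);
        return s.getExactSizeIfKnown();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Apple", "Banana", "Orange");
        System.out.println(sizeOfList(list));                       // 3
        System.out.println(sizeOfWrappedIterator(list.iterator())); // -1
    }
}
```

When the exact size is unknown, estimateSize still returns a value, but it is only an upper-bound guess (Long.MAX_VALUE for a wrapped Iterator of unknown size).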

On the other hand, tryAdvance is to a Spliterator what hasNext/next is to an Iterator, but it's a single method, which is much easier to reason about, IMO. Related to this is forEachRemaining, which in the default implementation delegates to tryAdvance, but implementations are free to override it with something more efficient (see ArrayList for example).
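A side-by-side sketch of the two traversal styles, assuming a simple list as the source. TraversalDemo and its two helper methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;

class TraversalDemo {
    // Iterator style: two calls per element (hasNext, then next).
    static List<String> viaIterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Spliterator style: tryAdvance both checks for and consumes an element,
    // returning false once nothing is left.
    static List<String> viaSpliterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Spliterator<String> s = source.spliterator();
        while (s.tryAdvance(out::add)) {
            // the consumer passed to tryAdvance does the work
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Apple", "Banana", "Orange");
        System.out.println(viaIterator(list));
        System.out.println(viaSpliterator(list));
    }
}
```

Both methods collect the same elements in the same order; the difference is only in how the "is there another element?" question is asked.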

A Spliterator is also a "smarter" Iterator, via its characteristics such as DISTINCT or SORTED (which you need to report correctly when implementing your own Spliterator). These flags are used internally to skip unnecessary operations; see for example this optimization:

 someStream().map(x -> y).count();

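Because map does not change the number of elements, the map can be skipped entirely when the source reports an exact size (SIZED), since all we do is count. Note that this particular shortcut is something a stream implementation may apply from Java 9 onward; it is not guaranteed.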
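The characteristics themselves can be inspected with hasCharacteristics. The following is a small sketch over three standard collections; CharacteristicsDemo and its has helper are hypothetical names, and the flags shown are those reported by the JDK's own collection spliterators.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

class CharacteristicsDemo {
    // Thin wrapper so the call sites below read naturally.
    static boolean has(Spliterator<?> s, int flag) {
        return s.hasCharacteristics(flag);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Apple", "Banana", "Orange");

        // A list knows its size but promises neither sorting nor uniqueness.
        System.out.println(has(list.spliterator(), Spliterator.SIZED));    // true
        System.out.println(has(list.spliterator(), Spliterator.SORTED));   // false

        // A TreeSet is both sorted and free of duplicates.
        Spliterator<String> tree = new TreeSet<>(list).spliterator();
        System.out.println(has(tree, Spliterator.SORTED | Spliterator.DISTINCT)); // true

        // A HashSet has no duplicates but no defined order.
        Spliterator<String> hash = new HashSet<>(list).spliterator();
        System.out.println(has(hash, Spliterator.DISTINCT)); // true
        System.out.println(has(hash, Spliterator.SORTED));   // false
    }
}
```

A stream built on top of such a spliterator can use these flags, for instance to treat sorted() or distinct() as redundant when the source already guarantees the property.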

You can create a Spliterator around an Iterator if you need to, via:

Spliterators.spliteratorUnknownSize(yourIterator, properties)
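One common reason to do this is to get a Stream out of an existing Iterator. A minimal sketch, assuming the iterator yields elements in a meaningful order (hence the ORDERED flag); IteratorToStreamDemo and upperCase are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

class IteratorToStreamDemo {
    static List<String> upperCase(Iterator<String> it) {
        // Wrap the iterator; ORDERED is an assumption about the source.
        Spliterator<String> s =
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        // false = sequential stream
        return StreamSupport.stream(s, false)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("Apple", "Banana", "Orange").iterator();
        System.out.println(upperCase(it)); // [APPLE, BANANA, ORANGE]
    }
}
```

Because the wrapped spliterator has no size information, it splits poorly; a parallel stream built this way usually gains little.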

An Iterator is a simple representation of a series of elements that can be iterated over.

For example:

 List<String> list = Arrays.asList("Apple", "Banana", "Orange");
 Iterator<String> i = list.iterator();
 i.next();
 i.forEachRemaining(System.out::println);

Output:

Banana
Orange

A Spliterator can be used to split a given set of elements into multiple sets so that we can perform operations or calculations on each set independently in different threads, possibly taking advantage of parallelism. It is designed as a parallel analogue of Iterator. Besides collections, the source of elements covered by a Spliterator could be, for example, an array, an IO channel, or a generator function.

The main methods in the Spliterator interface are described below.

- tryAdvance() and forEachRemaining()

With tryAdvance(), we can traverse the underlying elements one by one (just like Iterator.next()). If a remaining element exists, this method performs the consumer action on it and returns true; otherwise it returns false.

For sequential bulk traversal we can use forEachRemaining():

 List<String> list = Arrays.asList("Apple", "Banana", "Orange");
 Spliterator<String> s = list.spliterator();
 s.tryAdvance(System.out::println);
 System.out.println(" --- bulk traversal");
 s.forEachRemaining(System.out::println);

 System.out.println(" --- attempting tryAdvance again");
 boolean b = s.tryAdvance(System.out::println);
 System.out.println("Element exists: "+b);

Output:

Apple
 --- bulk traversal
Banana
Orange
 --- attempting tryAdvance again
Element exists: false

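- trySplit()

Splits this spliterator into two and returns the new one, or returns null if this spliterator cannot be split. After a successful split, the returned spliterator and the original one cover disjoint parts of the source: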

  List<String> list = Arrays.asList("Apple", "Banana", "Orange");

  Spliterator<String> s = list.spliterator();
  Spliterator<String> s1 = s.trySplit();

  s.forEachRemaining(System.out::println);
  System.out.println("-- traversing the other half of the spliterator --- ");
  s1.forEachRemaining(System.out::println);

Output:

Banana
Orange
-- traversing the other half of the spliterator ---
Apple

An ideal trySplit method should divide its elements exactly in half, allowing balanced parallel computation.

The splitting process is also called 'partitioning' or 'decomposition'.
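To show how the two halves can actually be processed in different threads, here is a rough sketch that sums string lengths using one extra thread. SplitDemo and totalLength are hypothetical names; in real code a parallel stream or a ForkJoinPool would normally do this work for you.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicInteger;

class SplitDemo {
    static int totalLength(List<String> words) {
        Spliterator<String> rest = words.spliterator();
        // trySplit may return null (e.g. too few elements), so check it.
        Spliterator<String> prefix = rest.trySplit();
        AtomicInteger total = new AtomicInteger();

        Thread worker = null;
        if (prefix != null) {
            // One part is traversed on a separate thread...
            worker = new Thread(() ->
                    prefix.forEachRemaining(w -> total.addAndGet(w.length())));
            worker.start();
        }
        // ...while the current thread traverses the other part.
        rest.forEachRemaining(w -> total.addAndGet(w.length()));

        if (worker != null) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Apple", "Banana", "Orange");
        System.out.println(totalLength(list)); // 17
    }
}
```

Each spliterator is used by a single thread only, which is the intended usage: a Spliterator instance is not expected to be thread-safe, so the parts are handed off rather than shared.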
