Difference between Iterator and Spliterator in Java8
The names are pretty much self-explanatory, to me. Spliterator
== Splittable Iterator : it can split some source, and it can iterate it too. It roughly has the same functionality as an Iterator
, but with the extra thing that it can potentially split into multiple pieces: this is what trySplit
is for. Splitting is needed for parallel processing.
An Iterator
always has an unknown size: you can traverse elements only via hasNext/next
; a Spliterator
can provide the size (thus improving other operations too internally); either an exact one via getExactSizeIfKnown
or a approximate via estimateSize
.
On the other hand, tryAdvance
is what hasNext/next
is from an Iterator
, but it's a single method, much easier to reason about, IMO. Related to this, is forEachRemaining
which in the default implementation delegates to tryAdvance
, but it does not have to always be like this (see ArrayList
for example).
A Spliterator also is a "smarter" Iterator, via its internal properties like DISTINCT
or SORTED
, etc (which you need to provide correctly when implementing your own Spliterator
). These flags are used internally to disable unnecessary operations; see for example this optimization:
someStream().map(x -> y).count();
Because size does not change in the case of the stream, the map
can be skipped entirely, since all we do is counting.
You can create a Spliterator around an Iterator if you need to, via:
Spliterators.spliteratorUnknownSize(yourIterator, properties)
An Iterator
is a simple representation of a series of elements that can be iterated over.
eg:
List<String> list = Arrays.asList("Apple", "Banana", "Orange");
Iterator<String> i = list.iterator();
i.next();
i.forEachRemaining(System.out::println);
#output
Banana
Orange
A Spliterator
can be used to split given element set into multiple sets so that we can perform some kind of operations/calculations on each set in different threads independently, possibly taking advantage of parallelism. It is designed as a parallel analogue of Iterator. Other than collections, the source of elements covered by a Spliterator could be, for example, an array, an IO channel, or a generator function.
There are 2 main methods in the Spliterator
interface.
- tryAdvance() and forEachRemaining()
With tryAdvance(), we can traverse underlying elements one by one (just like Iterator.next()). If a remaining element exists, this method performs the consumer action on it, returning true; else returns false.
For sequential bulk traversal we can use forEachRemaining():
List<String> list = Arrays.asList("Apple", "Banana", "Orange");
Spliterator<String> s = list.spliterator();
s.tryAdvance(System.out::println);
System.out.println(" --- bulk traversal");
s.forEachRemaining(System.out::println);
System.out.println(" --- attempting tryAdvance again");
boolean b = s.tryAdvance(System.out::println);
System.out.println("Element exists: "+b);
Output:
Apple
--- bulk traversal
Banana
Orange
--- attempting tryAdvance again
Element exists: false
- Spliterator trySplit()
Splits this spliterator into two and returns the new one:
List<String> list = Arrays.asList("Apple", "Banana", "Orange");
Spliterator<String> s = list.spliterator();
Spliterator<String> s1 = s.trySplit();
s.forEachRemaining(System.out::println);
System.out.println("-- traversing the other half of the spliterator --- ");
s1.forEachRemaining(System.out::println);
Output:
Banana
Orange
-- traversing the other half of the spliterator ---
Apple
An ideal trySplit method should divide its elements exactly in half, allowing balanced parallel computation.
The splitting process is termed as 'partitioning' or 'decomposition' as well.