Itertools zip_longest with the first item of each sub-list as padding value instead of the default None
You can peek into each of the iterators via `next` in order to extract the first item ("head"), then create a `sentinel` object that marks the end of the iterator, and finally chain everything back together in the following way: `head -> remainder_of_iterator -> sentinel -> it.repeat(head)`.
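As a minimal sketch of that layout for a single iterator (the input `"abc"` and the names `chained` and `first_six` are only illustrative), a finite prefix of the chained iterator looks like this:

```python
import itertools as it

iterator = iter("abc")
head = next(iterator)        # peek: this removes 'a' from the iterator
sentinel = object()          # unique end-of-data marker

# head -> remainder_of_iterator -> sentinel -> it.repeat(head)
chained = it.chain((head,), iterator, (sentinel,), it.repeat(head))

# The chain is infinite, so only look at a finite prefix.
first_six = list(it.islice(chained, 6))
print(["<sentinel>" if x is sentinel else x for x in first_six])
# ['a', 'b', 'c', '<sentinel>', 'a', 'a']
```

Putting `head` back at the front of the chain is what compensates for the item consumed by `next`.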
This uses `it.repeat` to replay the first item ad infinitum once the end of the iterator has been reached, so we need a way to stop that process once the last iterator hits its `sentinel` object. For this we can (ab)use the fact that `map` stops iterating if the mapped function raises (or leaks) a `StopIteration`, such as from `next` invoked on an already exhausted iterator. Alternatively, we can use the 2-argument form of `iter` to stop on a `sentinel` object (see below).
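Both stopping mechanisms can be tried in isolation. The following is a small sketch; `budget`, `passthrough`, `stop` and `source` are made-up names for illustration and are not part of the solution itself:

```python
import itertools as it

# 1) A StopIteration leaking out of the mapped function ends the map.
budget = it.repeat(None, 2)   # hypothetical helper: permits exactly two calls

def passthrough(x):
    next(budget)              # the third call raises StopIteration
    return x

leaked = list(map(passthrough, "abcde"))
print(leaked)                 # ['a', 'b']

# 2) iter(callable, sentinel) keeps calling until the sentinel is returned.
stop = object()
source = iter([1, 2, stop, 3])
until_stop = list(iter(source.__next__, stop))
print(until_stop)             # [1, 2]
```

Note that the first mechanism relies on `map` being a plain iterator; inside a generator function a leaked `StopIteration` would be turned into a `RuntimeError` (PEP 479).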
So we can map the chained iterators over a function that checks for each item whether it is `sentinel` and performs the following steps:
- if the item is `sentinel`, then consume, via `next`, a dedicated iterator that yields one item fewer than the total number of iterators (hence leaking `StopIteration` for the last sentinel) and replace the `sentinel` with the corresponding `head`;
- else just return the original item.
Finally we can just `zip` the iterators together: it will stop on the last one hitting its `sentinel` object, i.e. performing a "zip-longest".
In summary, the following function performs the steps described above:

    import itertools as it

    def solution(*iterables):
        iterators = [iter(i) for i in iterables]  # make sure we're operating on iterators
        heads = [next(i) for i in iterators]  # requires each of the iterables to be non-empty
        sentinel = object()
        iterators = [it.chain((head,), iterator, (sentinel,), it.repeat(head))
                     for iterator, head in zip(iterators, heads)]
        # Create a dedicated iterator object that will be consumed each time a 'sentinel' object is found.
        # For the sentinel corresponding to the last iterator in 'iterators' this will leak a StopIteration.
        running = it.repeat(None, len(iterators) - 1)
        # StopIteration causes the map to stop iterating
        iterators = [map(lambda x, h: next(running) or h if x is sentinel else x,
                         iterator, it.repeat(head))
                     for iterator, head in zip(iterators, heads)]
        return zip(*iterators)

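A short usage example may help to check the behaviour. The function is repeated here only so that the snippet runs on its own; the sample inputs are arbitrary:

```python
import itertools as it

def solution(*iterables):
    iterators = [iter(i) for i in iterables]
    heads = [next(i) for i in iterators]  # each iterable must be non-empty
    sentinel = object()
    iterators = [it.chain((head,), iterator, (sentinel,), it.repeat(head))
                 for iterator, head in zip(iterators, heads)]
    running = it.repeat(None, len(iterators) - 1)
    iterators = [map(lambda x, h: next(running) or h if x is sentinel else x,
                     iterator, it.repeat(head))
                 for iterator, head in zip(iterators, heads)]
    return zip(*iterators)

# Shorter inputs are padded with their own first item.
rows = list(solution([1, 2, 3], [4, 5], [6]))
print(rows)   # [(1, 4, 6), (2, 5, 6), (3, 4, 6)]

# One-shot iterators work too, since everything is evaluated lazily.
lazy_rows = list(solution(iter(range(3)), "x"))
print(lazy_rows)   # [(0, 'x'), (1, 'x'), (2, 'x')]
```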
If leaking `StopIteration` from the mapped function in order to terminate the `map` iterator feels too awkward, then we can slightly modify the definition of `running` to yield an additional `sentinel` and use the 2-argument form of `iter` in order to stop on `sentinel`:

    running = it.chain(it.repeat(None, len(iterators) - 1), (sentinel,))
    iterators = [...]  # here the conversion to map objects remains unchanged
    return zip(*[iter(i.__next__, sentinel) for i in iterators])

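Assembled into a complete function, this variant could look roughly as follows. This is a sketch only; the name `solution_iter` is made up here to distinguish it from the first version:

```python
import itertools as it

def solution_iter(*iterables):
    iterators = [iter(i) for i in iterables]
    heads = [next(i) for i in iterators]  # each iterable must be non-empty
    sentinel = object()
    iterators = [it.chain((head,), iterator, (sentinel,), it.repeat(head))
                 for iterator, head in zip(iterators, heads)]
    # After len(iterators) - 1 replacements, 'running' hands out the sentinel itself.
    running = it.chain(it.repeat(None, len(iterators) - 1), (sentinel,))
    # 'None or h' gives the head; 'sentinel or h' gives the sentinel (a plain object is truthy).
    iterators = [map(lambda x, h: next(running) or h if x is sentinel else x,
                     iterator, it.repeat(head))
                 for iterator, head in zip(iterators, heads)]
    # The 2-argument iter stops as soon as a map object returns the sentinel.
    return zip(*[iter(i.__next__, sentinel) for i in iterators])

rows = list(solution_iter([1, 2, 3], [4, 5], [6]))
print(rows)   # [(1, 4, 6), (2, 5, 6), (3, 4, 6)]
```

Here no exception leaves the mapped function: the last sentinel passes through the `map` unchanged and is caught by `iter(i.__next__, sentinel)` instead.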
If the name resolution for `sentinel` and `running` from inside the mapped function is a concern, they can be included as arguments to that function:

    iterators = [map(lambda x, h, s, r: next(r) or h if x is s else x,
                     iterator, it.repeat(head), it.repeat(sentinel), it.repeat(running))
                 for iterator, head in zip(iterators, heads)]