Zipping unequal lists in Python into a list that does not drop any element from the longer list
Normally, you use itertools.zip_longest for this:
>>> import itertools
>>> a = [1, 2, 3]
>>> b = [9, 10]
>>> for i in itertools.zip_longest(a, b): print(i)
...
(1, 9)
(2, 10)
(3, None)
But zip_longest pads the shorter iterable with Nones (or whatever value you pass as the fillvalue= parameter). If that's not what you want, you can use a comprehension to filter out the Nones:
for i in (tuple(p for p in pair if p is not None)
          for pair in itertools.zip_longest(a, b)):
    print(i)
but note that if either of the iterables has None values, this will filter them out too. If you don't want that, define your own object for fillvalue= and filter that instead of None:
_marker = object()

def zip_longest_no_fill(a, b):
    for i in itertools.zip_longest(a, b, fillvalue=_marker):
        yield tuple(x for x in i if x is not _marker)

list(zip_longest_no_fill(a, b))  # [(1, 9), (2, 10), (3,)]
On Python 2, use itertools.izip_longest instead.
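The sentinel idea above is not limited to two inputs. As a minimal sketch, assuming you want the same behaviour for any number of iterables (the names zip_longest_ragged and _MISSING are made up for this example, not part of itertools):

```python
import itertools

# Private sentinel: a fresh object() is never identical to any user value,
# so real None entries in the inputs are preserved.
_MISSING = object()

def zip_longest_ragged(*iterables):
    """Like itertools.zip_longest, but drop missing slots instead of padding."""
    for group in itertools.zip_longest(*iterables, fillvalue=_MISSING):
        yield tuple(x for x in group if x is not _MISSING)

a = [1, 2, 3]
b = [9, 10]
c = [None]
print(list(zip_longest_ragged(a, b, c)))
# [(1, 9, None), (2, 10), (3,)]
```

One thing to keep in mind with more than two inputs: once slots are dropped, a short tuple no longer tells you which iterable each value came from.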
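Another way, on Python 2 only, is map with None as the function (on Python 3, map(None, a, b) raises a TypeError when iterated):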
a = [1, 2, 3]
b = [9, 10]
c = map(None, a, b)
Although that will also contain (3, None) instead of (3,). To get the latter, here's a fun line:
c = (tuple(y for y in x if y is not None) for x in map(None, a, b))
It's not too hard to just write the explicit Python to do the desired operation:
def izip_short(a, b):
    ia = iter(a)
    ib = iter(b)
    for x in ia:
        try:
            y = next(ib)
            yield (x, y)
        except StopIteration:
            yield (x,)
            break
    for x in ia:
        yield (x,)
    for y in ib:
        yield (None, y)

a = [1, 2, 3]
b = [9, 10]
list(izip_short(a, b))
list(izip_short(b, a))
I wasn't sure how you would want to handle the b sequence being longer than the a sequence, so I just stuff in a None for the first value in the tuple in that case.
Get an explicit iterator for each sequence. Run the a iterator as a for loop, while manually calling next(ib) to get the next value from the b sequence. If we get a StopIteration on the b sequence, we yield the current x on its own and break out of the loop; the second for x in ia: loop then gets the rest of the a sequence, and after that for y in ib: does nothing because that iterator is already exhausted. Alternatively, if the first for x in ia: loop exhausts the a iterator, the second for x in ia: loop does nothing, but there could be values left in the b sequence, and the for y in ib: loop collects them.
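To see the walkthrough in action, here is a small self-contained sketch that repeats izip_short from above and runs it on both orderings. The third call is an extra input chosen for illustration: it shows that a real None in the a sequence survives, because nothing is filtered by value.

```python
def izip_short(a, b):
    ia = iter(a)
    ib = iter(b)
    for x in ia:
        try:
            y = next(ib)
            yield (x, y)
        except StopIteration:
            # b ran out first: emit the x we already consumed, then stop pairing
            yield (x,)
            break
    # leftover items of a (only if b ran out first)
    for x in ia:
        yield (x,)
    # leftover items of b (only if a ran out first)
    for y in ib:
        yield (None, y)

print(list(izip_short([1, 2, 3], [9, 10])))  # [(1, 9), (2, 10), (3,)]
print(list(izip_short([9, 10], [1, 2, 3])))  # [(9, 1), (10, 2), (None, 3)]
print(list(izip_short([1, None, 3], [9])))   # [(1, 9), (None,), (3,)]
```

Because the function only uses iter() and next(), it should work just as well with generators and other one-shot iterables as with lists.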