Length of generator output

The easiest way is probably just sum(1 for _ in gen) where gen is your generator.


For those who would like to know the summary of that discussion. The final top scores for counting a 50 million-lengthed generator expression using:

  • len(list(gen)),
  • len([_ for _ in gen]),
  • sum(1 for _ in gen),
  • ilen(gen) (from more_itertools),
  • reduce(lambda c, i: c + 1, gen, 0),

sorted by performance of execution (including memory consumption), will make you surprised:

#1: test_list.py:8: 0.492 KiB
    gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))
('list, sec', 1.9684218849870376)

#2: test_list_compr.py:8: 0.867 KiB
    gen = (i for i in data*1000); t0 = monotonic(); len([i for i in gen])
('list_compr, sec', 2.5885991149989422)

#3: test_sum.py:8: 0.859 KiB
    gen = (i for i in data*1000); t0 = monotonic(); sum(1 for i in gen); t1 = monotonic()
('sum, sec', 3.441088170016883)

#4: more_itertools/more.py:413: 1.266 KiB
    d = deque(enumerate(iterable, 1), maxlen=1)
   
    test_ilen.py:10: 0.875 KiB
    gen = (i for i in data*1000); t0 = monotonic(); ilen(gen)
('ilen, sec', 9.812256851990242)

#5: test_reduce.py:8: 0.859 KiB
    gen = (i for i in data*1000); t0 = monotonic(); reduce(lambda counter, i: counter + 1, gen, 0)
('reduce, sec', 13.436614598002052)

So, len(list(gen)) is the most frequent and less memory consumable