How to print the progress of a list comprehension in Python?
tqdm
Using the `tqdm` package, a fast and versatile progress bar utility:
pip install tqdm
from tqdm import tqdm
def process(token):
    return token['text']
l1 = [{'text': k} for k in range(5000)]
l2 = [process(token) for token in tqdm(l1)]
100%|███████████████████████████████████| 5000/5000 [00:00<00:00, 2326807.94it/s]
No dependencies
1/ Use a side function
def report(index):
    if index % 1000 == 0:
        print(index)
def process(token, index, report=None):
    if report:
        report(index)
    return token['text']
l1 = [{'text': k} for k in range(5000)]
l2 = [process(token, i, report) for i, token in enumerate(l1)]
2/ Use the `and` and `or` operators
def process(token):
    return token['text']
l1 = [{'text': k} for k in range(5000)]
l2 = [(i % 1000 == 0 and print(i)) or process(token) for i, token in enumerate(l1)]
3/ Use both
def process(token):
    return token['text']
def report(i):
    i % 1000 == 0 and print(i)
l1 = [{'text': k} for k in range(5000)]
l2 = [report(i) or process(token) for i, token in enumerate(l1)]
All 3 methods print:
0
1000
2000
3000
4000
How 2 works
- `i % 1000 == 0 and print(i)`: `and` only evaluates its second operand if the first one is `True`, so it only prints when `i % 1000 == 0`.
- `or process(token)`: `or` returns its first operand if that operand is truthy, otherwise its second. Here the first operand is always falsy, so `process(token)` is always evaluated and returned.
  - If `i % 1000 != 0`, the first operand is `False` and `process(token)` is added to the list.
  - Else, the first operand is `None` (because `print` returns `None`) and likewise `or` adds `process(token)` to the list.
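The two cases described above can be checked in isolation. This is a minimal sketch; the variable names `on_multiple` and `off_multiple` are made up for illustration.

```python
# Left-hand side of the `or` when the index is a multiple of 1000:
# the comparison is True, so print() runs and its return value (None) is the result.
on_multiple = (0 % 1000 == 0 and print(0))

# Left-hand side when the index is not a multiple of 1000:
# the comparison is False, so `and` stops there and print() is never called.
off_multiple = (1 % 1000 == 0 and print(1))

# In both cases the left-hand side is falsy, so `or` yields its right operand.
kept_on = on_multiple or 'processed'
kept_off = off_multiple or 'processed'
```

Only the first expression prints anything, and both `kept_on` and `kept_off` end up holding the right-hand value.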
How 3 works
Similarly to 2: because `report(i)` has no `return` statement, it evaluates to `None`, and `or` adds `process(token)` to the list.
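A related variation, not one of the three methods above, is to move the reporting into a generator that wraps the iterable, much like `tqdm` does. The sketch below assumes a hypothetical helper named `with_progress` and keeps the comprehension itself free of side effects.

```python
def with_progress(iterable, every=1000):
    """Yield items unchanged, printing the index every `every` items."""
    for index, item in enumerate(iterable):
        if index % every == 0:
            print(index)
        yield item

def process(token):
    return token['text']

l1 = [{'text': k} for k in range(5000)]
# The comprehension only sees the items; progress is reported by the wrapper.
l2 = [process(token) for token in with_progress(l1)]
```

With the default `every=1000`, this reports the same indices as the methods above, and `process` does not need to know about the index.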
Just do:
from time import sleep
from tqdm import tqdm
def foo(i):
    sleep(0.01)
    return i
[foo(i) for i in tqdm(range(1000))]
For Jupyter notebook:
from tqdm.notebook import tqdm
doc_collection = [[1, 2],
                  [3, 4],
                  [5, 6]]
result = [print(progress) or
          [str(token) for token in document]
          for progress, document in enumerate(doc_collection)]
print(result) # [['1', '2'], ['3', '4'], ['5', '6']]
I don't consider this good or readable code, but the idea is fun.
It works because `print` always returns `None`, so `print(progress) or x` will always be `x` (by the definition of `or`).
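One detail worth checking is that this holds even when `x` is itself falsy, since `or` falls back to its last operand when the first is falsy. A small sketch with made-up sample values:

```python
# Falsy results (0, empty string, empty list, None) are still kept as-is,
# because `None or x` evaluates to x whatever x is.
values = [0, '', [], None]
result = [print(i) or value for i, value in enumerate(values)]
```

Here `result` is equal to `values`, so the trick does not silently drop or replace falsy items.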