What is the purpose of Python's itertools.repeat?
The primary purpose of itertools.repeat is to supply a stream of constant values to be used with map or zip:
>>> list(map(pow, range(10), repeat(2))) # list of squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
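The same trick works with zip. Here is a minimal sketch; the `tasks` list and the `"pending"` status are made-up data for illustration:

```python
from itertools import repeat

tasks = ["build", "test", "deploy"]  # hypothetical data, not from the question

# zip() stops at the shortest input, so the endless repeat() is safe here.
status = dict(zip(tasks, repeat("pending")))
print(status)
# {'build': 'pending', 'test': 'pending', 'deploy': 'pending'}
```

Because zip ends when `tasks` runs out, there is no need to tell repeat how many values to produce.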
The secondary purpose is that it gives a very fast way to loop a fixed number of times like this:
for _ in itertools.repeat(None, 10000):
    do_something()
This is faster than:
for i in range(10000):
    do_something()
The former wins because all it needs to do is update the reference count for the existing None object. The latter loses because range() or xrange() needs to manufacture 10,000 distinct integer objects.
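If you want to check this on your own machine, something like the following sketch should do. The `do_something` placeholder and the two wrapper functions are assumptions for illustration, and the size of the gap depends heavily on the interpreter version (it may be small on a recent CPython), so no expected timings are shown:

```python
import itertools
import timeit

def do_something():
    pass  # placeholder body; substitute the real work

def loop_with_repeat(n=10000):
    # The loop variable is bound to the same None object on every pass.
    for _ in itertools.repeat(None, n):
        do_something()

def loop_with_range(n=10000):
    # The loop variable is bound to a different integer on every pass.
    for _ in range(n):
        do_something()

# Run each loop 100 times and report the total wall-clock time.
t_repeat = timeit.timeit(loop_with_repeat, number=100)
t_range = timeit.timeit(loop_with_range, number=100)
print("repeat: %.4f s" % t_repeat)
print("range:  %.4f s" % t_range)
```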
Note, Guido himself uses that fast looping technique in the timeit() module. See the source at https://hg.python.org/cpython/file/2.7/Lib/timeit.py#l195 :
if itertools:
    it = itertools.repeat(None, number)
else:
    it = [None] * number
gcold = gc.isenabled()
gc.disable()
try:
    timing = self.inner(it, self.timer)
The itertools.repeat function is lazy; it only uses the memory required for one item. On the other hand, the (a,) * n and [a] * n idioms build a full sequence of n references to the object in memory. For five items, the multiplication idiom is probably better, but you might notice a resource problem if you had to repeat something, say, a million times.
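A rough way to see the difference is sys.getsizeof. In this sketch the value `"spam"` and the count are arbitrary, and the exact byte counts vary by platform and Python version, so none are shown:

```python
import sys
from itertools import repeat

a = "spam"      # arbitrary value to repeat
n = 1000000

lazy = repeat(a, n)   # stores the object and a counter
eager = [a] * n       # stores n references in a real list

print(sys.getsizeof(lazy))    # small, does not grow with n
print(sys.getsizeof(eager))   # roughly one pointer per element

# Every slot of the list refers to the one original object.
same_object = all(item is a for item in eager[:100])
```

Note that the list holds n references to a single object rather than n separate objects, so the cost is the list's own storage, not duplicated values.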
Still, it is hard to imagine many static uses for itertools.repeat. However, the fact that itertools.repeat is a function allows you to use it in many functional applications. For example, you might have some library function func which operates on an iterable of input. Sometimes, you might have pre-constructed lists of various items. Other times, you may just want to operate on a uniform list. If the list is big, itertools.repeat will save you memory.
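As a sketch of that situation, suppose the library function is something like the hypothetical `weighted_sum` below (the name and signature are invented for illustration). It accepts any iterable of weights, so a caller can pass either a prepared list or a repeat object:

```python
from itertools import repeat

def weighted_sum(values, weights):
    """Hypothetical library function: accepts any iterable of weights."""
    return sum(v * w for v, w in zip(values, weights))

values = [3, 5, 8]

# Pre-constructed list of various weights.
custom = weighted_sum(values, [0.5, 1.0, 2.0])

# Uniform weights, without building a list of ones.
uniform = weighted_sum(values, repeat(1))

print(custom)    # 22.5
print(uniform)   # 16
```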
Finally, repeat makes possible the so-called "iterator algebra" described in the itertools documentation. Even the itertools module itself uses the repeat function. For example, the following code is given as an equivalent implementation of itertools.izip_longest (even though the real code is probably written in C). Note the use of repeat seven lines from the bottom:
from itertools import chain, repeat  # needed to run this equivalent on its own

class ZipExhausted(Exception):
    pass

def izip_longest(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    counter = [len(args) - 1]
    def sentinel():
        if not counter[0]:
            raise ZipExhausted
        counter[0] -= 1
        yield fillvalue
    fillers = repeat(fillvalue)
    iterators = [chain(it, sentinel(), fillers) for it in args]
    try:
        while iterators:
            yield tuple(map(next, iterators))
    except ZipExhausted:
        pass
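The core of that recipe is chaining an endless repeat onto a finite iterable so it never runs dry. A much smaller sketch of the same building block, using a hypothetical `pad` helper that is not part of itertools:

```python
from itertools import chain, islice, repeat

def pad(iterable, fillvalue, size):
    """Hypothetical helper: yield exactly `size` items, padding with fillvalue."""
    # chain() moves on to the endless filler once the input is exhausted;
    # islice() cuts the combined stream off at `size` items.
    return islice(chain(iterable, repeat(fillvalue)), size)

padded = list(pad("xy", "-", 4))
print(padded)   # ['x', 'y', '-', '-']
```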
Your example of foo * 5 looks superficially similar to itertools.repeat(foo, 5), but it is actually quite different.
If you write foo * 100000, the interpreter must build the entire repeated sequence, all 100,000 repetitions of foo, in memory before it can give you an answer. It is thus a very expensive and memory-unfriendly operation.
But if you write itertools.repeat(foo, 100000), the interpreter can return an iterator that serves the same function and doesn't need to produce any item until you ask for it, for example by passing it to a function that consumes each item in the sequence.
That's the major advantage of iterators: they can defer the computation of a part (or all) of a list until you really need the answer.
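A small sketch of the difference, assuming foo is a short list (the question does not say what foo actually is):

```python
from itertools import islice, repeat

foo = [1, 2]   # assumed value, for illustration only

# Sequence multiplication builds the whole result immediately.
eager = foo * 3
print(eager)       # [1, 2, 1, 2, 1, 2]

# repeat() returns at once; nothing is produced until it is consumed.
lazy = repeat(foo, 100000)

# Take just the first two items; the other 99,998 are never produced.
first_two = list(islice(lazy, 2))
print(first_two)   # [[1, 2], [1, 2]]
```

Note that the two are not interchangeable: multiplication concatenates the contents of foo, while repeat yields foo itself each time.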