__next__ in generators and iterators and what is a method-wrapper?
The special methods __iter__ and __next__ are part of the iterator protocol used to create iterator types. For this purpose, you have to differentiate between two separate things: iterables and iterators.
Iterables are things that can be iterated over. Usually these are containers that hold items; common examples are lists, tuples, and dictionaries.
In order to iterate an iterable, you use an iterator. An iterator is the object that helps you iterate through the container. For example, when iterating a list, the iterator essentially keeps track of which index you are currently at.
To get an iterator, the __iter__ method is called on the iterable. This is like a factory method that returns a new iterator for this specific iterable. Defining an __iter__ method on a type is what makes it an iterable.
The iterator generally needs a single method, __next__, which returns the next item for the iteration. In addition, to make the protocol easier to use, every iterator should also be an iterable, returning itself from its __iter__ method.
As a quick example, this would be a possible iterator implementation for a list:
class ListIterator:
    def __init__(self, lst):
        self.lst = lst
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self.lst[self.idx]
        except IndexError:
            raise StopIteration()
        self.idx += 1
        return item
The list implementation could then simply return ListIterator(self) from its __iter__ method. Of course, the actual implementation for lists is done in C, so it looks a bit different, but the idea is the same.
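To make that concrete, here is a minimal sketch of a container that hands out ListIterator objects. MyList is a hypothetical class made up for illustration (it is not how the built-in list works), and ListIterator is repeated from above only so the snippet runs on its own:

```python
class ListIterator:
    """Same iterator as above, repeated so this sketch is self-contained."""

    def __init__(self, lst):
        self.lst = lst
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self.lst[self.idx]
        except IndexError:
            raise StopIteration()
        self.idx += 1
        return item


class MyList:
    """Hypothetical container, used only to illustrate the protocol."""

    def __init__(self, *items):
        self._items = list(items)

    def __getitem__(self, idx):
        # Raises IndexError past the end, which ListIterator relies on.
        return self._items[idx]

    def __iter__(self):
        # Factory method: every call returns a fresh, independent iterator.
        return ListIterator(self)


letters = MyList("a", "b", "c")
first = iter(letters)
second = iter(letters)
next(first)                              # advances only `first`
collected = [next(first), next(second)]  # each iterator has its own index
```

Because each iter() call creates a new ListIterator, the two iterators keep separate positions, which is why the same container can be looped over several times.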
Iterators are used invisibly in various places in Python, for example in a for loop:
for item in lst:
    print(item)
This is roughly equivalent to the following:
lst_iterator = iter(lst)  # this just calls `lst.__iter__()`
while True:
    try:
        item = next(lst_iterator)  # lst_iterator.__next__()
    except StopIteration:
        break
    else:
        print(item)
So the for loop requests an iterator from the iterable object, and then calls __next__ on that iterator until it hits the StopIteration exception. That this happens under the surface is also the reason why you want iterators to implement __iter__ as well: otherwise you could never loop over an iterator.
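A small sketch of what goes wrong without __iter__. BareIterator is a hypothetical class written for this illustration; it implements only __next__:

```python
class BareIterator:
    """Hypothetical iterator that implements __next__ but not __iter__."""

    def __init__(self, limit):
        self.n = 0
        self.limit = limit

    def __next__(self):
        if self.n >= self.limit:
            raise StopIteration
        self.n += 1
        return self.n


# Calling next() directly works, since that only needs __next__.
bare = BareIterator(2)
manual = [next(bare), next(bare)]

# A for loop first calls iter() on its target, which fails here.
try:
    for value in BareIterator(2):
        pass
    loop_error = None
except TypeError as exc:
    loop_error = exc
```

The loop never gets as far as calling __next__: it fails at the iter() step with a TypeError saying the object is not iterable.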
As for generators, what people usually refer to is actually a generator function, i.e. some function definition that has yield statements. Once you call that generator function, you get back a generator. A generator is essentially just an iterator, albeit a fancy one (since it does more than move through a container). As an iterator, it has a __next__ method to “generate” the next element, and an __iter__ method to return itself.
An example generator function would be the following:
def exampleGenerator():
    yield 1
    print('After 1')
    yield 2
    print('After 2')
The function body containing a yield statement turns this into a generator function. That means that when you call exampleGenerator() you get back a generator object. Generator objects implement the iterator protocol, so we can call __next__ on it (or use the next() function as above):
>>> x = exampleGenerator()
>>> next(x)
1
>>> next(x)
After 1
2
>>> next(x)
After 2
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    next(x)
StopIteration
Note that the first next() call did not print anything yet. This is the special thing about generators: they are lazy and only evaluate as much as necessary to get the next item. Only with the second next() call do we get the first printed line from the function body. And we need another next() call to exhaust the generator (since there’s not another value yielded).
But apart from that laziness, generators just act like iterables. You even get a StopIteration exception at the end, which allows generators (that is, the objects you get by calling generator functions) to be used as for loop sources and wherever “normal” iterables can be used.
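As a short sketch of that interchangeability, here is a generator being handed to a for loop and to built-ins that accept any iterable. example_generator is a stripped-down stand-in for the function above, without the print calls:

```python
def example_generator():
    yield 1
    yield 2


# A for loop calls iter() and next() for us and stops on StopIteration.
seen = []
for value in example_generator():
    seen.append(value)

# Anything that accepts an iterable accepts a generator as well.
as_list = list(example_generator())
total = sum(example_generator())

# Unlike a list, a generator can only be consumed once.
gen = example_generator()
first_pass = list(gen)
second_pass = list(gen)  # already exhausted, so nothing is left
```

Note that the for loop is given the result of calling the generator function, not the function itself.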
The big benefit of generators and their laziness is the ability to generate stuff on demand. A nice analogy for this is endless scrolling on websites: you can scroll down item after item (calling next() on the generator), and every once in a while, the website will have to query a backend to retrieve more items for you to scroll through. Ideally, this happens without you noticing. And that’s exactly what a generator does. It even allows for things like this:
def counter():
    x = 0
    while True:
        x += 1
        yield x
Non-lazy, this would be impossible to compute since this is an infinite loop. But lazily, as a generator, it’s possible to consume this iterator one item after another. I originally wanted to spare you from implementing this generator as a fully custom iterator type, but in this case it actually isn’t too difficult, so here it goes:
class CounterGenerator:
    def __init__(self):
        self.x = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.x += 1
        return self.x
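Since both versions never stop on their own, you have to bound how much you consume. One way to do that (my choice here, not the only one) is itertools.islice from the standard library; the two definitions are repeated so the snippet runs on its own:

```python
from itertools import islice


def counter():
    x = 0
    while True:
        x += 1
        yield x


class CounterGenerator:
    def __init__(self):
        self.x = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.x += 1
        return self.x


# islice pulls only the first five items, so the infinite loop is never
# run to completion.
from_function = list(islice(counter(), 5))
from_class = list(islice(CounterGenerator(), 5))
```

Both produce the same five values, which shows that the generator function and the hand-written iterator class are interchangeable from the consumer's point of view.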
Why is __next__ only available to list but only to __iter__() and mygen but not mylist. How does __iter__() call __next__ when we are stepping through the list using list-comprehension.
Because lists have a separate object that is returned from iter to handle iteration, it is this object's __next__ that is consecutively called.
So, for lists:
iter(l) is l # False, returns <list-iterator object at..>
While, for generators:
iter(g) is g # True, its the same object
In looping constructs, iter is first going to get called on the target object to be looped over. iter calls __iter__ and an iterator is expected to be returned; its __next__ is called until no more elements are available.
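Here is a rough sketch of both points: the identity checks for a list versus a generator, and a list comprehension unrolled by hand into iter/next calls. The names numbers and gen_func are made up for the example, and the unrolled loop only approximates what the interpreter does internally:

```python
numbers = [1, 2, 3]


def gen_func():
    yield from numbers


# A list hands out a separate iterator object ...
it = iter(numbers)
list_is_own_iterator = it is numbers

# ... while a generator is its own iterator.
g = gen_func()
generator_is_own_iterator = iter(g) is g

# The list comprehension ...
squares = [n * n for n in numbers]

# ... behaves roughly like this explicit iter()/next() loop.
manual_squares = []
it2 = iter(numbers)
while True:
    try:
        n = next(it2)
    except StopIteration:
        break
    manual_squares.append(n * n)
```

So the list itself never needs a __next__ method; only the iterator object returned by iter(numbers) does.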
What is a method-wrapper and what does it do? How is it applied here: in mygen() and __iter__()?
A method wrapper is, if I'm not mistaken, a method implemented in C. That is what both iter(list).__iter__ (list is an object implemented in C) and gen.__iter__ (not sure here, but generators probably are too) are.
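You can check this yourself. The sketch below assumes CPython (other implementations may report different types) and uses types.MethodWrapperType from the standard library; gen_func and PyIterator are made-up names for the comparison:

```python
import types


def gen_func():
    yield 1


class PyIterator:
    """Hypothetical iterator written in pure Python, for comparison."""

    def __iter__(self):
        return self

    def __next__(self):
        raise StopIteration


# On CPython, special methods of C-implemented types show up as
# method-wrapper objects when looked up on an instance.
list_iter_method = iter([]).__iter__
gen_next_method = gen_func().__next__

# The same special method defined in Python is an ordinary bound method.
py_next_method = PyIterator().__next__

kinds = {
    "list iterator __iter__": type(list_iter_method).__name__,
    "generator __next__": type(gen_next_method).__name__,
    "pure Python __next__": type(py_next_method).__name__,
}
```

In other words, the method-wrapper you see in the repr is just how CPython exposes a C-level special method bound to an instance; it behaves like any other bound method when called.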
If __next__ is what both generator and iterator provide (and their sole properties) then what is the difference between generator and iterator?
A generator is an iterator, as is the iterator provided from iter(l). It is an iterator since it provides a __next__ method (which, when used in a for loop, provides values until the iterator is exhausted).
__next__ and __iter__ are the method wrappers that get invoked when you do next(some_gen) or iter(some_sequence). next(some_gen) is the same as some_gen.__next__()
So if I do mygen = iter(mylist) then mygen is an iterator over mylist (a list iterator object, not a generator, despite the name) and has a __next__ method. Lists themselves do not have this method because they are iterables, not iterators.
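A quick way to verify this, sketched with the standard collections.abc and types modules (mylist and mygen are the names from the question):

```python
import types
from collections.abc import Iterable, Iterator

mylist = [1, 2, 3]
mygen = iter(mylist)  # named as in the question, but not a generator

checks = {
    "list is iterable": isinstance(mylist, Iterable),
    "list is iterator": isinstance(mylist, Iterator),
    "list has __next__": hasattr(mylist, "__next__"),
    "iter(list) is iterator": isinstance(mygen, Iterator),
    "iter(list) is generator": isinstance(mygen, types.GeneratorType),
}

# next() and an explicit __next__() call advance the same iterator.
first = next(mygen)
second = mygen.__next__()
```

Every generator is an iterator, but not every iterator is a generator: the object returned by iter(mylist) passes the Iterator check and fails the GeneratorType one.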
Generators are iterators. Check out difference between generators and iterators