- Source: itsCoder's WeeklyBolg project
- itsCoder homepage: http://itscoder.com/
- Author: Manjusaka
- Reviewers: allenwu, Brucezz
Preface#
I originally planned to continue writing about Flask this week, but after some thought, I decided to change things up and talk about the very important yet often misunderstood generators and coroutines in Python.
Generators 101#
I guess everyone is somewhat familiar with generators, but to let me continue to show off, let's take a moment to discuss what generators are.
For example, in Python, if we want to generate a list of numbers in the range (1,100000), we might write the following code without much thought:
def generateList(start, stop):
tempList = []
for i in range(start, stop):
tempList.append(i)
return tempList
Note 1: Some might ask why we don't just return
range(start, stop)
. Nice question! This involves a fundamental issue regarding howrange
works. This varies by version; in Python 2.x,range(start, stop)
essentially generates alist
in advance, and thelist
object is an Iterator, which can be used in afor
statement.
In Python 2.x, there is also a statement calledxrange
, which generates a Generator object.
In Python 3, things changed a bit; the community felt that the split betweenrange
andxrange
was too cumbersome, so they merged them. Now, in Python 3, the syntax forxrange
has been removed, andrange
generates a Generator instead of alist
.
But have you ever considered a problem? If we want to generate a very large amount of data, pre-generating it is undoubtedly unwise, as it would consume a lot of memory. Thus, Python provides us with a new approach: Generator.
def generateList1(start, stop):
for i in range(start, stop):
yield i
if __name__ == "__main__":
c = generateList1(1, 100000)
for i in c:
print(i)
Yes, one feature of Generators is that they do not generate data all at once; instead, they generate an iterable object that controls when to start based on the logic we write.
Deep Dive into Generators#
There might be a question: surely Python developers didn't create the Generator mechanism just for this use case, right? So, do we have other scenarios for using Generators? Of course! Look at the title; yes, another significant use of Generators is as coroutines. However, before we delve into that, we need to understand Generators more deeply for our subsequent discussions.
Built-in Methods of Generators#
Background Knowledge on Iterable Objects in Python#
First, let's look at the iteration process in Python.
In Python, there are two concepts in iteration: Iterable and Iterator. Let's examine them separately.
To determine if an Object is Iterable, we check if it implements iter. If it does, we can consider it an Iterable object. Let's look at some code to understand this.
class Counter:
def __init__(self, low, high):
self.current = low
self.high = high
def __iter__(self):
return self
def next(self): # Python 3: def __next__(self)
if self.current > self.high:
raise StopIteration
else:
self.current += 1
return self.current - 1
if __name__ == '__main__':
a = Counter(3, 8)
for c in a:
print(c)
Now, let's see what happens in the above code. First, the for
statement checks whether the object being iterated is an Iterable or an Iterator. If it is an object that implements the __iter__
method, it is an Iterable object. The for
loop first calls the object's __iter__
method to obtain an Iterator object. So, what is an Iterator object? It can be understood as an object that implements the next() method (Note: in Python 3, it is the next method).
OK, let's return to what we just discussed. In the above code, the for
statement first determines whether it is an Iterable or an Iterator. If it is an Iterable, it calls its iter method to get an Iterator object. Then, the for
loop calls the next()
method of the Iterator object (Note: in Python 3, it is __next__
) to iterate until the iteration ends and raises a StopIteration
exception.
Now, let's talk about Generators#
Let's look at the previous code again:
def generateList1(start, stop):
for i in range(start, stop):
yield i
if __name__ == "__main__":
c = generateList1(1, 100000)
for i in generateList1:
print(i)
First, we need to confirm that Generators are also Iterator objects. OK, let's examine the above code. First, the for
statement confirms that generateList1
is an Iterator object, and then it starts calling the next()
method for further iteration. At this point, you might wonder how the next()
method allows generateList1
to continue iterating. The answer lies in the built-in send()
method of Generators. Let's look at another piece of code.
def generateList1(start, stop):
for i in range(start, stop):
yield i
if __name__ == "__main__":
a = generateList1(0, 5)
for i in range(0, 5):
print(a.send(None))
What should we output here? The answer is 0, 1, 2, 3, 4
, which is the same result we would get using a for
loop. Now we can conclude that
The essence of Generator iteration is that it calls the built-in
send()
method through the built-innext()
or__next__()
method.
Continuing to Discuss the Built-in Methods of Generators#
Earlier, we mentioned a conclusion:
The essence of Generator iteration is that it calls the built-in
send()
method through the built-innext()
or__next__()
method.
Now let's look at an example:
def countdown(n):
print("Counting down from", n)
while n >= 0:
newvalue = (yield n)
# If a new value got sent in, reset n with it
if newvalue is not None:
n = newvalue
else:
n -= 1
if __name__ == '__main__':
c = countdown(5)
for x in c:
print(x)
if x == 5:
c.send(3)
What should the output of this code be?
The answer is [5, 2, 1, 0]
. Confusing, right? Don't worry; let's look at the execution flow of this code.
In short, when we call the send()
function, the value we send with send(x)
is assigned to newvalue
, and the program continues executing until it encounters the next yield
, returning the value as the end of the process. Then our Generator quietly sleeps in memory, waiting for the next send
to wake it up.
Note 2: Someone asked, "I don't quite understand here; is
c.send(3)
equivalent toyield n
returning a3
tonewvalue
?" Good question! We can see from the execution diagram earlier thatc.send(3)
first assigns3
tonewvalue
, then the program runs the remaining code until it encounters the nextyield
. At this point, before executingyield n
, the value ofn
has already changed to3
, and thenyield n
is effectively equivalent toreturn 3
. After that, thecountdown
Generator freezes all variable states and quietly stays in memory, waiting for the nextnext
or__next__()
method orsend()
method to wake it up.
Tip: If we call
send()
directly, we must usesend(None)
the first time; only then is the Generator truly activated, allowing us to proceed to the next operation.
Let's Talk About Coroutines#
First, regarding the definition of coroutines, let's look at a piece from Wikipedia:
Coroutines are computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations. Coroutines are well-suited for implementing more familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists, and pipes.
According to Donald Knuth, the term coroutine was coined by Melvin Conway in 1958, after he applied it to the construction of an assembly program. The first published explanation of the coroutine appeared later, in 1963.
In short, coroutines are a lighter model than threads, allowing us to control when to start and stop. In Python, there isn't a dedicated concept for coroutines; the community generally views Generators as a special type of coroutine. Think about it: we can wake our Generator using the next
or __next__()
method or the send()
method, and after executing the code we specified, the Generator returns and freezes all its states. Isn't that exciting?!!
A Little Homework on Generators#
Now we want to perform a post-order traversal of a binary tree. I know that those reading this article can easily write it out, so let's look at the code first:
class Node(object):
def __init__(self, val, left, right):
self.val = val
self.left = left
self.right = right
def visit_post(node):
if node.left:
return visit_post(node.left)
if node.right:
return visit_post(node.right)
return node.val
if __name__ == '__main__':
node = Node(-1, None, None)
for val in range(100):
node = Node(val, None, node)
print(list(visit_post(node)))
However, we know that if the recursion depth is too deep, we will either run out of stack space or fail the Python transaction. OK, the Generator method is great for keeping you safe; let's look at the code directly:
def visit_post(node):
if node.left:
yield node.left
if node.right:
yield node.right
yield node.val
def visit(node, visit_method):
stack = [visit_method(node)]
while stack:
last = stack[-1]
try:
yielded = next(last)
except StopIteration:
stack.pop()
else:
if isinstance(yielded, Node):
stack.append(visit_method(yielded))
elif isinstance(yielded, int):
yield yielded
if __name__ == '__main__':
node = Node(-1, None, None)
for val in range(100):
node = Node(val, None, node)
visit_generator = visit(node, visit_method=visit_post)
print(list(visit_generator))
Looks complicated, right? No worries; consider it homework. Feel free to leave comments, and we can discuss Python transactions together!
Reference Links#
- Improve Your Python: Understanding 'yield' and 'Generators'
- The Power of yield
- http://my.oschina.net/1123581321/blog/160560
- Why Must Python Iterators Implement the iter Method (Regarding iterators, I simplified some things for easier understanding; you can refer to the highly-rated answers to this question for more details)