Manjusaka

Manjusaka

Let's talk about generators and coroutines in Python.

Preface#

I originally planned to continue writing about Flask this week, but after some thought, I decided to change things up and talk about the very important yet often misunderstood generators and coroutines in Python.

Generators 101#

I guess everyone is somewhat familiar with generators, but to let me continue to show off, let's take a moment to discuss what generators are.
For example, in Python, if we want to generate a list of numbers in the range (1,100000), we might write the following code without much thought:

def generateList(start, stop):
    tempList = []
    for i in range(start, stop):
        tempList.append(i)
    return tempList

Note 1: Some might ask why we don't just return range(start, stop). Nice question! This involves a fundamental issue regarding how range works. This varies by version; in Python 2.x, range(start, stop) essentially generates a list in advance, and the list object is an Iterator, which can be used in a for statement.
range in Python 2.x
In Python 2.x, there is also a statement called xrange, which generates a Generator object.
xrange in Python 2.x
In Python 3, things changed a bit; the community felt that the split between range and xrange was too cumbersome, so they merged them. Now, in Python 3, the syntax for xrange has been removed, and range generates a Generator instead of a list.
range in Python 3

But have you ever considered a problem? If we want to generate a very large amount of data, pre-generating it is undoubtedly unwise, as it would consume a lot of memory. Thus, Python provides us with a new approach: Generator.

def generateList1(start, stop):
    for i in range(start, stop):
        yield i

if __name__ == "__main__":
    c = generateList1(1, 100000)
    for i in c:
        print(i)

Yes, one feature of Generators is that they do not generate data all at once; instead, they generate an iterable object that controls when to start based on the logic we write.

Deep Dive into Generators#

There might be a question: surely Python developers didn't create the Generator mechanism just for this use case, right? So, do we have other scenarios for using Generators? Of course! Look at the title; yes, another significant use of Generators is as coroutines. However, before we delve into that, we need to understand Generators more deeply for our subsequent discussions.

Built-in Methods of Generators#

Background Knowledge on Iterable Objects in Python#

First, let's look at the iteration process in Python.
In Python, there are two concepts in iteration: Iterable and Iterator. Let's examine them separately.
To determine if an Object is Iterable, we check if it implements iter. If it does, we can consider it an Iterable object. Let's look at some code to understand this.

class Counter:
    def __init__(self, low, high):
        self.current = low
        self.high = high

    def __iter__(self):
        return self

    def next(self):  # Python 3: def __next__(self)
        if self.current > self.high:
            raise StopIteration
        else:
            self.current += 1
            return self.current - 1

if __name__ == '__main__':
    a = Counter(3, 8)
    for c in a:
        print(c)

Now, let's see what happens in the above code. First, the for statement checks whether the object being iterated is an Iterable or an Iterator. If it is an object that implements the __iter__ method, it is an Iterable object. The for loop first calls the object's __iter__ method to obtain an Iterator object. So, what is an Iterator object? It can be understood as an object that implements the next() method (Note: in Python 3, it is the next method).

OK, let's return to what we just discussed. In the above code, the for statement first determines whether it is an Iterable or an Iterator. If it is an Iterable, it calls its iter method to get an Iterator object. Then, the for loop calls the next() method of the Iterator object (Note: in Python 3, it is __next__) to iterate until the iteration ends and raises a StopIteration exception.

Now, let's talk about Generators#

Let's look at the previous code again:

def generateList1(start, stop):
    for i in range(start, stop):
        yield i

if __name__ == "__main__":
    c = generateList1(1, 100000)
    for i in generateList1:
        print(i)

First, we need to confirm that Generators are also Iterator objects. OK, let's examine the above code. First, the for statement confirms that generateList1 is an Iterator object, and then it starts calling the next() method for further iteration. At this point, you might wonder how the next() method allows generateList1 to continue iterating. The answer lies in the built-in send() method of Generators. Let's look at another piece of code.

def generateList1(start, stop):
    for i in range(start, stop):
        yield i

if __name__ == "__main__":
    a = generateList1(0, 5)
    for i in range(0, 5):
        print(a.send(None))

What should we output here? The answer is 0, 1, 2, 3, 4, which is the same result we would get using a for loop. Now we can conclude that

The essence of Generator iteration is that it calls the built-in send() method through the built-in next() or __next__() method.

Continuing to Discuss the Built-in Methods of Generators#

Earlier, we mentioned a conclusion:

The essence of Generator iteration is that it calls the built-in send() method through the built-in next() or __next__() method.

Now let's look at an example:

def countdown(n):
    print("Counting down from", n)
    while n >= 0:
        newvalue = (yield n)
        # If a new value got sent in, reset n with it
        if newvalue is not None:
            n = newvalue
        else:
            n -= 1

if __name__ == '__main__':
    c = countdown(5)
    for x in c:
        print(x)
        if x == 5:
            c.send(3)

What should the output of this code be?
The answer is [5, 2, 1, 0]. Confusing, right? Don't worry; let's look at the execution flow of this code.

Code Execution Flow

In short, when we call the send() function, the value we send with send(x) is assigned to newvalue, and the program continues executing until it encounters the next yield, returning the value as the end of the process. Then our Generator quietly sleeps in memory, waiting for the next send to wake it up.

Note 2: Someone asked, "I don't quite understand here; is c.send(3) equivalent to yield n returning a 3 to newvalue?" Good question! We can see from the execution diagram earlier that c.send(3) first assigns 3 to newvalue, then the program runs the remaining code until it encounters the next yield. At this point, before executing yield n, the value of n has already changed to 3, and then yield n is effectively equivalent to return 3. After that, the countdown Generator freezes all variable states and quietly stays in memory, waiting for the next next or __next__() method or send() method to wake it up.

Tip: If we call send() directly, we must use send(None) the first time; only then is the Generator truly activated, allowing us to proceed to the next operation.

Let's Talk About Coroutines#

First, regarding the definition of coroutines, let's look at a piece from Wikipedia:

Coroutines are computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations. Coroutines are well-suited for implementing more familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists, and pipes.
According to Donald Knuth, the term coroutine was coined by Melvin Conway in 1958, after he applied it to the construction of an assembly program. The first published explanation of the coroutine appeared later, in 1963.

In short, coroutines are a lighter model than threads, allowing us to control when to start and stop. In Python, there isn't a dedicated concept for coroutines; the community generally views Generators as a special type of coroutine. Think about it: we can wake our Generator using the next or __next__() method or the send() method, and after executing the code we specified, the Generator returns and freezes all its states. Isn't that exciting?!!

A Little Homework on Generators#

Now we want to perform a post-order traversal of a binary tree. I know that those reading this article can easily write it out, so let's look at the code first:

class Node(object):
    def __init__(self, val, left, right):
        self.val = val
        self.left = left
        self.right = right

def visit_post(node):
    if node.left:
        return visit_post(node.left)
    if node.right:
        return visit_post(node.right)
    return node.val

if __name__ == '__main__':
    node = Node(-1, None, None)
    for val in range(100):
        node = Node(val, None, node)
    print(list(visit_post(node)))

However, we know that if the recursion depth is too deep, we will either run out of stack space or fail the Python transaction. OK, the Generator method is great for keeping you safe; let's look at the code directly:

def visit_post(node):
    if node.left:
        yield node.left
    if node.right:
        yield node.right
    yield node.val

def visit(node, visit_method):
    stack = [visit_method(node)]
    while stack:
        last = stack[-1]
        try:
            yielded = next(last)
        except StopIteration:
            stack.pop()
        else:
            if isinstance(yielded, Node):
                stack.append(visit_method(yielded))
            elif isinstance(yielded, int):
                yield yielded

if __name__ == '__main__':
    node = Node(-1, None, None)
    for val in range(100):
        node = Node(val, None, node)
    visit_generator = visit(node, visit_method=visit_post)
    print(list(visit_generator))

Looks complicated, right? No worries; consider it homework. Feel free to leave comments, and we can discuss Python transactions together!

  1. Improve Your Python: Understanding 'yield' and 'Generators'
  2. The Power of yield
  3. http://my.oschina.net/1123581321/blog/160560
  4. Why Must Python Iterators Implement the iter Method (Regarding iterators, I simplified some things for easier understanding; you can refer to the highly-rated answers to this question for more details)
Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.