Python Iterators: Part Three

Besides files and physical sequences such as lists, other types also have useful iterators. In older versions of Python, for example, one would step through the keys of a dictionary by requesting the keys list explicitly:

>>> K = {'alpha':1, 'bravo':2, 'charlie':3}
>>> for key in K.keys():
	print(key,K[key])

alpha 1
bravo 2
charlie 3

In more recent versions of Python, however, dictionaries have an iterator that automatically returns one key at a time in an iteration context:

>>> I = iter(K)
>>> next(I)
'alpha'
>>> next(I)
'bravo'
>>> next(I)
'charlie'
>>> next(I)
Traceback (most recent call last):
  File "<pyshell#88>", line 1, in <module>
    next(I)
StopIteration

The effect here is that we no longer need to call the keys method to step through dictionary keys. The for loop will use the iteration protocol to grab one key each time through.
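As a small sketch of that point, the loop below iterates over the same dictionary directly, without calling keys. The names pairs and keys are only there to collect the results for illustration.

```python
# Same dictionary as in the example above.
K = {'alpha': 1, 'bravo': 2, 'charlie': 3}

# The for loop calls iter(K) behind the scenes and receives one key per pass.
pairs = []
for key in K:
    pairs.append((key, K[key]))
    print(key, K[key])

# Other iteration contexts, such as list(), use the same dictionary iterator.
keys = list(K)
print(keys)
```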

Other Python object types also support the iterator protocol and therefore may be used in for loops as well. For example, a shelf is a persistent, dictionary-like object in which the values can be arbitrary Python objects, such as class instances and recursive data types. Shelves support the iterator protocol. So does os.popen, a tool for reading the output of shell commands:

>>> import os
>>> I = os.popen('dir')
>>> I.__next__()
' Volume in drive C has no label.\n'
>>> I.__next__()
' Volume Serial Number is 9664-E470\n'
>>> next(I)
Traceback (most recent call last):
  File "<pyshell#93>", line 1, in <module>
    next(I)
TypeError: '_wrap_close' object is not an iterator

Note that the popen objects support a P.next() method in Python 2.6. In 3.0 and later, they support the P.__next__() method, but not the next(P) built-in. It is not clear if this behavior will continue in future releases, but as of Python 3.4.1, it is still the case. This is only an issue for manual iteration; if you iterate over these objects automatically with for loops and other iteration contexts, they return successive lines in either Python version.
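The following sketch shows the two approaches that sidestep the problem: automatic iteration, and asking the pipe object for its iterator with iter before calling next. The command 'echo hello' is an assumed stand-in chosen because it should run in both Windows and Unix shells; the exact output of a command like dir depends on the machine.

```python
import os

# 'echo hello' is an assumed stand-in command, not the one used in the text.
with os.popen('echo hello') as pipe:
    # Automatic iteration: the loop obtains the iterator for us.
    lines = [line for line in pipe]

with os.popen('echo hello') as pipe:
    it = iter(pipe)    # ask the pipe object for its iterator explicitly
    first = next(it)   # next() works on the object returned by iter()

print(lines)
print(first)
```

Calling iter first mirrors what a for loop does, so it is a reasonable habit whenever we iterate manually over an object whose type we are unsure of.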

The iteration protocol is also the reason we have had to wrap some results in a list call to see their values all at once. Objects that are iterable return results one at a time, not in a physical list:

>>> RG = range(5)
>>> RG 
range(0, 5)
>>> I = iter(RG)
>>> next(I)
0
>>> next(I)
1
>>> list(range(5))
[0, 1, 2, 3, 4]

Now that you have a better understanding of this protocol, you should be able to see how it explains why the enumerate tool introduced in the prior chapter works the way it does:

>>> EN = enumerate('quick')
>>> EN
<enumerate object at 0x...>
>>> I = iter(EN)
>>> next(I)
(0, 'q')
>>> next(I)
(1, 'u')
>>> list(enumerate('quick'))
[(0, 'q'), (1, 'u'), (2, 'i'), (3, 'c'), (4, 'k')]

We don’t normally see what is going on under the hood because for loops run the protocol for us automatically to step through results. In fact, everything that scans left-to-right in Python employs the iteration protocol.
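One way to convince ourselves of this is to advance an iterator by hand and then pass it to another tool, which picks up where we left off. The sketch below does this with a few built-ins; the variable names are arbitrary.

```python
LS = [1, 2, 3, 4, 5]

I = iter(LS)
first = next(I)          # manually take the first item
rest = list(I)           # list() continues from where next() stopped

I = iter(LS)
next(I)                  # skip the first item
total = sum(I)           # sum() consumes the remaining items: 2 + 3 + 4 + 5

a, b, c, d, e = iter(LS)               # sequence unpacking also runs the protocol
found = 4 in iter(LS)                  # so does the membership test
joined = '-'.join(map(str, iter(LS)))  # and map() and str.join()

print(first, rest, total, found, joined)
```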

External Links:

Python Iterators at Python Wiki

Python Iterator tutorial at bogotobogo.com

Python Iterators: Part Two (The Next Function)

In the first article in this series, we introduced Python iterators and how they can be used to streamline Python code. In this article, we will continue our look at iterators, beginning with the next function.

To support manual iteration code, Python 3.0 also provides a built-in function, next, that automatically calls an object’s __next__ method. Given an iterator object X, the call next(X) is the same as X.__next__(). With files, for example, either form could be used:

>>> f = open('simple.py')
>>> f.__next__()

>>> f = open('simple.py')
>>> next(f)
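To try both forms without needing a file on disk, the sketch below uses io.StringIO as an assumed stand-in for an open text file. Its two lines of content are invented for the demonstration and are not the contents of simple.py.

```python
import io

# io.StringIO behaves like an open text file held in memory (a stand-in here).
f = io.StringIO('first line\nsecond line\n')

a = f.__next__()   # method form
b = next(f)        # built-in form, equivalent to the call above

# When the lines run out, both forms raise StopIteration.
try:
    next(f)
    ended = False
except StopIteration:
    ended = True

print(repr(a), repr(b), ended)
```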

Technically, there is one more piece to the iteration protocol. When the for loop begins, it obtains an iterator from the iterable object by passing it to the iter built-in function; the object returned by iter has the required __next__ method. We can illustrate this with the following code:

>>> LS = [1, 2, 3, 4, 5]
>>> myIter = iter(LS)
>>> myIter.__next__()
1
>>> myIter.__next__()
2

This initial step is not required for files, because a file object is its own iterator: files have their own __next__ method and so do not need to return a different object that does.
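A quick way to check this is to compare a file with the result of passing it to iter. The sketch below writes a small temporary file first so that it is self-contained; the file name demo.txt and its contents are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')   # hypothetical file name
    with open(path, 'w') as out:
        out.write('one\ntwo\n')

    with open(path) as f:
        same_object = iter(f) is f   # the file returns itself from iter()
        first = next(f)              # so next() works on the file directly
        second = next(iter(f))       # and iter(f) continues from the same position

print(same_object, repr(first), repr(second))
```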

Lists and many other built-in objects are not their own iterators because they support multiple open iterations. For such objects, we must call iter to start iterating. For example:

>>> LS = [1, 2, 3, 4, 5]
>>> iter(LS) is LS
False
>>> LS.__next__()
Traceback (most recent call last):
  File "<pyshell#50>", line 1, in <module>
    LS.__next__()
AttributeError: 'list' object has no attribute '__next__'

>>> myIter = iter(LS)
>>> myIter.__next__()
1
>>> next(myIter)
2
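Because each call to iter on a list returns a new iterator object, several iterations over the same list can be in progress at once. A minimal sketch, with arbitrary names I1 and I2:

```python
LS = [1, 2, 3, 4, 5]

I1 = iter(LS)
I2 = iter(LS)            # a second, independent iterator over the same list

a = next(I1)             # 1
b = next(I1)             # 2
c = next(I2)             # 1: I2 keeps its own position

distinct = I1 is not I2  # two separate iterator objects
own = iter(I1) is I1     # the list iterator, unlike the list, is its own iterator

print(a, b, c, distinct, own)
```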

Although Python iteration tools call these functions automatically, we can use them to apply the iteration protocol manually, too. The following demonstrates the equivalence between automatic and manual iteration:

>>> LS = [1, 2, 3, 4, 5]
>>> for X in LS:
	print(X ** 2, end=' ')

1 4 9 16 25

>>> myIter = iter(LS)
>>> while True:
	try:
		X = next(myIter)
	except StopIteration:
		break
	print(X ** 2, end=' ')

1 4 9 16 25 

To understand this code, you need to know that try statements run an action and catch exceptions (we covered that in the series of articles on exceptions). Also, for loops and other iteration contexts can sometimes work differently for user-defined classes, repeatedly indexing an object instead of running the iteration protocol.
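The last point can be sketched with a class that defines only indexing. The class Letters below is hypothetical; it has no __iter__ or __next__ method, so the for loop falls back on calling __getitem__ with 0, 1, 2, and so on until an IndexError is raised.

```python
class Letters:
    """Hypothetical class that supports indexing only."""

    def __init__(self, text):
        self.text = text

    def __getitem__(self, index):
        # Indexing past the end raises IndexError, which ends the loop.
        return self.text[index]


collected = []
for ch in Letters('abc'):
    collected.append(ch)

has_iter = hasattr(Letters, '__iter__')   # no iteration protocol methods defined

print(collected, has_iter)
```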
