Python Sets: Part Two

Python setsIn part one of this series, we took a first look at sets, a relatively new core data type in Python. In this article, we will continue cover set methods, as well as a new literal format for sets.

In addition to expressions, the set object provides methods that correspond to the set operations. For example, the set add method inserts one item; update provides an in-place union, and remove deletes an item by value. Run a dir call on any set instance or the set type name to see all the available methods. Assuming we have mySet and otherSet from the previous article:

>>> mySet = set(‘hello’)
>>> otherSet = set([‘a’,’b’,’c’,’d’,’e’])

then we can perform these operations:

>>> si = mySet.intersection(otherSet)
>>> si
{‘e’}
si.add(‘MORE’)
si
{‘MORE’, ‘e’}
>>> si.update(set([‘X’,’Y’]))
>>> si
{‘X’, ‘MORE’, ‘e’, ‘Y’}
>>> si.remove(‘e’)
si
{‘X’, ‘MORE’, ‘Y’ }

As iterable containers, sets can also be used in operations such as len, for loops, and list comprehensions. Because they are unordered, however, they do not support sequence operations like indexing and slicing.

>>> for item in set('aeiou'):
        print(item * 4)

uuuu
iiii
oooo
aaaa
eeee

Although these set expressions generally require two sets, they often work with any iterable types, such as lists, tuples, and ranges. For example:

mySet.union([‘a’,’b’,’c’])
{‘b’, ‘l’, ‘c’, ‘o’, ‘a’, ‘e’, ‘h’}
mySet.intersection((‘e’,’l’,’p’))
{‘e’, ‘l’}

Using Curly Braces

Python 3.0 adds a new set literal form, using the curly braces formerly reserved for dictionaries. In 3.0 and above, the following are equivalent:

set([‘a’,’b’,’c’])
{‘a’, ‘b’, ‘c’}

Because sets are unordered, unique, and immutable, a set’s items behave much like a dictionary’s keys, and therefore the new syntax makes sense. Regardless of how a set is made, 3.0 and above displays sets using the new literal format. The set built-in is still required in 3.0 to create empty sets and to build sets from existing iterable objects, but the new literal format is convenient for initializing sets of known structure:

>>> {‘a’,’b’,’c’}
{‘c’, ‘b’, ‘a’}

>>> S = {‘a’,’e’,’i’,’o’,’u’}
>>> S.add(‘MORE’)
>>> S
{‘MORE’, ‘i’, ‘o’, ‘a’, ‘e’, ‘u’}

All the set processing operations discussed in the previous article work the same in 3.0 and above, but the result sets print differently:

>>> S1 = {‘a’,’b’,’c’}
>>> S1 & {‘a’,’c’}
{‘c’,’a’}
>>> {‘a’,’e’,’i’,’o’,’u’} | S1
{‘b’, ‘i’, ‘o’, ‘c’, ‘a’, ‘e’, ‘u’}
>>> S1 – {‘a’,’b’}
{‘c’}
>>> S1 > {‘a’,’c’}
True

Note that {} is still a dictionary in Python. Empty sets must be created with the set built-in, and print the same way:

>>> S1 – {‘a’,’b’,’c’}
set()
>>> type({})
<class ‘dict’>

>>> mySet = set()
>>> mySet.add(‘a’)
>>> mySet
{‘a’}

As in Python 2.6, sets created with 3.0 literals support the same methods, some of which allow general iterable operands that expressions do not:

>>> {‘a’,’b’,’c’} | {‘c’,’d’}
{‘c’, ‘b’, ‘a’, ‘d’}
>>> {‘a’,’b’,’c’} | [‘c’,’d’]
Traceback (most recent call last):
File “<pyshell#34>”, line 1, in
{‘a’,’b’,’c’} | [‘c’,’d’]
TypeError: unsupported operand type(s) for |: ‘set’ and ‘list’

>>> {‘a’,’b’,’c’}.union([‘c’,’d’])
{‘c’, ‘b’, ‘a’, ‘d’}
>>> {‘a’,’b’,’c’}.union([‘c’,’d’])
{‘c’, ‘b’, ‘a’, ‘d’}
>>> {‘a’,’b’,’c’}.union(set([‘c’,’d’]))
{‘c’, ‘b’, ‘a’, ‘d’}
>>> {‘a’,’b’,’c’}.intersection((‘c’,’e’,’f’))
{‘c’}
>>> {2,3,4}.issubset(range(-5,5))
True
>>>

External Links:

Set, frozenset at docs.python.org – Official documentation on sets and frozensets for Python 3.4