Python Sets: Part One


One core Python data type that we did not mention in earlier articles, and that deserves some attention, is the set. Sets are a relatively recent addition to Python; they are neither mappings (dictionaries) nor sequences (strings, lists and tuples). Sets are created by calling the built-in set function or by using the set literals and comprehensions introduced in Python 3.0, and they support the usual mathematical set operations.

Introduction to Sets

Creating a set can be done two different ways:

>>> mySet = set('hello')
>>> otherSet = {'a', 'b', 'c', 'd', 'e'}

len returns the number of unique set items, so we get:
>>> len(otherSet)
5
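But if we run the same operation on mySet, we get 4 rather than 5, because the two occurrences of 'l' in 'hello' are stored only once:
>>> len(mySet)
4

We can also test for membership with in, and compare two sets with the isdisjoint and issubset methods:

>>> 'h' in mySet
True
>>> 'i' in mySet
False
>>> mySet.isdisjoint(otherSet)
False
>>> newSet = set('he')
>>> newSet.issubset(mySet)
True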
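
The checks above can be collected into one short script. This is only a sketch: the names letters and others are illustrative stand-ins for mySet and otherSet, and sorted() is used for display because a set has no fixed order.

```python
# Illustrative names; the data mirrors the article's mySet and otherSet.
letters = set('hello')               # duplicates collapse: the two 'l's count once
others = {'a', 'b', 'c', 'd', 'e'}

print(len('hello'), len(letters))    # 5 characters, but only 4 distinct ones
print('h' in letters, 'i' in letters)
print(letters.isdisjoint(others))    # False, because both sets contain 'e'
print(set('he').issubset(letters))   # True: 'h' and 'e' are both in letters
print(sorted(letters))               # sorted() gives a stable order for display
```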

Sets support the usual mathematical set operations, including intersection (&), union (|) and difference (-):

>>> mySet, otherSet
({'o', 'e', 'l', 'h'}, {'e', 'd', 'b', 'a', 'c'})

>>> mySet & otherSet
{'e'}

>>> mySet | otherSet
{'e', 'h', 'o', 'd', 'b', 'c', 'a', 'l'}
>>> mySet - otherSet
{'o', 'l', 'h'}
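
Each of these operators also has a method form, and there is a fourth operator, ^, for the symmetric difference. The sketch below uses illustrative names (my_set, other_set) with the same data as above.

```python
# Illustrative names; the data mirrors the article's mySet and otherSet.
my_set = set('hello')
other_set = {'a', 'b', 'c', 'd', 'e'}

# Each operator has an equivalent method.
print(my_set & other_set == my_set.intersection(other_set))
print(my_set | other_set == my_set.union(other_set))
print(my_set - other_set == my_set.difference(other_set))

# Symmetric difference: elements that are in exactly one of the two sets.
only_one = my_set ^ other_set
print(sorted(only_one))
```

Unlike the operators, the method forms accept any iterable as their argument, so my_set.union('xyz') works even though 'xyz' is a string rather than a set.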

There are a few other operations. <= tests whether every element in the left operand set is in the right operand set. For example:
>>> newSet <= mySet
True

But what if we want the test to return False when the two sets are equal? Then we use <, which tests for a proper subset:
>>> newSet < mySet
True

>>> exactcopy = set(mySet)
>>> exactcopy < mySet
False

>>> exactcopy <= mySet
True

We can flip the comparison around and check whether the left operand set is a superset of the right operand set:

>>> mySet > newSet
True
>>> mySet >= newSet
True
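
The comparison operators have method counterparts as well. As a minimal sketch, with illustrative names standing in for mySet, newSet and exactcopy:

```python
# Illustrative names; the data mirrors mySet, newSet and exactcopy above.
my_set = set('hello')
new_set = set('he')
exact_copy = set(my_set)

print(new_set.issubset(my_set))     # same test as new_set <= my_set
print(my_set.issuperset(new_set))   # same test as my_set >= new_set

# Equal sets are subsets of each other, but neither is a proper subset.
print(exact_copy <= my_set, exact_copy < my_set)
print(exact_copy == my_set)
```

There is no dedicated method for the proper subset test; the < and > operators are the way to express it.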

Sets are mutable; you can add and remove items with the add and remove methods. Neither method returns a value, so we display the set afterwards to see the change:

>>> otherSet.add('f')
>>> otherSet
{'e', 'f', 'd', 'b', 'c', 'a'}
>>> otherSet.remove('c')
>>> otherSet
{'e', 'f', 'd', 'b', 'a'}
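
One detail worth knowing is that remove raises a KeyError when the item is not in the set, whereas the related discard method does nothing in that case. A small sketch, using an illustrative set other_set:

```python
# Illustrative data matching otherSet after the changes above.
other_set = {'a', 'b', 'd', 'e', 'f'}

try:
    other_set.remove('z')      # 'z' is not in the set, so this raises KeyError
except KeyError:
    removed = False
else:
    removed = True

other_set.discard('z')         # discard ignores a missing item
other_set.add('a')             # adding an item that is already present changes nothing

print(removed, sorted(other_set))
```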

You can also iterate over a set, although the order in which the items appear is arbitrary. For example:
>>> for s in mySet:
        print(s)

o
e
l
h

Because sets are mutable, they are not hashable, so there are some things we cannot do with them: for example, we cannot use them as dictionary keys or as elements of another set. Objects of type frozenset, however, are immutable and hashable. We create one by calling frozenset:

>>> unchangeableSet = frozenset('abc')

Although we can’t add and remove items, we can perform the usual set operations on frozensets, or a combination of sets and frozensets:

>>> mySet | unchangeableSet
{'o', 'e', 'b', 'h', 'c', 'a', 'l'}
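
When a set and a frozenset are combined with an operator, the result takes the type of the left operand. The sketch below, with illustrative names, shows this and confirms that a frozenset has no add method:

```python
# Illustrative names; the data mirrors mySet and unchangeableSet above.
my_set = set('hello')
frozen = frozenset('abc')

left = my_set | frozen     # set on the left, so the result is a set
right = frozen | my_set    # frozenset on the left, so the result is a frozenset

print(type(left).__name__, type(right).__name__)
print(left == right)             # same elements, so they compare equal
print(hasattr(frozen, 'add'))    # False: frozensets cannot be modified
```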

The elements of a frozenset can also serve as the keys of a dictionary. Here’s a sample:

>>> keySet = frozenset('abc')
>>> names = ['Able', 'Baker', 'Charlie']
>>> myDict = {}
>>> i = 0
>>> for s in sorted(keySet):
        myDict[s] = names[i]
        i += 1

>>> print(myDict)
{'a': 'Able', 'b': 'Baker', 'c': 'Charlie'}

Here, we created an immutable set called keySet, and a list of items to put in our dictionary. Because a set has no defined order, we iterate through sorted(keySet) so that the keys are visited alphabetically, mapping each item in the list to a key from keySet. When we print out the results, we see that each item was successfully mapped to a key.
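
A frozenset can also be a dictionary key in its own right, which is exactly what a plain set cannot do. The following is a minimal sketch; the dictionary labels and its contents are made up for illustration.

```python
# Hypothetical dictionary whose keys are whole frozensets.
labels = {
    frozenset('abc'): 'first three letters',
    frozenset('xyz'): 'last three letters',
}

# Element order does not matter: equal frozensets are the same key.
print(labels[frozenset('cba')])

# A plain (mutable) set is unhashable, so it is rejected as a key.
try:
    labels[set('abc')] = 'will not work'
except TypeError as err:
    print('TypeError:', err)

print(len(labels))   # still 2 entries
```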

In the next article, we will continue our look at sets.

External Links:

Set, frozenset at docs.python.org – Official documentation on sets and frozensets for Python 3.4

Python Programming: Part Three (Conditional Statements; Lists, Tuples and Dictionaries)

In the previous article, we covered variables and how to save and run Python modules. In this article, we will introduce some more basic concepts, including statements that alter the control flow of Python, as well as lists, tuples and dictionaries.

Up to this point, we have been executing Python statements using a linear control flow. A programming language is of minimal value, however, without conditional statements. Python provides the if statement for this purpose, together with its optional elif and else clauses. For example:

>>> if x < 0:
        print('x is a negative number')

In this example, we used the comparison operator “<” (less than) to test to see if x is less than 0. If x is less than zero, we will print a statement that tells us x is a negative number. We might also want to print something out if x is zero or positive:

>>> if x < 0:
        print('x is a negative number')
    elif x == 0:
        print('x equals zero')
    else:
        print('x is a positive number')

Here, the first two lines are the same, but if x is not less than zero, we test whether it is equal to zero and, if it is, print a message. If x is neither less than zero nor equal to zero, we assume it is positive and print another message.

Another useful control flow statement is the for statement. The for statement in Python differs a bit from the for statement in C and Pascal. Rather than always iterating over an arithmetic progression of numbers, as in Pascal, or giving the user the ability to define both the iteration step and halting condition, as in C, Python’s for statement iterates over the items of any sequence in the order that they appear in the sequence. For example:

>>> word = 'coffee'
>>> for c in word:
        print(c)

This will print every letter in the string word on a separate line. If you need to iterate over a sequence of numbers, the built-in function range() comes in handy. It generates arithmetic progressions:

>>> for i in range(10):
        print(i, ' ', i**2)

This code will print out two columns: one containing the integers 0 through 9, the other containing their squares. If you specify only a single argument, 0 is the lower bound and the argument is the upper bound, which is itself excluded from the range. But you can also specify a lower bound:

>>> for i in range(5, 10):
        print(i, ' ', i**2)

This code will print out the integers 5 through 9 and their squares. We can also specify a step value:

>>> for i in range(1, 10, 2):
        print(i, ' ', i**2)

This code will print out the odd integers from 1 through 9 and their squares.
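
A quick way to check what a given range produces is to pass it to list(). The sketch below does that for the three calls used above, adds a negative step as a further example, and finishes with a loop that reuses the if/elif/else chain from earlier. The name signs is illustrative, and the loop borrows the list append method, which is described later in this article.

```python
# list() materialises each range so we can see the values it produces.
print(list(range(10)))          # 0 through 9; the upper bound 10 is excluded
print(list(range(5, 10)))       # 5 through 9
print(list(range(1, 10, 2)))    # 1, 3, 5, 7, 9
print(list(range(10, 0, -3)))   # a negative step counts down: 10, 7, 4, 1

# Combine for, range and the if/elif/else chain.
signs = []
for x in range(-1, 2):          # x takes the values -1, 0, 1
    if x < 0:
        signs.append('negative')
    elif x == 0:
        signs.append('zero')
    else:
        signs.append('positive')
print(signs)
```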

Lists, Tuples and Dictionaries

In C, there are arrays, which act as a compound data type. In Python, there are two sequence types not yet covered: lists and tuples. There are also dictionaries, which are not a sequence type but are nonetheless very useful.

A list is an ordered, mutable collection of items, which are often, but not necessarily, of the same type. Creating a list can be done quite easily:

>>> mylist = [5, 10, 15]

This will create a list with three items in it. We can iterate through the list as well:

>>> for i in mylist:
                 print(i)

This will print each item in mylist on a separate line. You can use the assignment operator on a list:

>>> otherlist = mylist

Now we can access the list through otherlist, but it is important to note that otherlist is not a separate copy of mylist: both names refer to the same list object. So this statement:

>>> otherlist[0] = 100

will alter the contents of mylist:

>>> print(mylist)
[100, 10, 15]
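
If a separate copy is what you want, one option is to build a new list from the old one with list(). A minimal sketch, where the names alias and duplicate are illustrative:

```python
mylist = [5, 10, 15]
alias = mylist             # a second name for the same list object
duplicate = list(mylist)   # a new list with the same items (a shallow copy)

alias[0] = 100             # changes the one list that both names share

print(mylist)                                  # the change is visible here
print(duplicate)                               # the copy is unaffected
print(alias is mylist, duplicate is mylist)    # identity check with "is"
```

The copy is shallow: if the list contained other lists, the copy would still share those inner lists with the original.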

You can also nest a list within a list. For example:

>>> secondlist = [1, 2, 3, 4]
>>> mylist[0] = secondlist

nests secondlist inside mylist as the first element.

You can also create an empty list, by specifying an empty pair of brackets in the assignment, like this:

>>> mylist = []

Lists have a built-in method called append that can be used to append an item to the end of a list:

>>> mylist.append(125)

appends 125 to the end of the list.
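
To see nesting and append together, here is a short sketch. It assumes mylist starts out as [5, 10, 15] again rather than as the empty list created above.

```python
mylist = [5, 10, 15]          # assumed starting value for this sketch
secondlist = [1, 2, 3, 4]

mylist[0] = secondlist        # nest secondlist as the first element
mylist.append(125)            # add one item to the end

print(mylist)                 # the nested list shows up inside brackets
print(mylist[0][2])           # index twice to reach into the nested list
print(len(mylist))            # the nested list counts as a single item
```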

Whereas a list is a mutable sequence data type, a tuple is an immutable sequence data type. A tuple consists of a number of values separated by commas. For example:

>>> tu = 1, 2, 3

The items of a tuple need not all be of the same data type. The following statement is valid:

>>> tu = 5, 10, 15, 'twenty'

To confirm that tuples are immutable, we can try to change one of the items:

>>> tu[0] = 121
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

On output, tuples are always enclosed in parentheses, so that nested tuples are interpreted correctly; they may be input with or without surrounding parentheses, although often parentheses are necessary anyway if the tuple is part of a larger expression. It is not possible to make an assignment to an individual item of a tuple; however, it is possible to create tuples which contain mutable objects, such as lists.
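
The last point is easy to check. In the sketch below (the values are illustrative), the tuple itself cannot be changed, but the list stored inside it can. The final lines show the trailing comma that Python needs to tell a one-item tuple apart from a parenthesised value.

```python
tu = (5, 10, [15, 20])      # a tuple whose last item is a mutable list

tu[2].append(25)            # allowed: we modify the list, not the tuple
print(tu)

try:
    tu[0] = 121             # not allowed: tuple items cannot be reassigned
except TypeError:
    print('tuple items cannot be reassigned')

single = (42,)              # the trailing comma makes this a one-item tuple
not_a_tuple = (42)          # without the comma this is just the integer 42
print(type(single).__name__, type(not_a_tuple).__name__)
```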

Another useful data type built into Python is the dictionary. Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type. Strings and numbers can always be keys. Tuples can be used as keys if they contain only strings, numbers, or tuples. If a tuple contains any mutable object either directly or indirectly, it cannot be used as a key. You cannot use lists as keys, since lists can be modified.
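
These rules about keys can be tried out directly. In this sketch, the dictionary inventory and its contents are made up for illustration; the loop attempts to use two invalid keys and records which ones Python rejects.

```python
inventory = {}                      # hypothetical dictionary for illustration
inventory['apples'] = 3             # a string key
inventory[42] = 'answer'            # a number key
inventory[(1, 'a')] = 'pair'        # a tuple containing only immutable items

rejected = []
for bad_key in ([1, 2], (1, [2, 3])):   # a list, and a tuple that contains a list
    try:
        inventory[bad_key] = 'oops'
    except TypeError:                   # raised because the key is unhashable
        rejected.append(repr(bad_key))

print(len(inventory), rejected)
```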

You can think of a dictionary as a set of key: value pairs, with the requirement that the keys are unique within one dictionary. (Before Python 3.7 the pairs had no guaranteed order; since 3.7, a dictionary remembers the order in which its keys were inserted.) A pair of braces creates an empty dictionary. Placing a comma-separated list of key: value pairs within the braces adds initial key: value pairs to the dictionary.

Here is an example of a dictionary:

>>> generals = {'Germany': 'Hindenburg', 'Russia': 'Samsonov', 'France': 'Nivelle', 'United Kingdom': 'Haig'}
>>> print(generals['Germany'])
Hindenburg
>>> del generals['Russia']
>>> generals['France'] = 'Foch'
>>> print(generals)
{'Germany': 'Hindenburg', 'France': 'Foch', 'United Kingdom': 'Haig'}

You can use for to iterate through a dictionary. For example:

>>> for c in generals.keys():
        print(generals[c])

Hindenburg
Foch
Haig
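
When both the key and the value are needed, the items method yields them as pairs, which saves the extra lookup. The sketch below also shows two safe ways to deal with a key that may be missing; the default string 'unknown' is an arbitrary choice for illustration.

```python
generals = {'Germany': 'Hindenburg', 'France': 'Foch', 'United Kingdom': 'Haig'}

# items() yields (key, value) pairs, unpacked here into two names.
for country, general in generals.items():
    print(country, '->', general)

# 'Russia' was deleted earlier, so test for the key before using it...
print('Russia' in generals)

# ...or use get(), which returns a default instead of raising KeyError.
print(generals.get('Russia', 'unknown'))
```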

In the next article, we will introduce the concept of modules, and write our first function.

External Links:

Python documentation from the official Python website