Python Debugging: An Overview

debugging

The Eclipse debugger in action.

At some point, one of your programs is likely to do something you did not expect it to do, and in such a scenario, you are going to have to debug your code. Fortunately, whether you are using the command line Python interpreter, the IDLE interface, or a third-party IDE, there are multiple options for debugging.

Debugging Options

One option is to rely on the error messages Python provides. If you already know Python, and especially if you understand your own code, this is often enough: you just read the error message and go and fix the tagged line and file. It may not always be a good option, however, for larger programs you did not write.

Another possibility, and probably the main way that Python programmers use for debugging their code is to insert print statements and run the program again. Because Python runs immediately after changes, this is usually the quickest way to get more information than error messages provide. Typically, a display of variable values is enough to provide the information you need.

For larger systems you did not write, and for beginners who want to trace code in more detail, most Python development GUIs have some sort of point-and-click debugging support. IDLE has a debugger as well, but it does not appear to be used very often (possibly because it has no command line, or possibly because adding print statements is usually quicker than setting up an IDLE GUI debugging session. But other IDEs such an Eclipse, NetBeans, Komodo and Wing IDE offer advanced point-and-click debuggers.

Python comes with a source-code debugger named pdb, available as a module in Python’s standard library. In pdb, you type commands to step line by line, display variables, set and clear breakpoints, or continue to a breakpoint or error. It can be launched interactively by importing it, or as a top-level script. Either way, it provides a powerful debugging tool.

To provide a real-world example of debugging to solve a coding problem, I introduced a bug into the binary sort code. The code is provided here; some readers may be able to find the bug just by looking at the code. In any case, when we run the code, we get the following result:

[None, None, 17, 144, 372, 51, 79, 123, 519, 997]

This result is wrong for at least two reasons. First, two list members have been replaced with “None”. Secondly, the list is not sorted. Fortunately, this gives us an opportunity to use our debugging skills. The heapSort function does the following:

  1. Creates a binary heap.
  2. While there is at least one item on the heap, delete the top item from the heap. If there is remaining heap data, rebuild the heap, consolidate the heap, and repeat this step.

A cursory review of the code seems to indicate that we correctly deleted the top item and re-inserted it after the heap. The error must be in heapify, rebuildHeap, or consolidateHeap. For starters, I decided to set a breakpoint after heapify is called and check to see if it returns a valid binary heap. If it does, we can be pretty sure the error is in rebuildHeap or consolidateHeap.

In Eclipse, I set a breakpoint in the second line of heapSort by right mouse-clicking on the left column next to that line of code (the column containing numbers indicating what line of code it is) and selecting “Add Breakpoint). Now I was able to debug the code by pressing “F11”. Eclipse now presents a dialog box asking if I want to switch to the debug perspective. Since I do, I press “Yes” and continue.

In the upper right corner of the Debug perspective, there is a section with two tabs: “Variables” and “Breakpoints”. The “Variables” tab provides us the information about local variables, including the current values stored in L. We can confirm that the list items form a valid binary heap:

        997
      /     \
    519     239
   /  \     / \
  144  372 17 29
  / \   /
79  51 123

It does not appear that the problem is in heapify, so it must be in either rebuildHeap or consolidateHeap. To find the bug, we will first step through the heapSort function line by line, by pressing F6.

As soon as we press F6 the first time, the variable high shows up on the variable list, since it has now made its first appearance in the code. The same goes for temp, when it makes its first appearance two lines later. The next line deletes the top node, and the change is reflected in the values for L shown on the “Variables” tab. All is well so far.

But after we run rebuildHeap for the first time, it is a different story. Our “heap” looks like this:

          519
         /   \
      None    29
      /  \    / \
    144  372 17 None
    / \   /
   79 51 123

It looks like rebuildHeap has failed us. Nodes marked “None” should not have children. The next step will be to remove the breakpoint from heapSort and put a breakpoint in rebuildHeap. Then we will restart the program. We terminate the program by pressing “CTRL-F2” and then pressing “F11” again.

Once we do this, we can continue debugging and begin stepping through the source code again. Looking at the “Variables” tab, we can see that lnode (left node) is initially set to 1, the correct value for the left node of the root node. The next line checks to see if there is at least one heap item and a left node. There is, so we proceed to the next line. We have to promote either the left node or the right node. We promote the left node if it is greater than the right node or if there is no right node. If so, we assign the left node’s value to the root node, mark the left node as empty, and run rebuildHeap recursively on the left node. When we reach the recursive rebuildHeap call, we will use “F5” to step into the second copy of rebuildHeap.

Once we start stepping into the second rebuildHeap, we can see something is wrong. lnode is now 5, but as the lnode of the lnode of the root, it should be 3. This is because root is equal to 2 when it should be 1. Looking at the first recursive rebuildHeap call in the function, we see that the second parameter is lnode+1 (which calls the function recursively on the right node), when it should be lnode (to call the function recursively on the left node). If we keep stepping through the code, rebuildHeap will perform operations on the wrong nodes and tamper with the integrity of heap data that should be kept intact. Here, it will promote one of the children of the right node, and erase data, as we can see by viewing list L’s values just before the next rebuildHeap call:

[519, None, 29, 144, 372, 17, None, 79, 51, 123]

In other words, it looks this is what accounts for the error in rebuildHeap. We could continue to step through the code, but to save time, let’s change the second rebuildHeap parameter at line 56 to lnode, remove the breakpoint, and see what happens.

Running the program after that modification results in the following output:

[17, 29, 51, 79, 123, 144, 239, 372, 519, 997]

Thus, it seems as if we have isolated and fixed the bug that was causing the error we were seeing earlier. This does not guarantee that our code is bug-free, but if there are other bugs to be found, we can have some confidence that the methods used here can be applied to finding them.

In this example, we used the Eclipse debugger for debugging, but don’t let this discourage you from using whatever tools you want. The Winpdb system, for example, is an open source stand-alone debugger with advanced debugging support and cross-platform GUI and console interfaces. These options become more important as you start writing larger scripts, but at least Python’s debugging support makes the process of finding errors a lot easier than they might be otherwise.

External Links:

The official Eclipse web site – An excellent IDE with good debugging capabilities

The official Winpdb web site

Classes and Inheritance: Part Two

classesIn the previous article, we introduced Python classes, discussed some of their features, and some of the similarities and differences from C++ classes. In this article, we will use our knowledge of C++ classes to rewrite our hash functions as a Python class.

Writing the HashObject Class

Our first job is to write the class constructor. Since it makes sense to initialize our list when the class is first instantiated, I decided to incorporate the createBuckets() function as part of the constructor:

class HashObject(object):
    def __init__(self, num):
        self.numBuckets = num
        self.hashSet = []
        for i in range(num):
            self.hashSet.append([])

This constructor takes one parameter (num), and assigns it to numBuckets, and then initializes our list.

Next, we want to define hashElement, which takes a number and hashes it to an index in hashSet. This function is the same as before, except that we do not have to pass the number of buckets to it:

def hashElement(self, elem):
        if type(elem) == int:
            return elem%self.numBuckets

The new insert function requires only one parameter: the list item to be inserted into one of the buckets:

def insert(self, i):
        self.hashSet[self.hashElement(i[0])].append(i)

Note that it is assumed that the first element of the list to be stored is assumed to be the key. Next, we need a remove function. We will take one parameter: the key value of the item to be deleted:

def remove(self, key):
        newElement = []
        for j in self.hashSet[self.hashElement(key)]:
            if j[0] != key:
                newElement.append(j)
        self.hashSet[self.hashElement(key)] = newElement    

Finally, we need to write the function to check to see if an item with a certain key exists:

def member(self, key):
        return i in self.hashSet[hashElement(key)]

One additional member function we may want to define is an accessor function for hashSet so we do not have to directly access the list, a good practice, since data hiding is one of the objectives of object-oriented programming:

def getElement(self, key):
        for j in self.hashSet[self.hashElement(key)]:
            if j[0] == key:
                return j

As you can see, we iterate through each list item in the bucket the key value hashes to (multiple items may be stored in a single bucket). When we find an item whose first value equals the key, we return that item.

With variables and member functions for HashObject defined, we can rewrite hashTest to test our class:

def hashTest():
    h = HashObject(51)
    h.insert([2175,'Homer Simpson','Technician',20])
    print(h.getElement(2175))
    h.insert([4158,'Elmer Higgins','Engineer',17])
    print(h.getElement(4158))
    h.insert([2583,'Waylon Smithers','Assistant',25])
    print(h.getElement(2583))
    h.remove(2175)
    print(h.getElement(2175))
    print(h.getElement(2583))

Running this function results in the following output:

[2175, ‘Homer Simpson’, ‘Technician’, 20]
[4158, ‘Elmer Higgins’, ‘Engineer’, 17]
[2583, ‘Waylon Smithers’, ‘Assistant’, 25]
None
[2583, ‘Waylon Smithers’, ‘Assistant’, 25]

In this function, we first create a hash set with 51 buckets. We insert the first record (Homer Simpson’s), and print it out, confirming its successful insertion. We insert a second record, also printing it out to confirm its insertion. Then we enter a third item (which, incidentally, hashes to the same location as the first item), and print it out, confirming its insertion and also confirming that we can successfully iterate through a bucket. Finally, we remove the first record and try to print it out. Since the getElement function will not return a value for a nonexistent record, the print call prints out “None”, for no value. Finally, we successfully print out the third record.

I have made HashObject and HashTest available for download here, so you can download it, run it and modify it to your heart’s content.

External Links:

Classes Tutorial on python.org

Class (computer programming) on Wikipedia

Classes and Inheritance: Part One

classesIn the previous article, we introduced some hashing functions, but had to pass the number of buckets (numBuckets) to each function, which was not an ideal solution. In this article, we introduce a programming construct that will provide a better way of working with these hashing functions: classes.

Classes Defined

A class is an extensible program-code-template for creating objects, providing initial values for member variables and implementations of functions. When an object is created by a constructor of the class, the resulting object is called an instance of the class, and the member variables specific to the object are called instance variables (unlike class variables, which are variables that are shared across the class).

Classes are essentially combinations of variables and functions. We use the word “encapsulation” to refer to this combination. The variables either belong to the class or specific instances of the class. The functions are subroutines with the ability to operate on objects or classes. Many kinds of functions exist. In most programming language, there are constructors (functions automatically called when an instance of the class is created), destructors (functions automatically called when an instance of the class is deleted or otherwise destroyed), and conversion operators (functions that define how certain operators, e.g. the equality operator, work with instances of the class).

Classes: Implementation in Python

In Python, the simplest form of a class definition looks like this:

class HashObject:
	
	...
	

Class definition must precede any instantiation of the class to have any effect. The statements inside a class definition will usually be function definitions, but other statements are allowed. The function definitions inside a class normally have a peculiar form of argument list, dictated by the calling convention for functions.

Class objects support attribute references and instantiation. Attribute reference use the standard syntax used for all attribute references in Python: obj.name. If the class definition looked like this:

class HashObject:
	i = 123
	def myfunc(self)
		return 'A simple function definition'

then HashObject.i and HashObject.myfunc are valid attribute references. Class attribute can be assigned, so you can change the value of HashObject.i by assignment.

Class instantiation uses function notation; just pretend that the class object is a parameterless function that returns a new instance of the class. For example,

h = HashObject()

creates a new instance of the class HashObject and assigns this object to the local variable h.

The instantiation operation creates an empty object. Many classes like to create objects with instances customized to a specific initial state. Therefore, a class may define a special method, called a constructor. Here is an example:

def __init__(self):
	self.data = []

When a class defines an __init__() method, class instantiation automatically invokes __init__() for the newly-created class instance. In the case of HashObject, a new, initialized instance can be obtained by:

h = HashObject()

The __init__() method may have arguments for greater flexibility. In that case, arguments given to the class instantiation operator are passed on to init. For example:

class HashObject:
   def __init__(self, num):
      self.numBuckets = num

>>> h = HashObject(51)
>>> x.numBuckets
51

With instance objects, there are two kinds of valid attribute names, data attributes and methods. Data attributes correspond to data members in C++. Like local variables, they do not need to be declared and spring into existence when they first have a value assigned to them. For example, if we created an instance of HashObject called h, we can create and print out the value of x like so:
>>> h.x = 123
>>> print(‘h.x = ‘,h.x)
123
The other kind of instance attribute is a method. A method is a function that belongs to an object. Valid method names of an instance object depend on its class. By definition, all attributes of a class that are function objects define corresponding methods of its instances.

Usually, a method is called right after it is bound. Suppose we add a function definition to HashObject:

class HashObject:
    def test(self):
        print('Hello, world!')

We can call the method in object h like so:
>>> h.test()
Hello, world!

Notice that h.test() was called without an argument, even though the function definition for test() specifies an argument. This is where Python deviates from C++. In this example, the call h.test() is exactly equivalent to HashObject.test(h). In general calling a method with a list of n arguments is the same as calling the corresponding function with an argument list that is created by inserting the method’s object before the first argument.

Python classes also supports inheritance. The syntax for a derived class definition looks like this:

class DerivedClass(BaseClass):
	
	...
	

When the class object is constructed, the base class is remembered. If a requested attribute is not found in the class, the search proceeds to look in the base class. This rule is applied recursively if the base class itself is derived from some other class.

If we create an instance of DerivedClass, method references are resolved as follows: the corresponding class attribute is searched, descending down the chain of base classes if necessary, and the method reference is valid if this process yields a function object. Derived classes may also override methods of their base classes. Because methods have no special privileges when calling other methods of the same object, a method of a base class that calls another method defined in the same base class may end up calling a method of a derived class that overrides it. This is similar to virtual functions in C++, where a function is left undefined in the base class but defined in one or more derived classes.

Python also supports a form of multiple inheritance as well. A class definition with multiple base classes looks like this:

class DerivedClass(BaseClass1, BaseClass2, BaseClass3)
	
	...
	

The search for attributes is depth-first, left to right, so if an attribute is not found in the derived class, it is searched for in BaseClass1, then the base classes of BaseClass1, then BaseClass2, and so on. It should be mentioned that the multiple inheritance method resolution in Python uses a dynamic algorithm to linearize the search order, both preserving the search order, and ensuring that each parent is called only once.

One significant way in which Python classes differ from C++ classes is that there is no means of declaring functions or variables to be private or protected. Thus, there are no private variables in Python. The closest you’ll get are variables that start with a double underscore, like __variable. Such a variable cannot be directly referenced outside the class definition, at least not directly. Thus, if we have something like this:

class HashObject:
	def __init__(self, num):
		__numBuckets = num

and we create an instance of HashObject:

>>> h = HashObject(47)

We can’t access __numBuckets directly:

>>> print(h.__numBuckets)

will result in an error message.

Now that we have a solid foundation for understanding Python classes, we should be able to rewrite our hash functions from the previous article as a class. We will do that in the next article.

External Links:

Classes Tutorial on python.org

Class (computer programming) on Wikipedia

Hash Functions in Python

HashingThe efficient sorting algorithms presented in the previous articles all have their own advantages and drawbacks, and all of them – merge sort, heapsort, and quicksort – have an average complexity of O(n log n). Once a list has been sorted, it becomes relatively easy to find an item, using binary search techniques. But the overhead of doing the sort is an issue, and you may have been wondering if there is a way of avoiding this overhead.

In fact, there is a means of finding items quickly without having the overhead of sorting the list, albeit at the cost of having everything neatly organized into a sequentially-organized list. We can use a hash function. A hash function is any function that can be used to map digital data of arbitrary size to digital data of fixed size, with slight differences in input data producing very big differences in output data. The values returned by a hash function is called a hash value. Hash functions are primarily used in hash tables to quickly locate a data record. Typically, one field of the record will be designated as the search key. The hash function will be used to map the search key to an index. The index gives the place in the hash table where the corresponding record will be stored.

Implementing Hash Functions

To provide a concrete example, let’s assume we have a series of employee records for a company. Let’s also assume that the records consist of four fields: employee ID, name, job description, and number of years with the company. Assume that employee IDs are unique and that no ID is assigned to more than one worker. We have an employee record that looks like this:

[2175, ‘Homer Simpson’,’Technician’,20]

We decide to use the employee ID as the search key. This ID number is used as input for the hash function, which returns a value (e.g. 33). We store the record in the 33rd location in the hash table. Each location is known as a bucket. A very simple hash function might look something like this:

def hashElement(elem,num):
    if type(elem) == int:
        return elem%num

This function simply takes an integer as input, and returns the modulus of the integer, thus mapping the integer into a more limited range of numbers so we can store the data in a limited number of buckets. We can also generate an empty list, which will serve as our buckets:

def createBuckets(num):
    hashSet = []
    for i in range(num):
        hashSet.append([])
    return hashSet

In Python, we can nest lists inside lists, so it is very easy to represent a record as a list, and then store the list at a location within a list. Our insert function is coded as follows:

def insert(hashSet,i,num):
    hashSet[hashElement(i[0],num)].append(i)

The insert function takes as its parameters a list which contains the buckets, a second list representing the record to be stored in one of the buckets, and the number of buckets. The function calls hashElement to get the index at which to store the record, and appends the record to that location.

Note that it is possible for more than one record to hash to the same location. We call this a collision, and we deal with it by allowing multiple items at the same bucket. This complicates removal of records, since we have to make sure we don’t delete all records in the bucket, just the record that matches the one we want to delete:

def remove(hashSet,i,num):
    newElement = []
    for j in hashSet[hashElement(i,num)]:
        if j[0] != i:
            newElement.append(j)
    hashSet[hashElement(i,num)] = newElement

The remove function takes as arguments the list containing the buckets, an integer representing the employee ID of the employee whose record is to be deleted, and the number of buckets. The function iterates through the bucket containing the employee’s record, and copies all records except that record into a temporary location. It then copies the temporary list into the bucket location. We’ll provide one more function to take an employee ID number and return True if the record exists and False if it does not exist:

def member(hashSet,i,num):
    return i in hashSet[hashElement(i,num)]

Now we can write some code to test these functions:

>>> numBuckets = 51
>>> b = createBuckets(numBuckets)
>>> insert(b,[2175,’Homer Simpson’,’Technician’,20],numBuckets)
>>> print(b[hashElement(2175,numBuckets)])

Here we create a list with 51 buckets, insert a record into the list, and print the record. Running this code results in the following output:

[[2175, ‘Homer Simpson’, ‘Technician’, 20]]

Note that if we try to print out a nonexistent record, the hash function will still work, but the bucket may be empty:

>>> print(b[hashElement(101,numBuckets)])
[]

We could insert additional records if needed:

>>> insert(b,[4158,’Elmer Higgins’,’Engineer’,17],numBuckets)

It is possible to have a collision. If we insert this record:

>>> insert(b,[2583,’Waylon Smithers’,’Assistant’,25],numBuckets)

then there will be a collision at the bucket where we stored the first record:

>>> print(b[hashElement(2175,numBuckets)])
[[2175, ‘Homer Simpson’, ‘Technician’, 20], [2583, ‘Waylon Smithers’, ‘Assistant’, 25]]

But we can delete the first record and leave the other record in the bucket intact:

>>> remove(b,2175,numBuckets)
>>> print(b[hashElement(2583,numBuckets)])
[[2583, ‘Waylon Smithers’, ‘Assistant’, 25]]

When hashing data, we want to ensure that the hash function maps roughly the same number of inputs to each hash value. The method we are using where index = num % numBuckets should work pretty well in our case, where each input value (the employee ID) is equally likely. Another way of mapping the numbers to the bucket range might be to divide the key value by a number But in some cases, it would be less than ideal. For instance, most patrons of a supermarket will live in the same geographic area, so their phone numbers are likely to begin with the same 3 or 4 digits. In that case, the division formula will generate a lot of collisions. However, the modulus formula might work quite well, since it is more sensitive to the trailing digits.

One problem with our hashing functions is that the number of buckets is constantly being passed as a parameter. We could solve this problem by making numBuckets a global variable, but there is a better solution, which we will cover in the next article.

External Links:>

Hash function on Wikipedia

Python File I/O

Python file I/OIn the previous articles, we covered quite a bit of rudimentary Python programming, but we haven’t covered on of the most important elements of any programming language: the ability to perform file I/O. In this article, we will cover file operations, and put it into practice by using Python file I/O to generate a file of numbers to sort with the quicksort algorithm developed in the previous article.

Python File I/O: Simple Commands

To open a file, we use open(), which returns a file object, and is commonly used with two arguments: open(filename, mode).

>>> f = open(‘mylist.txt’,’w’)

The first argument is a string containing the filename. The second argument is another string describing how the file will be used. ‘r’ is specified when the file will only be read, ‘w’ for only writing (an existing file with the name name will be erased), and ‘a’ opens the file for appending. Any data written to the file is automatically added to the end. ‘r+’ opens the file for both reading and writing. The mode argument is optional: ‘r’ is the default value if it is omitted.

On Windows, ‘b’ appended to the mode opens the file in binary mode. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This modification is OK for ASCII text files, but it will corrupt binary data like that in JPEG or EXE files. On Unix, it doesn’t hurt to append a ‘b’ to the mode, so you can use it platform-independently for all binary files.

To read a file’s content, call the read() function of the object returned by open() (in the above example, f). read() reads some quantity of data and returns it as a string. When size is omitted or negative, the entire contents of the file, so you probably want to specify a number of bytes if the file is large. If the end of the file has been reached, f.read() will return an empty string:

>>> f.read()
‘This is the contents of a text file.\n’

readline() reads a single line from the file; a newline character is left at the end of the string. It is only omitted on the last line of the file if the file does not end in a newine. Thus if f.readline() returns an empty string, the end of file has been reached, while a blank line is represented by ‘\n’.

For readling lines from a file, you can loop over the file object:

>>> for line in f:
print line

If you want to read all the files of a file, you can also use list(f) or f.readlines().

That covers read operations. write(string) writes the contents of string to the file, returning None.

To write something other than a string, it needs to be converted to a string first; e.g.:

>>> value = (‘The sum is ‘,101)
>>> s = str(value)
>>> f.write(s)

f.tell() returns an integer giving the file object’s current position in the file, measured in bytes from the beginning of the file. To change the file’s object position, we use f.seek(offset, param). The position is computed from adding offset to a reference point. The reference point is selected by the param argument. A param value of 0 measures from the beginning of the file; 1 uses the current file position, and 2 uses the end of the file as the reference point. The second parameter can be omitted; the default value is 0, using the beginning of the file as a reference point. You may have noticed that the behavior of this function is similar to that of the fseek() function in C/C++. When you’re done with a file, call f.close() and free up any system resources taken up by the open file. After calling f.close(), attempts to use the file will automatically fail.

Python File I/O: Serialization

Finally, you can write data to a file easily with serialization. The read() function only returns strings, and when you want to save complex data types like nested lists and dictionaries, it can become unwieldy. Python, however, allows you to use the popular data interchange format called JSON (JavaScript Object Notation). The standard module called json can take Python data hierarchies, and convert them to string representations. This process is called serializing, and reconstructing the data from the string representation is called deserializing. To write an object x to a file object f opened for writing, we use:

json.dump(x, f)

And to decode the object, we use the following:

x = json.load(f)

This simple serialization technique can handle lists and dictionaries without any additional effort, but serializing arbitrary class instances in JSON requires some extra work.

Here’s a chance to apply what we know about Python file I/O to generate a file containing randomly-generated numbers. We will then read the data into a list, and use quicksort to sort the list. We start with a function to generate the file:

def generateNumbers(filename,num):
'''Use Python file I/O to generate a file of name 
filename and fill it with num randomly-generated 
integers of range 1 to num'''
temp = []
f = open(filename,'w')
for i in range(0,num):
temp.append(int((random.random()*num)+1))
json.dump(temp,f)
f.close()

As you can see, I used JSON to save the data to the file, which should make the process of reading the numbers in easier. Now for the function to read in the data:

def readList(filename):
'''Use Python file I/O to read in a file of name 
filename using serialization'''
retval = []
f = open(filename,'r')
temp = []
try:
retval = json.load(f)
except EOFError:
print('EOF error')
f.close()
return retval

The following code uses Python file I/O to generate a list of 1000 integers, reads in the same list, sorts the list and prints out both the size of the list along with the sorted list:

generateNumbers('mylist.lst',1000)
myList = readList('mylist.lst')
quickSort(myList)
print('Size of myList = ',len(myList))
print(myList)

We could save the list of integers as ASCII text, but if we did, we would have to parse the list when we read it back in, which would be somewhat cumbersome. Serialization makes the process easy, and it is undoubtedly a technique we will use in the future.

External Links:

Reading and writing Files at docs.python.org

Quicksort: A Python Implementation

QuicksortIn the previous articles, we introduced sorting algorithms and went through the process of coding two efficient sorting algorithms: merge sort and heapsort. In this article, we’ll take a look at one more efficient sorting algorithm: quicksort.

The quicksort algorithm was developed in 1960 by Tony Hoare, and has since gained widespread adoption. A quicksort function has been part of C since 1975, and was standardized as part of ANSI C in 1989. Python does not have a build-in quicksort function, but quicksort is easy enough to code.

Implementing the Quicksort Algorithm

Like all the other efficient sorting algorithms covered in this series of articles, the quicksort algorithm is a recursive, divide and conquer algorithm. It is based on the following steps:

  1. Pick an element (called a pivot) from the list.
  2. Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it. Equal values can go either way.
  3. Apply the above steps recursively to each half of the list.

The pseudocode for this algorithm is as follows:

quicksort(L, low, high):
	if low < high
		p = partition(L, low, high)
		quicksort(L, low, p-1)
		quicksort(L, p+1, high)

The manner in which the pivot is chosen can be key. In some implementations, the leftmost element of the partition would often be chosen as the pivot. This would lead to worst case behavior in already-sorted lists. This illustrates the basic problem with quicksort: if we keep choosing the lowest element (or highest element) as the pivot, then quicksort becomes like a selection sort. One possible alternative is to choose the midpoint, which generally leads to good average case behavior. One possibility, suggested by computer scientist Robert Sedgewick, is to choose the median of the first element, the last element, and the midpoint. This guarantees a better estimate of the optimal pivot in an unsorted list. Here is my Python function for choosing a pivot:

def choosePivot(L,low=0,high=-1):
    if high == -1:
        high = len(L)-1
    mid = low+int((high-low)/2)
    # Return the median of low, middle, and high
    if L[low] < L[mid] and L[low] < L[high] and L[mid] < L[high]:
        return mid
    elif L[mid] < L[low] and L[mid] < L[high] and L[low] < L[high]:
        return low
    else:
        return high

Most of the work will be done by the partitioning function, which performs the following steps:

  1. Choose a pivot.
  2. Move the pivot to the end of the list, putting it out of the way (temporarily).
  3. Initialize two indices, i and j, to the index of the first list element.
  4. If list[i] is less than the pivot, swap it with list[j] and increment j.
  5. Increment i, and repeat step [4] as long as i is less than the index of the last list element.
  6. Swap the pivot (the last list element) with list[j]. Now the list is partitioned.

The Python representation of this function is as follows:

def partition(L,low=0,high=-1):
    if high == -1:
        high = len(L)-1
    # Select a pivot
    pivot = choosePivot(L,low,high)
    # Put the pivot at the end where it
    # will be out of the way
    swap(L,pivot,high)
    # Separate the list into elements 
    # less than and greater than pivot
    j = low
    for i in range(low,high+1):
        if L[i] < L[high]:
            swap(L,i,j)
            j += 1
    # Put the pivot between the two halves
    swap(L,high,j)
    return j

The swap function from heapsort is re-used:

def swap(L,x,y):
    temp = L[x]
    L[x] = L[y]
    L[y] = temp

Now that we have functions to choose the pivot and partition the list, writing the code for the quicksort function is relatively easy:

def quickSort(L,low=0,high=-1):
   '''Sort a list by selecting a "pivot", dividing
   the list into elements less than and greater than
   the pivot, then apply the sort recursively on the
   two list halves'''
   if high == -1:
      high = len(L)-1
   if high > low:
      # Partition into two halves, then sort
      # each half
      pivot = partition(L,low,high)
      # Check to see if pivot > 0; necessary
      # because if pivot == 0, then high = -1
      # and whole list is sorted
      if pivot > 0:
         quickSort(L,low,pivot-1)
      quickSort(L,pivot+1,high)

Recursion Errors (and a Solution)

This implementation of the function will work most of the time, but during testing, I encountered a potential issue. Assume that you have an unsorted list with a lot of duplicates – for example, the following list:

[3, 4, 2, 3, 1, 4, 2, 2, 3, 1, 3, 1, 2, 4, 1, 3, 2, 1, 3, 2]

After a few partitionings, you will likely end up with smaller sublists of all the same number. In such a case, the pivot value returned will be zero. If we allow the recursion to continue, quicksort will be called on a list of n-1 items, then n-2 items, and so on. If the lists are big enough, Python will reach the maximum allowed number of levels of recursion and the interpreter will return an error. Since we don’t want this to happen, I made a further modification to the partition function. After we are done partitioning the list, we will iterate through the list. If there are any two items that are not equal, we will stop iterating through the list and return the pivot value. If not, we will return -1, which will end any further recursive calls to quicksort. The new partition function looks like this:

def partition(L,low=0,high=-1):
    '''partition function for quicksort
    implementation'''
    if high == -1:
        high = len(L)-1
    # Select a pivot
    pivot = choosePivot(L,low,high)
    # Put the pivot at the end where it
    # will be out of the way
    swap(L,pivot,high)
    # Separate the list into elements 
    # less than and greater than pivot
    j = low
    for i in range(low,high+1):
        if L[i] < L[high]:
            swap(L,i,j)
            j += 1
    # Put the pivot between the two halves
    swap(L,high,j)
    for i in range(low,high):
        if L[i] != L[i+1]:
            return j
    return -1

In the worst-case scenario, we have to look through the entire list. Thus, instead of having a maximum of N comparisons for a list of size N, we will have 2N comparisons, but at least we avoid exponential growth in the number of operations, and this will prevent our program from crashing when there are a lot of duplicate items.

I also had to make a slight modification to the quicksort function. Here is the newly-modified function:

def quickSort(L,low=0,high=-1):
    '''quicksort: Sort a list by selecting a "pivot", 
    dividing the list into elements less than and greater 
    than the pivot, then apply the sort recursively on the 
    two list halves'''
    if high == -1:
        high = len(L)-1
    if high > low: 
        # Partition into two halves, then sort
        # each half
        pivot = partition(L,low,high)
        # Check to see if pivot > 0; necessary
        # because if pivot == 0, then high = -1
        # and whole list is sorted
        if pivot > 0:
            quickSort(L,low,pivot-1)
        if pivot > -1:
            quickSort(L,pivot+1,high)

This should work fairly well, but obviously, I may have overlooked something, so if anyone has any further suggestions for improvements to this quicksort implementation, I would love to hear them. The source code for this implementation can be downloaded here.

Built in Sort Functions: Python and C

Finally, it should be mentioned that Python lists have their own sort function. In our example, if we had a list called myList, we could make the following call to sort the list:

myList.sort()

sort takes up to three optional parameters in Python 2.7: cmp, key, and reverse. cmp specifies a function of two arguments (list items) which should return a negative, zero, or positive number depending on whether the first argument is considered smaller than, equal to, or greater than the second argument. key specifies a function of one argument that is used to extract a comparison key from each list element (e.g. str.lower). Reverse is a boolean value. If set to True, then the list elements are sorted as if each comparison were reversed. Newer versions of Python have eliminated the first argument (cmp), which should not be a major issue, since you can still sort the list in order (or reverse order using the “reverse” parameter). The key paramter is potentially useful in a number of cases. Assume we have a list of lists, where each of the nested lists represents an employee record. We want to sort the records by employee ID, but the employee ID is the third item in each record. To sort by employee ID, all we need to do is define a key function:

def getKey(a):
    return a[2]

and call the sort function with getKey for the key parameter.

Quicksort is probably the easiest of the efficient sorting algorithms covered so far to understand and code. And in C/C++, we don’t even have to code it, as we are provided with the qsort function, which takes 4 arguments: a pointer to the array to be sorted; the number of items to be compared; the size of each item in the array, and a pointer to a function to compare two items. The comparison function should have as parameters pointers to the two items being compared, and should return -1 if the first item is less than the second, 0 if they are equal, and 1 if the first item is greater than the second (in order to sort the array in ascending order). Reverse the return values to sort the array in descending order.

External Links:

Quicksort on Wikipedia

Qsort page on cplusplus.com

Heapsort: Part Two

HeapsortIn the previous article, we introduced the heapsort algorithm and coded several helper functions in Python. In this article, we will finally code the heapsort function and test it on some input data.

To recap, we already introduced the following functions:

  • heapify: Turns a list into a binary heap.
  • rebuildHeap: Accepts as input a heap with the root node deleted and rebuilds the heap.
  • consolidateHeap: Moves all empty nodes to the bottom of the heap so we can shrink the heap.
  • swap: Swaps the contents of two variables.

Since we did all the heavy lifting in writing the helper functions, coding heapSort is rather simple:

def heapSort(L):
   '''Given a list, perform a binary sort on it
   '''
   # First, reconstitute list as a binary heap: 
   heapify(L)
   high = len(L)
   # While there is still data in the heap,
   # remove the root node, rebuild and
   # consolidate the heap, and move the old
   # root node to the end
   while high > 0:
      # Delete root node; save in temp
      temp = L[0]
      L[0] = None
      # Rebuild and consolidate heap
      rebuildHeap(L,0,high)
      consolidateHeap(L,high)
      high -= 1
      # Move old root node to end
      L[high] = temp

As you can see, once we create the heap, all we are doing is [1] deleting the root node from the heap by setting its value to “None”; [2] rebuild the heap and consolidate the heap; [3] decrement the variable indicating the size of the heap; [4] move the old root node back into the list right after the heap; [5] repeat until the heap size is 0. You may have noticed that when we call rebuildHeap and consolidateHeap, we specify the size of the heap. This is to ensure the sorted list which we are inserting at the end of the list is not treated as part of the heap.

To test our heapsort function, I reused the list from the previous article, and used the following code:

>>> myList = [123,79,29,997,519,17,239,144,51,372]
>>> heapSort(myList)
>>> print(myList)
[17, 29, 51, 79, 123, 144, 239, 372, 519, 997]

As you can see, the heapsort function seems to work, and has the added advantage (over merge sort) of supporting the ability to sort in place. In the next article, we’ll examine one more efficient sorting algorithm: quicksort.

The source code for this heapsort implementation can be found here.

External Links:

Heapsort on Wikipedia

Heapsort: Part One

 

 

HeapsortIn the previous article, we applied the concept of recursion to implementing an efficient sorting algorithm (merge sort). In this article, we will apply our knowledge of Python to a different sorting algorithm: the heapsort.

Heapsort involves taking an unsorted list and first forming a binary heap from the list. A binary heap is a data structure in which each node has at most two child nodes. The parent node is always greater than either child node. Therefore, the root node (the only node that is not a child of any node) must be greater than any other node. We can thus sort the list by removing the root node from the heap and then rebuilding the heap, which is now one item smaller. We can insert the deleted root node after the heap, and thus sort the list in place.

We are going to need several functions in order to make this all work:

  1. The main heapSort function.
  2. A helper function to generate the binary heap.
  3. A helper function to rebuild the binary heap after we delete the root node.

The pseudocode for heapsort looks like this:

heapsort(myList):
   heapify(myList)
   high = len(myList)
   while high > 0:
      temp = myList[0]
      rebuildHeap(myList)
      high -= 1
      myList[high] = temp

Implementing Heapsort: The heapify Function

The heapify function, which simply creates a binary heap from a list of items, is easy to code, so perhaps we should start with it:

def heapify(L,root=0):
   # Given a list and the index of the root node,
   # form a heap (for heapsort)
   if len(L)-root-1 > 0:
      lnode = root*2+1
   i = lnode
   nsiblings = 2
   while i < len(L): if i + nsiblings > len(L):
      nsiblings = len(L)-i
      for j in range(i,i+nsiblings):
      if L[root] < L[j]:
         swap(L,root,j)
      i = i*2+1
      nsiblings *= 2
      if lnode < len(L):
         heapify(L,lnode)
      if lnode+1 < len(L): 
         heapify(L,lnode+1) 

As simple as it is, we’re probably going to do a lot more swap operations, so I decided to make a function for it:

def swap(L,x,y): 
   # Swap two items 
   temp = L[x] 
   L[x] = L[y] 
   L[y] = temp 

This code does not require much explanation. The heapify function takes two arguments: a list and the index of the root node (default value is 0). If the difference between the first and last index is less than or equal to zero, then there are less than two list entries, and the input list is already a binary heap. If not, we iterate through the remaining items, starting with the left and right child nodes, then moving on to the children of those nodes, and so on, skipping the nodes that are not children of the root node. When we are done, the correct value is in the root node, and all that remains to be done is to apply the function recursively to the left node (lnode) and the right node (lnode+1). It’s good practice to test each function individually, so let’s provide some test input for heapify:

>>> mylist = [1,2,3,4,5,6,7,8,9,10]
>>> heapify(mylist)
>>> mylist
[10, 9, 6, 7, 8, 2, 5, 1, 4, 3]
>>>

If we represent this heap graphically, we can confirm that it is a valid binary heap:

     10
    /  \
   9    6
  /\   / \
 7  8  2  5
 /\ /
1 4 3

Implementing Heapsort: rebuildHeap and consolidateHeap

We could write our heapsort algorithm by just removing the root node from the heap and running heapify on the remaining items. That would require many more operations, however than is necessary. Instead, we will write a separate rebuildHeap function that does the following:

  1. Make the higher-valued child node the new root node.
  2. If the left node was promoted, run rebuildHeap recursively on the left node. If the right node was promoted, run rebuildHeap recursively on the right node.

You are probably wondering how we are going to denote a deleted node. Fortunately, Python has the built-in value “None” which is essentially the Python equivalent of NULL. If we want to delete a node, we will change its value to None:

>>> mylist[0] = None

This deletes the root node of the heap, and we now must rebuild the heap. Now is a good time to introduce the rebuildHeap function:

def rebuildHeap(L,root=0,size=-1):
   '''Given a heap with the root node deleted,
   restore the heap (for heapsort)'''
   if size == -1:
      size = len(L)
   lnode = root*2+1
   if size-root-1 > 0 and lnode < size: 
      # If right node does not exist or left node 
      # is greater than right node, promote left 
      # node 
      if lnode + 1 >= size or L[lnode] >= L[lnode+1]:
         L[root] = L[lnode]
         L[lnode] = None
         rebuildHeap(L,lnode,size)
      # Else if right node exists, promote right node
      elif lnode+1 < size:
         L[root] = L[lnode+1]
         L[lnode+1] = None
         rebuildHeap(L,lnode+1,size)

The rebuildHeap function is even simpler than the heapify function. We take three arguments: the list, the index of the parent node, and the size of the list. We provide default values for parent and size, so if we just want to rebuild the entire heap, we just need to specify the list. If the heap has at least 2 items, we promote the left node if either [1] there is no right node or [2] its value is greater than the right node. Otherwise, if the right node exists, we promote the right node. In either case, we have disturbed the heap integrity on one side by promoting a node, so we run the function recursively on whichever side we promoted. Note that whenever we promote a node, we leave a node with a value of “None” in its place, and at the end, we will have a childless node with a value of “None” as a placeholder.

We now have functions to both create a heap and rebuild the heap, but we don’t have any means of shrinking the heap. This is unfortunate, as we really want to be able to sort the list in place, and insert entries after the heap. But we cannot do this, because the empty nodes are not always at the end of the heap. We can illustrate the problem by running the following code:

myList = [123,79,29,997,519,17,239,144,51,372]
heapify(myList)
print(myList)
while myList[0] != None:
myList[0] = None
rebuildHeap(myList)
print(myList)

This code creates a list, turns it into a binary heap and prints out the initial heap. On each pass through the loop, we delete the root node, then rebuild the heap and print the results until there is nothing left in the heap. Running this code yields the following result:

[997, 519, 239, 144, 372, 17, 29, 79, 51, 123]
[519, 372, 239, 144, 123, 17, 29, 79, 51, None]
[372, 144, 239, 79, 123, 17, 29, None, 51, None]
[239, 144, 29, 79, 123, 17, None, None, 51, None]
[144, 123, 29, 79, None, 17, None, None, 51, None]
[123, 79, 29, 51, None, 17, None, None, None, None]
[79, 51, 29, None, None, 17, None, None, None, None]
[51, None, 29, None, None, 17, None, None, None, None]
[29, None, 17, None, None, None, None, None, None, None]
[17, None, None, None, None, None, None, None, None, None]
[None, None, None, None, None, None, None, None, None, None]

As you can see, the empty nodes are interspersed with valid data. We need a means of consolidating the heap. The consolidation routine for our heapsort algorithm involves the following:

  1. Starting at the end of the heap, look for an empty node.
  2. If an empty node is found, start at the end of the heap and look for a valid node.
  3. Swap the empty and valid node.
  4. Check if the valid node’s new parent’s value is less than that of the newly-moved node. If so, swap them and repeat the process of comparing the parent’s value to the child’s, substituting the new parent node. Continue until the child is less than parent or we have reached the root node.
  5. Repeat the entire process until there are no more nodes to check.
def consolidateHeap(L,size=-1):
   '''Consolidate nodes in the
   heap; swap where necessary (for heapsort)'''
   if size == -1:
      size = len(L)
   i = size-2
   while i > 0:
      # Check for empty nodes
      if L[i] == None:
         child = i
         j = size-1
         # Find the last non-empty node
         # in the heap and swap
         while L[j] == None and j > 0:
            j -= 1
         if (j > 0):
            swap(L,child,j)
         # Maintain heap integrity by swapping
         # parent and child nodes if necessary
         parent = int((child-1)/2)
         while parent > 0 and L[parent] < L[child]:
            swap(L,parent,child)
            child = parent
            parent = int((child-1)/2)
      i -= 1

This function is a little more complex than the others, but running our test with the consolidateHeap function inserted after rebuildHeap yields the following result:

[997, 519, 239, 144, 372, 17, 29, 79, 51, 123]
[519, 372, 239, 144, 123, 17, 29, 79, 51, None]
[372, 144, 239, 79, 123, 17, 29, 51, None, None]
[239, 144, 51, 79, 123, 17, 29, None, None, None]
[144, 123, 51, 79, 29, 17, None, None, None, None]
[123, 79, 51, 29, 17, None, None, None, None, None]
[79, 29, 51, 17, None, None, None, None, None, None]
[51, 29, 17, None, None, None, None, None, None, None]
[29, 17, None, None, None, None, None, None, None, None]
[17, None, None, None, None, None, None, None, None, None]
[None, None, None, None, None, None, None, None, None, None]

Thus, the consolidateHeap function does what it is supposed to do, and we can sort in place with heapsort. In the next article, we will use these functions together to implement heapsort.

External Links:

Heapsort on Wikipedia

Merge Sort: A Python Implementation

Merge sortIn the previous article, we introduced the concept of modules and functions, and even created a simple Python program to test for palindromes. In this article, we will continue our look at recursion and put some of the concepts introduced in previous articles into practice by coding the merge sort algorithm in Python.

For this article, we need to introduce the concept of a sorting algorithm. A common problem in computer science is the need to sort a list of items. A sorting algorithm is an algorithm that puts elements of a list into order. The most-used orders are numerical order and lexicographical order. But sorting a list efficiently has been problematic. Early search algorithms increased exponentially with the size of the list. They were on an order of n squared; that is, if the number of list items doubled, the number of list operations increased fourfold; if the list size quadrupled, the number of list operations increased sixteen-fold, and so on.

One of the simplest sort algorithms consists of simply going through a list of unsorted items sequentially, comparing the item to all remaining items, and swapping items that are out of place. We start at the first item, and compare the item to all other items, swapping the first item with the item to which we are comparing it when necessary. Once we have reached the end of the list, we know the first item of the list has the lowest value of any item on the list, and we can move on to the second item. We keep going, updating the list pointer, until we reach the end of the list. This is called a selection sort, and it is simple to implement:

def selectionSort(L):
	'''Implement a simple selection sort:
        (Not as effecient as merge sort)
	Compare each element with each subsequent
	item on the list; if they are out of place,
	swap them; keep going until we rech the end 
	of the list'''
	for i in range(0,len(L)-1):
		for j in range(i+1,len(L)):
			# If items out of place, switch
			if L[i] > L[j]:
				temp = L[i]
				L[i] = L[j]
				L[j] = temp

In this function, we have two for loops, one nested inside the other one. If the lower-indexed list member is greater than the higher-indexed one, we swap them. Although this will work, you can see the problem with selection sorts. For a list of n items, for each item, we must carry out as many comparisons as the current list position minus one. For the first list item, we have n-1 comparisons; for the second list item, n-2, and so on. There is an average of n/2 comparisons for each item, and we make n-1 passes through the list. That leaves us with (n^2-n)/2 operations. Dropping out lower terms, this means the bubble sort is on the order of O(n^2). This is far from ideal, and generally bubble sorts are only acceptable for relatively small lists.

Merge Sort: An Efficient Sorting Algorithm

One possible alternative is the merge sort algorithm. Merge sort works as follows:

  1. Divide the list into two equally-sized sublists.
  2. Apply the merge sort algorithm recursively on the sublists.
  3. Merge the two sorted lists.

Recursion requires a base case, and in merge sort, the base case is a list size of 1. In this case, the list is already sorted. On all other levels of recursion we have to merge the sorted lists. Merging the lists requires us to move through each list, placing each item into a temporary location. At each level where there is a merge operation taking place, a total of n operations are carried out. Since we are halfing the lists at each level of recursion, however, there are only log(n) levels. To confirm this is the case, let us consider the case where there are 8 items on the list:

  • First level: Divide the list into two lists of four.
  • Second level: Divide the two lists of four into four lists of two.
  • Third level: Divide the lists in half – we have reached the base case. Merge together the lists of one. There are eight lists of one to merge into four lists of two, so there are 8*1 = 8 operations.
  • Second level: Merge the four lists of two into two lists of four. There are four lists of two, so there are 4*2 = 8 operations.
  • First level: Merge the two lists of four into a single list. There are two lists of four, so there are 8 operations.

For a list of 8 items, there are 4 levels of recursion, but the fourth level is the base case, so there are 3 levels at which a merge is performed. Doubling the size of the list would add another level of recursion. Thus, the number of levels is log(n). Multiply that by the number of operations at each level and we get n log(n) operations.

This is an improvement over the selection sort, so it may be worth implementing. The pseudocode for the merge sort function is relatively easy to write:

mergesort(mylist,low,high):
if low < high
mergesort(mylist,low,low+(high-low)/2)
mergesort(mylist,low+((high-low)/2)+1,high)
merge lists into temporary location
copy merged list into mylist

As we apply the merge sort function recursively, eventually low equals high, ending the recursion. The actual Python function is slightly harder to code, but not by much:

def mergeSort(L,low=int(0),high=int(-1)):
    # Implementation of the merge sort algorithm
    # If high has default value, set it to len(L)-1
    # (index of last item on the list)
    if high == -1:
        high = len(L)-1
    if low < high:
	# Apply mergeSort on each half of the list
        mergeSort(L,low,int((low+(high-low)/2)))
        mergeSort(L,low+int(((high-low)/2))+1,high)
	# Now, merge the lists in temporary location
        i = low
        j = low+int(((high-low)/2))+1
        lb = j
        k = 0
        temp = []
        while k <= high-low:
            if i < lb and j <= high and L[i] > L[j]:
                temp.append(L[j])
                j = j + 1
            elif i < lb and j <= high and L[i] <= L [j]:  
                temp.append(L[i])                       
                i = i + 1             
            elif j > high:
                temp.append(L[i])
                i = i + 1
            elif i >= lb:
                temp.append(L[j])
                j = j + 1
            k = k + 1
        # Copy the list to its original location
        j = 0
        for i in range(low,high+1):
            L[i] = temp[j]
            j = j + 1

Our mergeSort function takes three arguments: the list itself and the lower and upper indices, indicating which subset of the list to sort. I wanted to be able to call the function with just one parameter (the list) in cases where I wanted to sort the entire list, so I added default values for low and high: the default value for low should be 0 obviously, but the value for high depends on the size of the list. Since we can’t use len(L)-1 as the default value for high (since L is itself a formal parameter), I used -1 as the default value for high since it is obviously out of bounds. If high is -1, we set it to len(L)-1. It’s a kludge, but it serves its purpose.

Next, we check to see if low is less than high. If they are equal, we have reached a base case. If not, then we call the merge sort function recursively on each half of the list. When we are done, we merge the two lists. Merging the lists entails the following:

  1. Create a temporary list to store the merged list.
  2. Create and initialize indices for both lists (i and j in our example).
  3. Compare L[i] to L[j]. Place the smaller item into the temporary list.
  4. Increment the index for the list whose item was put in the temporary list.
  5. If the end of either list has not been reached, go back to [3].
  6. Place any remaining list items in the temporary list.
  7. Copy the temporary list into L.

One of the drawbacks of merge sort is that you have to allocate temporary space to initially store the merged lists. One possible alternative is a binary heap, which we will examine in the next article.

The merge sort source code is available here.

External Links:

Sorting algorithm at Wikipedia

Merge sort at Wikipedia

Python Modules; Introduction to Recursion

  Python moduleIf you quit the Python interpreter and enter it again without saving your program to a text file, the definitions you have made will be lost. Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the input for the interpreter and running it with that file as input instead. In doing so, we create scripts. As your program gets longer, you may also want to split it into several files for easier maintenance.

Python Modules

To support this, Python provides a way of putting definitions into a file and using them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a Python module can be imported into other modules or into the main Python modules.

A Python module is a file containing Python definitions and statements; such a file always ends with the suffix .py. Up to this point, we have not introduced Python definitions. A definition in Python always starts with def and is followed by its name. This is the equivalent of a function in C or Pascal. For example:

def hello():
        print(‘Hello, world!’)

is a very simple Python function to print out “Hello, world!”. If this code is saved in a file called hello.py, and is in Python’s path, you can import it at the Python command line:

>>> import hello

Once the Python module is imported, you can invoke the hello function with:

>>> hello.hello()

As with C and other languages, you can specify parameters. For example:

def isZero(a):
    if a == 0:
       print('a is zero')
    else:
       print('a is a non-zero number')

You can also return a value from the function:

def isZero(a):
     if a == 0:
        return True
     else:
        return False

As in other languages such as C/C++, we can specify default parameters. For example:

def isZero(a=0):

allows us to invoke the function isZero with no arguments; the interpreter will insert a value of 0 for a if no arguments are specified. However, a non-default argument cannot follow a default argument. Thus:

def isZero(a=0,b):

is not allowed, but:

def isZero(a,b=0):

is allowed.

This is a decent start, but it would be nice if we came up with a program that can do something useful.

Introduction to Recursion

In computer science, recursion is a method where the solution to a problem depends on solutions to smaller instances of the same problem. Recursion always involves the existence of one more more base cases in which an operation can be done directly on the input data. The other cases involve invoking the same function on a subset of the input data in a divide-and-conquer strategy. The approach can be applied to many types of problems – for example, finding palindromes. It shouldn’t be too difficult to come up with a Python module to solve this problem.

A palindrome is a word, phrase, number, or other sequence of symbols or elements that read the same forward or reversed: for example, “Race car”, or “A man, a plan, a canal – Panama”. A solution of the problem for finding a palindrome using recursion can be outlined as follows:

  1. If the input string length is 0 or 1, then we have a palindrome – return true
  2. If the input string length is greater than 1 but the first and last character match, apply the test recursively to the string minus the first and last characters
  3. If [1] and [2] don’t apply, then we don’t have a palindrome – return false

Successive applications of this process on the input string will eventually yield either a mismatch between the first and last character or an input string of length 0 or 1, and the test will be complete. We can code this algorithm in Python as follows:

def isPalindrome(myString):
  ''' Simple program to find palindromes, part of our Python module
  Parameters: myString => string to perform test on
  If length <= 1, it's a palindrome - return True
  If first char is the same as the last char, apply algorithm recursively
  Otherwise return False '''
  if len(myString) <= 1:
     return True
   elif myString[0].lower() == myString[len(myString)-1].lower():
     return isPalindrome(myString[1:(len(myString)-1)])
  return False
 

Our isPalindrome function takes in a single argument, myString. We introduced two hitherto unseen functions here. len() takes a single parameter – a string – and returns the length. lower() is a member of the string class and converts the string into lowercase. This ensures that our test is not case-sensitive. You may have noticed that there is one shortcoming of this algorithm: if there are spaces or any other alphanumeric content in the string, it will return false. I decided it would be easier to write a separate function to strip the non-alphanumeric characters out:

 def convertToAlphaNum(myString):
  ''' Iterate through the string and generate an output string with only the alphanumeric chars (also part of the palindrome.py Python module)
  Parameters: myString => the string to convert
  Returns: A copy of the string with all non-alphanumeric
  characters removed '''
  retval = ''
  for c in myString:
      if c.isalpha() or c.isdigit():
          retval += c
  return retval

All this function does is iterate through the input string and if a character is a letter or digit, it gets added to a new string. When it is done, the function returns the new string. We still need a function to read input from the user and call these functions, so that will be our next bit of code, and the last function in our Python module:

def testPalindrome():
 ''' Part of the palindrome. py Python module - Prompt user for string input
 Output whether it is a palindrome or not '''
 myString = str(input('Enter a string: '))
 if isPalindrome(convertToAlphaNum(myString)):
    print(myString,'is a palindrome')
 else:
    print(myString,'is not a palindrome')

This function simply prompts the user to input a string and uses the previous functions to determine whether or not it is a palindrome, and prints the results.

Once these functions are saved to a file, you can load the file into IDLE using the File -> Open menu option (or CTRL-O). The Python module will load into a separate window; from that window, select Run -> Run Module (or press F5). Then from the main IDLE window, you can run testPalindrome():

>>> testPalindrome()
Enter a string: sample text
sample text is not a palindrome
>>> testPalindrome()
Enter a string: A man, a plan, a canal – Panama
A man, a plan, a canal – Panama is a palindrome
>>> testPalindrome()
Enter a string: No ‘x’ in Nixon
No ‘x’ in Nixon is a palindrome
>>> testPalindrome()
Enter a string: No ‘x’ in Ford
No ‘x’ in Ford is not a palindrome
>>> testPalindrome()
Enter a string: Able I was, ere saw I Elba
Able I was, ere saw I Elba is a palindrome

The source code for for these functions is available as a single Python module, via this link.

It’s not the most elegant solution, but it does seem to work. In the next article, we will continue our look at programming in Python, including a second look at using recursion to solve problems.

External Links:

Python documentation from the official Python website