Heapsort: Part One

In the previous article, we applied the concept of recursion to implementing an efficient sorting algorithm (merge sort). In this article, we will apply our knowledge of Python to a different sorting algorithm: the heapsort.

Heapsort involves taking an unsorted list and first forming a binary heap from it. A binary heap is a data structure in which each node has at most two child nodes, and a parent node is always greater than or equal to either of its child nodes. Therefore, the root node (the only node that is not a child of any other node) must be greater than or equal to every other node. We can thus sort the list by repeatedly removing the root node from the heap and then rebuilding the heap, which shrinks by one item each time. By placing each removed root node in the slot just after the end of the shrinking heap, we can sort the list in place.

We are going to need several functions in order to make this all work:

  1. The main heapSort function.
  2. A helper function to generate the binary heap.
  3. A helper function to rebuild the binary heap after we delete the root node.

The pseudocode for heapsort looks like this:

heapsort(myList):
   heapify(myList)
   high = len(myList)
   while high > 0:
      temp = myList[0]
      rebuildHeap(myList)
      high -= 1
      myList[high] = temp

Implementing Heapsort: The heapify Function

The heapify function, which simply creates a binary heap from a list of items, is easy to code, so perhaps we should start with it:

def heapify(L,root=0):
   # Given a list and the index of the root node,
   # form a heap (for heapsort)
   if len(L)-root-1 > 0:
      lnode = root*2+1
      i = lnode
      nsiblings = 2
      while i < len(L):
         if i + nsiblings > len(L):
            nsiblings = len(L)-i
         for j in range(i,i+nsiblings):
            if L[root] < L[j]:
               swap(L,root,j)
         i = i*2+1
         nsiblings *= 2
      if lnode < len(L):
         heapify(L,lnode)
      if lnode+1 < len(L):
         heapify(L,lnode+1)

Simple as the swap operation is, we’re going to be doing a lot of swapping here and in the functions that follow, so I decided to make a function for it:

def swap(L,x,y): 
   # Swap two items 
   temp = L[x] 
   L[x] = L[y] 
   L[y] = temp 
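
Incidentally, the same swap can be written as a one-line tuple assignment, which avoids the temporary variable; either version behaves the same:

def swap(L,x,y):
   # Swap two items using tuple assignment
   L[x], L[y] = L[y], L[x]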

This code does not require much explanation. The heapify function takes two arguments: a list and the index of the root node (default value 0). If the difference between the root index and the last index is less than or equal to zero, there are fewer than two entries to consider, and the input is already a binary heap. If not, we iterate through the remaining items, starting with the left and right child nodes, then moving on to the children of those nodes, and so on, skipping any nodes that are not descendants of the root node, and swapping whenever a descendant is larger than the root. When we are done, the largest value is in the root node, and all that remains is to apply the function recursively to the left node (lnode) and the right node (lnode+1).

It’s good practice to test each function individually, so let’s provide some test input for heapify:

>>> mylist = [1,2,3,4,5,6,7,8,9,10]
>>> heapify(mylist)
>>> mylist
[10, 9, 6, 7, 8, 2, 5, 1, 4, 3]
>>>

If we represent this heap graphically, we can confirm that it is a valid binary heap:

            10
          /    \
        9        6
      /   \     / \
     7     8   2   5
    / \   /
   1   4 3
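
If you would rather not check the diagram by hand, a small helper function (not part of the heapsort code itself; I’ll call it isHeap here) can verify the max-heap property directly:

def isHeap(L):
   # Return True if every parent node is >= each of its children
   return all(L[i] >= L[c]
              for i in range(len(L))
              for c in (2*i+1, 2*i+2)
              if c < len(L))

>>> isHeap(mylist)
True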

Implementing Heapsort: rebuildHeap and consolidateHeap

We could write our heapsort algorithm by simply removing the root node from the heap and running heapify on the remaining items. However, that would require many more operations than necessary. Instead, we will write a separate rebuildHeap function that does the following:

  1. Make the higher-valued child node the new root node.
  2. If the left node was promoted, run rebuildHeap recursively on the left node. If the right node was promoted, run rebuildHeap recursively on the right node.

You are probably wondering how we are going to denote a deleted node. Fortunately, Python has the built-in value “None”, which is essentially the Python equivalent of NULL. If we want to delete a node, we will change its value to None:

>>> mylist[0] = None

This deletes the root node of the heap, and we now must rebuild the heap. Now is a good time to introduce the rebuildHeap function:

def rebuildHeap(L,root=0,size=-1):
   '''Given a heap with the root node deleted,
   restore the heap (for heapsort)'''
   if size == -1:
      size = len(L)
   lnode = root*2+1
   if size-root-1 > 0 and lnode < size: 
      # If the right node does not exist (or is empty), or the
      # left node is at least as large as the right node, promote
      # the left node (an empty node counts as lower than any value)
      if (lnode+1 >= size or L[lnode+1] is None or
            (L[lnode] is not None and L[lnode] >= L[lnode+1])):
         L[root] = L[lnode]
         L[lnode] = None
         rebuildHeap(L,lnode,size)
      # Else if right node exists, promote right node
      elif lnode+1 < size:
         L[root] = L[lnode+1]
         L[lnode+1] = None
         rebuildHeap(L,lnode+1,size)

The rebuildHeap function is even simpler than the heapify function. It takes three arguments: the list, the index of the root node, and the size of the heap. We provide default values for root and size, so if we just want to rebuild the entire heap, we only need to specify the list. If the heap has at least two items, we promote the left node if either [1] there is no right node (or the right node is empty) or [2] the left node’s value is greater than or equal to the right node’s. Otherwise, if the right node exists, we promote the right node. In either case, we have disturbed the heap integrity on the side of the promoted node, so we run the function recursively on whichever side we promoted. Note that whenever we promote a node, we leave a node with a value of “None” in its place, so at the end we will have a childless node with a value of “None” as a placeholder.
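
As a quick check, we can delete the root of the heap we built from mylist earlier and rebuild it; the interactive session should look something like this:

>>> mylist = [10, 9, 6, 7, 8, 2, 5, 1, 4, 3]
>>> mylist[0] = None
>>> rebuildHeap(mylist)
>>> mylist
[9, 8, 6, 7, 3, 2, 5, 1, 4, None]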

We now have functions to create a heap and to rebuild it, but we have no means of shrinking the heap. This is unfortunate, because we want to sort the list in place by inserting entries after the heap, and we cannot do that while the empty nodes are scattered throughout the heap instead of sitting at the end. We can illustrate the problem by running the following code:

myList = [123,79,29,997,519,17,239,144,51,372]
heapify(myList)
print(myList)
while myList[0] != None:
   myList[0] = None
   rebuildHeap(myList)
   print(myList)

This code creates a list, turns it into a binary heap and prints out the initial heap. On each pass through the loop, we delete the root node, then rebuild the heap and print the results until there is nothing left in the heap. Running this code yields the following result:

[997, 519, 239, 144, 372, 17, 29, 79, 51, 123]
[519, 372, 239, 144, 123, 17, 29, 79, 51, None]
[372, 144, 239, 79, 123, 17, 29, None, 51, None]
[239, 144, 29, 79, 123, 17, None, None, 51, None]
[144, 123, 29, 79, None, 17, None, None, 51, None]
[123, 79, 29, 51, None, 17, None, None, None, None]
[79, 51, 29, None, None, 17, None, None, None, None]
[51, None, 29, None, None, 17, None, None, None, None]
[29, None, 17, None, None, None, None, None, None, None]
[17, None, None, None, None, None, None, None, None, None]
[None, None, None, None, None, None, None, None, None, None]

As you can see, the empty nodes are interspersed with valid data. We need a means of consolidating the heap. The consolidation routine for our heapsort algorithm involves the following:

  1. Starting at the end of the heap, look for an empty node.
  2. If an empty node is found, start at the end of the heap and look for a valid node.
  3. Swap the empty and valid node.
  4. Check whether the value of the moved node’s new parent is less than that of the moved node. If so, swap them, and repeat the comparison with the next parent up. Continue until the child is less than its parent or we have reached the root node.
  5. Repeat the entire process until there are no more nodes to check.

Here is the consolidateHeap function:

def consolidateHeap(L,size=-1):
   '''Consolidate nodes in the
   heap; swap where necessary (for heapsort)'''
   if size == -1:
      size = len(L)
   i = size-2
   while i > 0:
      # Check for empty nodes
      if L[i] == None:
         child = i
         j = size-1
         # Find the last non-empty node
         # in the heap and swap
         while L[j] == None and j > 0:
            j -= 1
         if (j > 0):
            swap(L,child,j)
         # Maintain heap integrity by swapping
         # parent and child nodes if necessary
         parent = int((child-1)/2)
         while parent > 0 and L[parent] < L[child]:
            swap(L,parent,child)
            child = parent
            parent = int((child-1)/2)
      i -= 1

This function is a little more complex than the others. To test it, we can insert a call to consolidateHeap right after rebuildHeap in our test loop:
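
myList = [123,79,29,997,519,17,239,144,51,372]
heapify(myList)
print(myList)
while myList[0] != None:
   myList[0] = None
   rebuildHeap(myList)
   consolidateHeap(myList)
   print(myList)

Running the modified test yields the following result: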

[997, 519, 239, 144, 372, 17, 29, 79, 51, 123]
[519, 372, 239, 144, 123, 17, 29, 79, 51, None]
[372, 144, 239, 79, 123, 17, 29, 51, None, None]
[239, 144, 51, 79, 123, 17, 29, None, None, None]
[144, 123, 51, 79, 29, 17, None, None, None, None]
[123, 79, 51, 29, 17, None, None, None, None, None]
[79, 29, 51, 17, None, None, None, None, None, None]
[51, 29, 17, None, None, None, None, None, None, None]
[29, 17, None, None, None, None, None, None, None, None]
[17, None, None, None, None, None, None, None, None, None]
[None, None, None, None, None, None, None, None, None, None]

Thus, the consolidateHeap function does what it is supposed to do, and we can sort in place with heapsort. In the next article, we will use these functions together to implement heapsort.

External Links:

Heapsort on Wikipedia

Merge Sort: A Python Implementation

In the previous article, we introduced the concept of modules and functions, and even created a simple Python program to test for palindromes. In this article, we will continue our look at recursion and put some of the concepts introduced in previous articles into practice by coding the merge sort algorithm in Python.

For this article, we need to introduce the concept of a sorting algorithm. A common problem in computer science is the need to sort a list of items. A sorting algorithm is an algorithm that puts the elements of a list into order; the most-used orders are numerical order and lexicographical order. Sorting a list efficiently, however, is not trivial. Simple sorting algorithms scale quadratically with the size of the list: they are on the order of n squared, so if the number of list items doubles, the number of list operations increases fourfold; if the list size quadruples, the number of list operations increases sixteen-fold, and so on.

One of the simplest sort algorithms consists of simply going through a list of unsorted items sequentially, comparing the item to all remaining items, and swapping items that are out of place. We start at the first item, and compare the item to all other items, swapping the first item with the item to which we are comparing it when necessary. Once we have reached the end of the list, we know the first item of the list has the lowest value of any item on the list, and we can move on to the second item. We keep going, updating the list pointer, until we reach the end of the list. This is called a selection sort, and it is simple to implement:

def selectionSort(L):
    '''Implement a simple selection sort
    (not as efficient as merge sort):
    compare each element with each subsequent
    item on the list; if they are out of place,
    swap them; keep going until we reach the end
    of the list'''
    for i in range(0,len(L)-1):
        for j in range(i+1,len(L)):
            # If items are out of place, swap them
            if L[i] > L[j]:
                temp = L[i]
                L[i] = L[j]
                L[j] = temp
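
A quick test in the interactive interpreter (with an arbitrary sample list) shows the function at work:

>>> mylist = [123,79,29,997,519,17,239,144,51,372]
>>> selectionSort(mylist)
>>> mylist
[17, 29, 51, 79, 123, 144, 239, 372, 519, 997]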

In this function, we have two for loops, one nested inside the other. If the lower-indexed list member is greater than the higher-indexed one, we swap them. Although this works, you can see the problem with selection sorts. For a list of n items, each item must be compared with every item that comes after it: for the first list item, we make n-1 comparisons; for the second, n-2, and so on. That averages out to roughly n/2 comparisons per pass over n-1 passes through the list, which leaves us with (n^2-n)/2 operations. Dropping the lower-order terms, this means the selection sort is on the order of O(n^2). This is far from ideal, and selection sorts are generally only acceptable for relatively small lists.

Merge Sort: An Efficient Sorting Algorithm

One possible alternative is the merge sort algorithm. Merge sort works as follows:

  1. Divide the list into two equally-sized sublists.
  2. Apply the merge sort algorithm recursively on the sublists.
  3. Merge the two sorted lists.

Recursion requires a base case, and in merge sort, the base case is a list size of 1. In this case, the list is already sorted. At all other levels of recursion, we have to merge the sorted sublists. Merging the lists requires us to move through each list, placing each item into a temporary location. At each level where a merge operation takes place, a total of n operations are carried out. Since we are halving the lists at each level of recursion, however, there are only log(n) levels. To confirm this, let us consider a list of 8 items:

  • First level: Divide the list into two lists of four.
  • Second level: Divide the two lists of four into four lists of two.
  • Third level: Divide the lists in half – we have reached the base case. Merge together the lists of one. There are eight lists of one to merge into four lists of two, so there are 8*1 = 8 operations.
  • Second level: Merge the four lists of two into two lists of four. There are four lists of two, so there are 4*2 = 8 operations.
  • First level: Merge the two lists of four into a single list. There are two lists of four, so there are 8 operations.

For a list of 8 items, there are 4 levels of recursion, but the fourth level is the base case, so there are 3 levels at which a merge is performed. Doubling the size of the list would add another level of recursion. Thus, the number of levels is log(n). Multiply that by the number of operations at each level and we get n log(n) operations.
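
As a rough check on that arithmetic, here is a small sketch (not part of the sorting code) that tallies the merge work level by level for a list of 8 items:

n = 8
total = 0
size = 1
while size < n:
    # Each merging level combines sublists of length 'size' pairwise,
    # touching all n items once
    total += n
    size *= 2
print(total)   # prints 24, i.e. n * log2(n) = 8 * 3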

This is an improvement over the selection sort, so it may be worth implementing. The pseudocode for the merge sort function is relatively easy to write:

mergesort(mylist,low,high):
   if low < high:
      mergesort(mylist,low,low+(high-low)/2)
      mergesort(mylist,low+((high-low)/2)+1,high)
      merge lists into temporary location
      copy merged list into mylist

As we apply the merge sort function recursively, eventually low equals high, ending the recursion. The actual Python function is slightly harder to code, but not by much:

def mergeSort(L,low=0,high=-1):
    # Implementation of the merge sort algorithm
    # If high has default value, set it to len(L)-1
    # (index of last item on the list)
    if high == -1:
        high = len(L)-1
    if low < high:
        # Apply mergeSort on each half of the list
        mergeSort(L,low,int((low+(high-low)/2)))
        mergeSort(L,low+int(((high-low)/2))+1,high)
        # Now, merge the lists in temporary location
        i = low
        j = low+int(((high-low)/2))+1
        lb = j
        k = 0
        temp = []
        while k <= high-low:
            if i < lb and j <= high and L[i] > L[j]:
                temp.append(L[j])
                j = j + 1
            elif i < lb and j <= high and L[i] <= L[j]:
                temp.append(L[i])
                i = i + 1
            elif j > high:
                temp.append(L[i])
                i = i + 1
            elif i >= lb:
                temp.append(L[j])
                j = j + 1
            k = k + 1
        # Copy the list to its original location
        j = 0
        for i in range(low,high+1):
            L[i] = temp[j]
            j = j + 1

Our mergeSort function takes three arguments: the list itself and the lower and upper indices, indicating which subset of the list to sort. I wanted to be able to call the function with just one parameter (the list) when sorting the entire list, so I added default values for low and high. The default value for low should obviously be 0, but the right value for high depends on the size of the list. Since we can’t use len(L)-1 as the default value for high (L is itself a formal parameter), I used -1, which is obviously out of bounds; if high is -1, we set it to len(L)-1. It’s a kludge, but it serves its purpose.

Next, we check to see if low is less than high. If they are equal, we have reached a base case. If not, then we call the merge sort function recursively on each half of the list. When we are done, we merge the two lists. Merging the lists entails the following:

  1. Create a temporary list to store the merged list.
  2. Create and initialize indices for both lists (i and j in our example).
  3. Compare L[i] to L[j]. Place the smaller item into the temporary list.
  4. Increment the index for the list whose item was put in the temporary list.
  5. If neither list has been exhausted, go back to [3].
  6. Place any remaining list items in the temporary list.
  7. Copy the temporary list into L.
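
Testing the function in the interactive interpreter on a sample list gives the expected result:

>>> mylist = [123,79,29,997,519,17,239,144,51,372]
>>> mergeSort(mylist)
>>> mylist
[17, 29, 51, 79, 123, 144, 239, 372, 519, 997]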

One of the drawbacks of merge sort is that you have to allocate temporary space to initially store the merged lists. One possible alternative is a binary heap, which we will examine in the next article.

The merge sort source code is available here.

External Links:

Sorting algorithm at Wikipedia

Merge sort at Wikipedia