Classes and Inheritance: Part Two

classesIn the previous article, we introduced Python classes, discussed some of their features, and some of the similarities and differences from C++ classes. In this article, we will use our knowledge of C++ classes to rewrite our hash functions as a Python class.

Writing the HashObject Class

Our first job is to write the class constructor. Since it makes sense to initialize our list when the class is first instantiated, I decided to incorporate the createBuckets() function as part of the constructor:

class HashObject(object):
    def __init__(self, num):
        self.numBuckets = num
        self.hashSet = []
        for i in range(num):
            self.hashSet.append([])

This constructor takes one parameter (num), and assigns it to numBuckets, and then initializes our list.

Next, we want to define hashElement, which takes a number and hashes it to an index in hashSet. This function is the same as before, except that we do not have to pass the number of buckets to it:

def hashElement(self, elem):
        if type(elem) == int:
            return elem%self.numBuckets

The new insert function requires only one parameter: the list item to be inserted into one of the buckets:

def insert(self, i):
        self.hashSet[self.hashElement(i[0])].append(i)

Note that it is assumed that the first element of the list to be stored is assumed to be the key. Next, we need a remove function. We will take one parameter: the key value of the item to be deleted:

def remove(self, key):
        newElement = []
        for j in self.hashSet[self.hashElement(key)]:
            if j[0] != key:
                newElement.append(j)
        self.hashSet[self.hashElement(key)] = newElement    

Finally, we need to write the function to check to see if an item with a certain key exists:

def member(self, key):
        return i in self.hashSet[hashElement(key)]

One additional member function we may want to define is an accessor function for hashSet so we do not have to directly access the list, a good practice, since data hiding is one of the objectives of object-oriented programming:

def getElement(self, key):
        for j in self.hashSet[self.hashElement(key)]:
            if j[0] == key:
                return j

As you can see, we iterate through each list item in the bucket the key value hashes to (multiple items may be stored in a single bucket). When we find an item whose first value equals the key, we return that item.

With variables and member functions for HashObject defined, we can rewrite hashTest to test our class:

def hashTest():
    h = HashObject(51)
    h.insert([2175,'Homer Simpson','Technician',20])
    print(h.getElement(2175))
    h.insert([4158,'Elmer Higgins','Engineer',17])
    print(h.getElement(4158))
    h.insert([2583,'Waylon Smithers','Assistant',25])
    print(h.getElement(2583))
    h.remove(2175)
    print(h.getElement(2175))
    print(h.getElement(2583))

Running this function results in the following output:

[2175, ‘Homer Simpson’, ‘Technician’, 20]
[4158, ‘Elmer Higgins’, ‘Engineer’, 17]
[2583, ‘Waylon Smithers’, ‘Assistant’, 25]
None
[2583, ‘Waylon Smithers’, ‘Assistant’, 25]

In this function, we first create a hash set with 51 buckets. We insert the first record (Homer Simpson’s), and print it out, confirming its successful insertion. We insert a second record, also printing it out to confirm its insertion. Then we enter a third item (which, incidentally, hashes to the same location as the first item), and print it out, confirming its insertion and also confirming that we can successfully iterate through a bucket. Finally, we remove the first record and try to print it out. Since the getElement function will not return a value for a nonexistent record, the print call prints out “None”, for no value. Finally, we successfully print out the third record.

I have made HashObject and HashTest available for download here, so you can download it, run it and modify it to your heart’s content.

External Links:

Classes Tutorial on python.org

Class (computer programming) on Wikipedia

Classes and Inheritance: Part One

classesIn the previous article, we introduced some hashing functions, but had to pass the number of buckets (numBuckets) to each function, which was not an ideal solution. In this article, we introduce a programming construct that will provide a better way of working with these hashing functions: classes.

Classes Defined

A class is an extensible program-code-template for creating objects, providing initial values for member variables and implementations of functions. When an object is created by a constructor of the class, the resulting object is called an instance of the class, and the member variables specific to the object are called instance variables (unlike class variables, which are variables that are shared across the class).

Classes are essentially combinations of variables and functions. We use the word “encapsulation” to refer to this combination. The variables either belong to the class or specific instances of the class. The functions are subroutines with the ability to operate on objects or classes. Many kinds of functions exist. In most programming language, there are constructors (functions automatically called when an instance of the class is created), destructors (functions automatically called when an instance of the class is deleted or otherwise destroyed), and conversion operators (functions that define how certain operators, e.g. the equality operator, work with instances of the class).

Classes: Implementation in Python

In Python, the simplest form of a class definition looks like this:

class HashObject:
	
	...
	

Class definition must precede any instantiation of the class to have any effect. The statements inside a class definition will usually be function definitions, but other statements are allowed. The function definitions inside a class normally have a peculiar form of argument list, dictated by the calling convention for functions.

Class objects support attribute references and instantiation. Attribute reference use the standard syntax used for all attribute references in Python: obj.name. If the class definition looked like this:

class HashObject:
	i = 123
	def myfunc(self)
		return 'A simple function definition'

then HashObject.i and HashObject.myfunc are valid attribute references. Class attribute can be assigned, so you can change the value of HashObject.i by assignment.

Class instantiation uses function notation; just pretend that the class object is a parameterless function that returns a new instance of the class. For example,

h = HashObject()

creates a new instance of the class HashObject and assigns this object to the local variable h.

The instantiation operation creates an empty object. Many classes like to create objects with instances customized to a specific initial state. Therefore, a class may define a special method, called a constructor. Here is an example:

def __init__(self):
	self.data = []

When a class defines an __init__() method, class instantiation automatically invokes __init__() for the newly-created class instance. In the case of HashObject, a new, initialized instance can be obtained by:

h = HashObject()

The __init__() method may have arguments for greater flexibility. In that case, arguments given to the class instantiation operator are passed on to init. For example:

class HashObject:
   def __init__(self, num):
      self.numBuckets = num

>>> h = HashObject(51)
>>> x.numBuckets
51

With instance objects, there are two kinds of valid attribute names, data attributes and methods. Data attributes correspond to data members in C++. Like local variables, they do not need to be declared and spring into existence when they first have a value assigned to them. For example, if we created an instance of HashObject called h, we can create and print out the value of x like so:
>>> h.x = 123
>>> print(‘h.x = ‘,h.x)
123
The other kind of instance attribute is a method. A method is a function that belongs to an object. Valid method names of an instance object depend on its class. By definition, all attributes of a class that are function objects define corresponding methods of its instances.

Usually, a method is called right after it is bound. Suppose we add a function definition to HashObject:

class HashObject:
    def test(self):
        print('Hello, world!')

We can call the method in object h like so:
>>> h.test()
Hello, world!

Notice that h.test() was called without an argument, even though the function definition for test() specifies an argument. This is where Python deviates from C++. In this example, the call h.test() is exactly equivalent to HashObject.test(h). In general calling a method with a list of n arguments is the same as calling the corresponding function with an argument list that is created by inserting the method’s object before the first argument.

Python classes also supports inheritance. The syntax for a derived class definition looks like this:

class DerivedClass(BaseClass):
	
	...
	

When the class object is constructed, the base class is remembered. If a requested attribute is not found in the class, the search proceeds to look in the base class. This rule is applied recursively if the base class itself is derived from some other class.

If we create an instance of DerivedClass, method references are resolved as follows: the corresponding class attribute is searched, descending down the chain of base classes if necessary, and the method reference is valid if this process yields a function object. Derived classes may also override methods of their base classes. Because methods have no special privileges when calling other methods of the same object, a method of a base class that calls another method defined in the same base class may end up calling a method of a derived class that overrides it. This is similar to virtual functions in C++, where a function is left undefined in the base class but defined in one or more derived classes.

Python also supports a form of multiple inheritance as well. A class definition with multiple base classes looks like this:

class DerivedClass(BaseClass1, BaseClass2, BaseClass3)
	
	...
	

The search for attributes is depth-first, left to right, so if an attribute is not found in the derived class, it is searched for in BaseClass1, then the base classes of BaseClass1, then BaseClass2, and so on. It should be mentioned that the multiple inheritance method resolution in Python uses a dynamic algorithm to linearize the search order, both preserving the search order, and ensuring that each parent is called only once.

One significant way in which Python classes differ from C++ classes is that there is no means of declaring functions or variables to be private or protected. Thus, there are no private variables in Python. The closest you’ll get are variables that start with a double underscore, like __variable. Such a variable cannot be directly referenced outside the class definition, at least not directly. Thus, if we have something like this:

class HashObject:
	def __init__(self, num):
		__numBuckets = num

and we create an instance of HashObject:

>>> h = HashObject(47)

We can’t access __numBuckets directly:

>>> print(h.__numBuckets)

will result in an error message.

Now that we have a solid foundation for understanding Python classes, we should be able to rewrite our hash functions from the previous article as a class. We will do that in the next article.

External Links:

Classes Tutorial on python.org

Class (computer programming) on Wikipedia