Merge Sort: A Python Implementation

Merge sortIn the previous article, we introduced the concept of modules and functions, and even created a simple Python program to test for palindromes. In this article, we will continue our look at recursion and put some of the concepts introduced in previous articles into practice by coding the merge sort algorithm in Python.

For this article, we need to introduce the concept of a sorting algorithm. A common problem in computer science is the need to sort a list of items. A sorting algorithm is an algorithm that puts elements of a list into order. The most-used orders are numerical order and lexicographical order. But sorting a list efficiently has been problematic. Early search algorithms increased exponentially with the size of the list. They were on an order of n squared; that is, if the number of list items doubled, the number of list operations increased fourfold; if the list size quadrupled, the number of list operations increased sixteen-fold, and so on.

One of the simplest sort algorithms consists of simply going through a list of unsorted items sequentially, comparing the item to all remaining items, and swapping items that are out of place. We start at the first item, and compare the item to all other items, swapping the first item with the item to which we are comparing it when necessary. Once we have reached the end of the list, we know the first item of the list has the lowest value of any item on the list, and we can move on to the second item. We keep going, updating the list pointer, until we reach the end of the list. This is called a selection sort, and it is simple to implement:

def selectionSort(L):
	'''Implement a simple selection sort:
        (Not as effecient as merge sort)
	Compare each element with each subsequent
	item on the list; if they are out of place,
	swap them; keep going until we rech the end 
	of the list'''
	for i in range(0,len(L)-1):
		for j in range(i+1,len(L)):
			# If items out of place, switch
			if L[i] > L[j]:
				temp = L[i]
				L[i] = L[j]
				L[j] = temp

In this function, we have two for loops, one nested inside the other one. If the lower-indexed list member is greater than the higher-indexed one, we swap them. Although this will work, you can see the problem with selection sorts. For a list of n items, for each item, we must carry out as many comparisons as the current list position minus one. For the first list item, we have n-1 comparisons; for the second list item, n-2, and so on. There is an average of n/2 comparisons for each item, and we make n-1 passes through the list. That leaves us with (n^2-n)/2 operations. Dropping out lower terms, this means the bubble sort is on the order of O(n^2). This is far from ideal, and generally bubble sorts are only acceptable for relatively small lists.

Merge Sort: An Efficient Sorting Algorithm

One possible alternative is the merge sort algorithm. Merge sort works as follows:

  1. Divide the list into two equally-sized sublists.
  2. Apply the merge sort algorithm recursively on the sublists.
  3. Merge the two sorted lists.

Recursion requires a base case, and in merge sort, the base case is a list size of 1. In this case, the list is already sorted. On all other levels of recursion we have to merge the sorted lists. Merging the lists requires us to move through each list, placing each item into a temporary location. At each level where there is a merge operation taking place, a total of n operations are carried out. Since we are halfing the lists at each level of recursion, however, there are only log(n) levels. To confirm this is the case, let us consider the case where there are 8 items on the list:

  • First level: Divide the list into two lists of four.
  • Second level: Divide the two lists of four into four lists of two.
  • Third level: Divide the lists in half – we have reached the base case. Merge together the lists of one. There are eight lists of one to merge into four lists of two, so there are 8*1 = 8 operations.
  • Second level: Merge the four lists of two into two lists of four. There are four lists of two, so there are 4*2 = 8 operations.
  • First level: Merge the two lists of four into a single list. There are two lists of four, so there are 8 operations.

For a list of 8 items, there are 4 levels of recursion, but the fourth level is the base case, so there are 3 levels at which a merge is performed. Doubling the size of the list would add another level of recursion. Thus, the number of levels is log(n). Multiply that by the number of operations at each level and we get n log(n) operations.

This is an improvement over the selection sort, so it may be worth implementing. The pseudocode for the merge sort function is relatively easy to write:

mergesort(mylist,low,high):
if low < high
mergesort(mylist,low,low+(high-low)/2)
mergesort(mylist,low+((high-low)/2)+1,high)
merge lists into temporary location
copy merged list into mylist

As we apply the merge sort function recursively, eventually low equals high, ending the recursion. The actual Python function is slightly harder to code, but not by much:

def mergeSort(L,low=int(0),high=int(-1)):
    # Implementation of the merge sort algorithm
    # If high has default value, set it to len(L)-1
    # (index of last item on the list)
    if high == -1:
        high = len(L)-1
    if low < high:
	# Apply mergeSort on each half of the list
        mergeSort(L,low,int((low+(high-low)/2)))
        mergeSort(L,low+int(((high-low)/2))+1,high)
	# Now, merge the lists in temporary location
        i = low
        j = low+int(((high-low)/2))+1
        lb = j
        k = 0
        temp = []
        while k <= high-low:
            if i < lb and j <= high and L[i] > L[j]:
                temp.append(L[j])
                j = j + 1
            elif i < lb and j <= high and L[i] <= L [j]:  
                temp.append(L[i])                       
                i = i + 1             
            elif j > high:
                temp.append(L[i])
                i = i + 1
            elif i >= lb:
                temp.append(L[j])
                j = j + 1
            k = k + 1
        # Copy the list to its original location
        j = 0
        for i in range(low,high+1):
            L[i] = temp[j]
            j = j + 1

Our mergeSort function takes three arguments: the list itself and the lower and upper indices, indicating which subset of the list to sort. I wanted to be able to call the function with just one parameter (the list) in cases where I wanted to sort the entire list, so I added default values for low and high: the default value for low should be 0 obviously, but the value for high depends on the size of the list. Since we can’t use len(L)-1 as the default value for high (since L is itself a formal parameter), I used -1 as the default value for high since it is obviously out of bounds. If high is -1, we set it to len(L)-1. It’s a kludge, but it serves its purpose.

Next, we check to see if low is less than high. If they are equal, we have reached a base case. If not, then we call the merge sort function recursively on each half of the list. When we are done, we merge the two lists. Merging the lists entails the following:

  1. Create a temporary list to store the merged list.
  2. Create and initialize indices for both lists (i and j in our example).
  3. Compare L[i] to L[j]. Place the smaller item into the temporary list.
  4. Increment the index for the list whose item was put in the temporary list.
  5. If the end of either list has not been reached, go back to [3].
  6. Place any remaining list items in the temporary list.
  7. Copy the temporary list into L.

One of the drawbacks of merge sort is that you have to allocate temporary space to initially store the merged lists. One possible alternative is a binary heap, which we will examine in the next article.

The merge sort source code is available here.

External Links:

Sorting algorithm at Wikipedia

Merge sort at Wikipedia

Python Modules; Introduction to Recursion

  Python moduleIf you quit the Python interpreter and enter it again without saving your program to a text file, the definitions you have made will be lost. Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the input for the interpreter and running it with that file as input instead. In doing so, we create scripts. As your program gets longer, you may also want to split it into several files for easier maintenance.

Python Modules

To support this, Python provides a way of putting definitions into a file and using them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a Python module can be imported into other modules or into the main Python modules.

A Python module is a file containing Python definitions and statements; such a file always ends with the suffix .py. Up to this point, we have not introduced Python definitions. A definition in Python always starts with def and is followed by its name. This is the equivalent of a function in C or Pascal. For example:

def hello():
        print(‘Hello, world!’)

is a very simple Python function to print out “Hello, world!”. If this code is saved in a file called hello.py, and is in Python’s path, you can import it at the Python command line:

>>> import hello

Once the Python module is imported, you can invoke the hello function with:

>>> hello.hello()

As with C and other languages, you can specify parameters. For example:

def isZero(a):
    if a == 0:
       print('a is zero')
    else:
       print('a is a non-zero number')

You can also return a value from the function:

def isZero(a):
     if a == 0:
        return True
     else:
        return False

As in other languages such as C/C++, we can specify default parameters. For example:

def isZero(a=0):

allows us to invoke the function isZero with no arguments; the interpreter will insert a value of 0 for a if no arguments are specified. However, a non-default argument cannot follow a default argument. Thus:

def isZero(a=0,b):

is not allowed, but:

def isZero(a,b=0):

is allowed.

This is a decent start, but it would be nice if we came up with a program that can do something useful.

Introduction to Recursion

In computer science, recursion is a method where the solution to a problem depends on solutions to smaller instances of the same problem. Recursion always involves the existence of one more more base cases in which an operation can be done directly on the input data. The other cases involve invoking the same function on a subset of the input data in a divide-and-conquer strategy. The approach can be applied to many types of problems – for example, finding palindromes. It shouldn’t be too difficult to come up with a Python module to solve this problem.

A palindrome is a word, phrase, number, or other sequence of symbols or elements that read the same forward or reversed: for example, “Race car”, or “A man, a plan, a canal – Panama”. A solution of the problem for finding a palindrome using recursion can be outlined as follows:

  1. If the input string length is 0 or 1, then we have a palindrome – return true
  2. If the input string length is greater than 1 but the first and last character match, apply the test recursively to the string minus the first and last characters
  3. If [1] and [2] don’t apply, then we don’t have a palindrome – return false

Successive applications of this process on the input string will eventually yield either a mismatch between the first and last character or an input string of length 0 or 1, and the test will be complete. We can code this algorithm in Python as follows:

def isPalindrome(myString):
  ''' Simple program to find palindromes, part of our Python module
  Parameters: myString => string to perform test on
  If length <= 1, it's a palindrome - return True
  If first char is the same as the last char, apply algorithm recursively
  Otherwise return False '''
  if len(myString) <= 1:
     return True
   elif myString[0].lower() == myString[len(myString)-1].lower():
     return isPalindrome(myString[1:(len(myString)-1)])
  return False
 

Our isPalindrome function takes in a single argument, myString. We introduced two hitherto unseen functions here. len() takes a single parameter – a string – and returns the length. lower() is a member of the string class and converts the string into lowercase. This ensures that our test is not case-sensitive. You may have noticed that there is one shortcoming of this algorithm: if there are spaces or any other alphanumeric content in the string, it will return false. I decided it would be easier to write a separate function to strip the non-alphanumeric characters out:

 def convertToAlphaNum(myString):
  ''' Iterate through the string and generate an output string with only the alphanumeric chars (also part of the palindrome.py Python module)
  Parameters: myString => the string to convert
  Returns: A copy of the string with all non-alphanumeric
  characters removed '''
  retval = ''
  for c in myString:
      if c.isalpha() or c.isdigit():
          retval += c
  return retval

All this function does is iterate through the input string and if a character is a letter or digit, it gets added to a new string. When it is done, the function returns the new string. We still need a function to read input from the user and call these functions, so that will be our next bit of code, and the last function in our Python module:

def testPalindrome():
 ''' Part of the palindrome. py Python module - Prompt user for string input
 Output whether it is a palindrome or not '''
 myString = str(input('Enter a string: '))
 if isPalindrome(convertToAlphaNum(myString)):
    print(myString,'is a palindrome')
 else:
    print(myString,'is not a palindrome')

This function simply prompts the user to input a string and uses the previous functions to determine whether or not it is a palindrome, and prints the results.

Once these functions are saved to a file, you can load the file into IDLE using the File -> Open menu option (or CTRL-O). The Python module will load into a separate window; from that window, select Run -> Run Module (or press F5). Then from the main IDLE window, you can run testPalindrome():

>>> testPalindrome()
Enter a string: sample text
sample text is not a palindrome
>>> testPalindrome()
Enter a string: A man, a plan, a canal – Panama
A man, a plan, a canal – Panama is a palindrome
>>> testPalindrome()
Enter a string: No ‘x’ in Nixon
No ‘x’ in Nixon is a palindrome
>>> testPalindrome()
Enter a string: No ‘x’ in Ford
No ‘x’ in Ford is not a palindrome
>>> testPalindrome()
Enter a string: Able I was, ere saw I Elba
Able I was, ere saw I Elba is a palindrome

The source code for for these functions is available as a single Python module, via this link.

It’s not the most elegant solution, but it does seem to work. In the next article, we will continue our look at programming in Python, including a second look at using recursion to solve problems.

External Links:

Python documentation from the official Python website

Python Programming: Part Three (Conditional Statements; Lists, Tuples and Dictionaries)

Python listIn the previous article, we covered variables, and how to save and run Python modules. In this article, we will introduce some more basic concepts including statements that alter the control flow of Python, and lists, tuples and dictionaries.

Up to this point, we have been executing Python statements using a linear control flow. A programming language is of minimal value, however, without conditional statements. Python provides three such statements: if, elif, and else. For example:

>>> if x < 0:
                print(‘x is a negative number’)

In this example, we used the comparison operator “<” (less than) to test to see if x is less than 0. If x is less than zero, we will print a statement that tells us x is a negative number. We might also want to print something out if x is zero or positive:

>>> if x < 0:
               print(‘x is a negative number’)
         elif x == 0:
                print(‘x equals zero’)
         else:
                print(‘x is a positive number’)

Here, the first two lines are the same, but if x is not less than zero, we test it to see if it is equal to zero and if it is, we print a message. If x is not less than zero or zero, we assume it is positive and print another message. Another useful control flow statement is the for statement. The for statement in Python differs a bit from the for statement in C and Pascal. Rather than always iterating over an arithmetic progression of numbers, as in Pascal, or giving the user the ability to define both the iteration step and halting condition as in C, Python’s for statement iterates over the items of any sequence in the order that they appear in the sequence. For example:

>>> word = ‘coffee’
>>> for c in word:
                 print(c)

This will print every letter in the string word on a separate line. If you need to iterate over a sequence of numbers, the built in function range() comes in handy. It generates arithmetic progressions:

>>> for i in range(10):
                 print(i,’ ‘,i**2)

This code will print out two columns: one containing integers 0 through 9, the other containing their squares. If you specify only a single parameter, 0 will be the lower bound and the specified parameter will be the upper bound. But you can specify a lower bound:

>>> for i in range(5,10):
                print(i,’ ‘,i**2)

This code will print out integers 5 through 9 and their squares. We can also specify a step value:

>>> for i in range(1,10,2):
                 print(i,’ ‘,i**2)

This code will print out all odd-numbered integers from 1 through 9 and their squares.

Lists, Tuples and Dictionaries

In C, there are arrays, which act as a compound data type. In Python, there are two sequence types not yet covered: lists and tuples. There are also dictionaries, which is not a sequence type, but is nonetheless very useful.

Lists are collections of some data type. Creating a list can be done quite easily:

>>> mylist = [5, 10, 15]

This will create a list with three items in it. We can iterate through the list as well:

>>> for i in mylist:
                 print(i)

This will print each item in mylist on a separate line. You can use the assignment operator on a list:

>>> otherlist = mylist

Now we will be able to access the list, but it is important to note that otherlist is not a separate copy of mylist. It also points to mylist. So this statement:

>>> otherlist[0] = 100

will alter the contents of mylist:

>>> print(mylist)
[100, 10, 15]

You can also nest a list within a list. For example:

>>> secondlist = [1, 2, 3, 4]
>>> mylist[0] = secondlist

nests secondlist inside mylist as the first element.

You can also create an empty list, by specifying an empty set of brackets in the assignment, like this:

mylist = []

Lists have a built-in method called append that can be used to append an item to the end of a list:

>>> mylist.append(125)

appends 125 to the end of the list.

Whereas a list is a mutable sequence data type, a tuple is an immutable sequence data type. A tuple consists of a number of values separated by commas. For example:

>>> tu = 1, 2, 3

A tuple need not contain all the same data type. The following statement is valid:

>>> tu = 5, 10, 15, ‘twenty’

To confirm that tuples are immutable, we can try to change one of the items:

>>> tu[0] = 121
Traceback (most recent call last):
File “”, line 1, in
TypeError: ‘tuple’ object does not support item assignment

On output, tuples are always enclosed in parentheses, so that nested tuples are interpreted correctly; they may be input with or without surrounding parenthesis, although often parenthesis are necessary anyway if the tuple is part of a larger expression. It is not possible to make an assignment to an individual item of a tuple; however, it is possible to create tuples which contain mutable objects, such as lists.

Another useful data type built into Python is the dictionary. Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type. Strings and numbers can always be keys. Tuples can be used as keys if they contain only strings, numbers, or tuples. If a tuple contains any mutable object either directly or indirectly, it cannot be used as a key. You cannot use lists as keys, since lists can be modified.

You can think of dictionaries as unordered sets of key: value pairs, with the requirement that the keys are unique within one dictionary. A pair of braces creates an empty dictionary. Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary.

Here is an example of a dictionary:

>>> generals = { ‘Germany’: ‘Hindenburg’, ‘Russia’: ‘Samsonov’, ‘France’: ‘Nivelle’, ‘United Kingdom’: ‘Haig’ }
>>> print(generals[‘Germany’])
‘Hindenburg’
>>> del generals[‘Russia’]
>>> generals[‘France’] = ‘Foch’
>>> print(generals)
{‘Germany’: ‘Hindenburg’, ‘France’: ‘Foch’, ‘United Kingdom’: ‘Haig’}

You can use for to iterate through a dictionary. For example:

>>> for c in generals.keys()
print(generals[c])

Hindenburg
Foch
Haig

In the next article, we will introduce the concept of modules, and write our first function.

External Links:

Python documentation from the official Python website

Python Programming: Part Two (Python IDLE)

Python IDLEIn the previous article, we introduced some basic concepts about Python and how to use Python at the interactive command line. In this article, we will introduce variables, and also consider how to save a code module and run it, both from the command line and the Python IDLE interface.

The previous article contained the following Python statement:
>>>> a = 3+4

This was our first example of a Python variable. As you may have deduced, in Python, unlike C/C++ or some other languages, you don’t have to declare variables separately. You declare a variable when you use it for the first time. There are three distinct numeric types: integers, floating point numbers, and complex numbers. In addition, Booleans are a subtype of integers.

Numbers are created by numeric literals or as the result of built-in functions and operators. Unadorned integer literals yield integers. For example:
>>> x = 10

yields an integer variable x. Numeric literals containing a decimal point or an exponent sign yield floating point numbers. For example:
>>> x = 3.5

yields a floating point variable x.

You can also create a complex number. Appending ‘j’ or ‘J’ to a numeric literal yields an imaginary number to which you can add an integer or float to get a complex number with real and imaginary parts. For example:
>>> x = 10j + 5

creates a complex number with 5 as the real part and 10 as the imaginary part. You can retrieve both parts by typing:
>>> print(x)

or just the real or imaginary parts:
>>> print(x.real)
>>> print(x.imag)

The variables real and imag, however, are read-only variables and cannot be used to change the values of the real and imaginary components. The constructors int(), float() and complex() can be used to produce numbers of a specific type. For example:
>>> x = int(23)

creates an integer with a value of 23. Interestingly, leaving the parameter blank like this:
>>> x = int()

results in the value 0 being assigned to x.

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points. String literals can be written in several different ways – single quoted, double quoted, or triple quoted:
>>> text = ‘This is a test’
>>> text = “This is a test”
>>> text = ”’This is a test”’
>>> text = “””This is a test”””

Strings are immutable sequences of Unicode code points. Therefore:
>>> text[0] = ‘B’

is not allowed. However, if we want to print a single character from the string, we can do this:
>>> print(text[0])

or, if we want to print multiple characters that represent a subset of the string, we can specify two indices separater by a colon. For example:
>>> print(text[0:3])

will result in ‘This‘ being printed.

So far, we have been typing in programs at the interactive prompt, which has one big disadvantage: programs you type there go away as soon as the Python interpreter executes them. Because the code typed in interactively is never stored in a file, you cannot run it again without retyping it. To save programs permanently, you need to write your code in files, which are usually called modules. Modules are text files containing Python statements. Once a module is coded and saved in a file, you can ask the Python interpreter to execute the module any number of times.

To code a module, open your favorite text editor (e.g. Notepad in Windows, perhaps vi, emacs or gedit in Linux) and type some statements into a new text file called module01.py:

# My first Python script
name = str(input(‘Please enter your name: ‘)) # Read some input
print(‘Welcome to Python, ‘+name) # Print something out

In this module, we introduced three new concepts. The first line of the module is a comment. A hashtag (#) introduces a comment, which is just a description of what the program is doing. Comments can appear on a line by themselves or to the right of a statement. Multiline comments can be added, but must begin and end with three single quote marks, like this:
”’This is an example
 of a multi-line comment that can be inserted into a Python script on the command line or in the Python IDLE interface”’

The other concept we introduced was the input function. input() simply reads characters from the standard input device (in this case, a keyboard). We used the str() constructor to ensure that name is created as a string.

The third concept we introduced was the plus sign (+) to concatenate the literal string ‘Welcome to Python, ‘ and name. As you probably guessed, the program will read in a string from the keyboard representing the user’s name, and will print out a welcome message containing the user’s name.

Once you have saved this text file, you can ask Python to run it by listing its full filename as the first argument to a python command, typed at the system command prompt; e.g.:
> python module01.py

Using the Python IDLE Interface to Run a Module

Python IDLE

Python IDLE interface running our simple script.

Alternatively, from the Python IDLE interface, you can navigate to File -> Open, and then browse to module01.py and open it. A new window should appear with the code for the module in it. Navigate to Run -> Module (or in Windows, just press F5) to run the module, and the output should appear in the main IDLE window, after it prints the “RESTART” banner message:

Please enter your name: Grumbledook
Welcome to Python, Grumbledook

There are other ways we can load a module. We could use the import command, assuming the module is within Python’s path:

>>> import module01

If we make changes to the code, we would have to use the reload command in order for them to take effect:

>>> from imp import reload
>>> reload(module01)

Finally, we could use the exec() and open.read() commands in conjunction to simultaneously load and run the module:

>>> exec(open(‘module01.py’).read())

These three options should work both at the command line and in the Python IDLE interface.

In the next article, we will introduce some additional concepts and also code our first Python function.

External Links:

Wikipedia article on Python IDLE interface

Python IDLE website

Python IDLE wiki

How to Install Python IDLE on Linux

How to Install Python IDLE on eHow

How to Install Python

Install Python

Running the Python command line.

If you want to run a Python script and you don’t have Python on your system already, fear not. There are a number of options available to install Python, covering a wide variety of different platforms.

How to Install Python: From python.org

Probably the easiest way to install Python is to go to the official Python site’s download page (http://www.python.org/downloads/). Select the release version you want to download, and select the appropriate operating system. The source is also available as a gzipped tarball. If you are running Linux, you may already have Python installed. Ubuntu, for example, includes Python by default. If not, you should be able to install Python from the repositories by typing:

sudo apt-get install python

When prompted, type the superuser password, and installation will begin.

Install Python

IDLE, the official Python GUI.

Once Python is installed, you should be able to invoke the Python interpreter at the command line by typing “python“. Alternatively, you can create a shortcut for Python on your desktop and click on the icon. Clicking on “pythonw” should start IDLE, the Python GUI. You may, however, have problems running IDLE on certain versions of Windows. I myself could not start pythonw under Windows 8.1. If you have the same problem, you might consider using an alternative method for starting IDLE:

  1. Start the Python command line. You can do this [1] by launching a command prompt and typing “python“; [2] by double-clicking “python.exe” in the File Explorer; [3] by right-mouse clicking on the Windows logo in the lower left corner of the desktop, selecting “Run” and launching python.exe from “Run“, or [4] creating a desktop icon for Python and clicking on it.
  2. In the Python interpreter, type “import idlelib.idle” and press <ENTER>. IDLE should now launch.

Whether you are using the basic Python command interpreter or IDLE, you can start typing in Python code. Typing and executing the Python Hello World program is almost trivial:

>>> print(‘Hello, world!’)
Hello, world!

As you can see, the Python interpreter runs the program without the need to compile the code, one of the advantages of using Python. Using IDLE provides the following advantages over the standard Python shell:

  • The ability to save and load Python source code files
  • The ability to cut, copy and paste text, and search
  • The ability to debug programs, using IDLE’s debugger and stack viewer

How to Install Python: The Eclipse IDE

Install Python

Using the PyDev plugin in the Eclipse IDE to write and run a Python script.

Many users will likely find the stock Python installation more than adequate. For those looking for a more elegant solution, however, I suggest the Eclipse IDE. Eclipse is an integrated development environment originally developed by IBM Canada, and currently being developed by the Eclipse Foundation. It contains a base workspace and an extensible plug-in system for customizing the environment. The base system is only useful for Java development, but with plug-ins, you can extend Eclipse to develop applications in other programming languages, including C/C++, COBOL, JavaScript, Perl, PHP – and Python.

The official Eclipse website can be found at eclipse.org; here you can download the latest version. As of this writing, the current version is 4.4, and there are versions for Windows (32 and 64-bit), Linux and Mac OS X. Under Linux, you can also install Eclipse from the repositories by typing:

sudo apt-get install eclipse

and typing the superuser password when prompted.

Once Eclipse is installed, you still need to install the Python plugin. I use the PyDev Python IDE for Eclipse, which you can find at: http://marketplace.eclipse.org/content/pydev-python-ide-eclipse. Some of the features of PyDev include:

  • Python, Jython and IronPython support
  • Django integration
  • Code completion
  • A debugger
  • Code analysis

To install PyDev, launch Eclipse, and from the top menu, navigate to Help -> Install New Software to launch the Install New Software dialog box. In the “Work with: ” edit box at the top, type:

PyDev and PyDev Extensions – http://pydev.org/updates

and press the “Add” button. “PyDev” and “PyDev Mylyn Integration (optional)” should appear in the list box. Check the software packages you want to install (PyDev is required if you want to use PyDev to use Python with Eclipse; Mylyn is an optional application lifecycle management framework that serves as a task management tool for Eclipse). When you have made your selection, press the “Next” button; it should take a minute for Eclipse to calculate file sizes and dependencies. Review the information on the next page and press “Next” again. On the subsequent page, click on the radio button to agree to the software license terms and press “Finish“. PyDev should now be installed.

Now, you should be able to start a Python project in Eclipse by navigating to File -> New -> PyDev Project. You need to give your project a name, and select a grammar version (default is 2.7), but it should be easy to create a project. Once you have created a project, you can right-mouse click on the project, and select New -> File to create a new Python source code file that will be added to the project.

Although I covered how to install Python using the stock Python installation as well as Eclipse in this article, I barely scratched the surface in terms of what is available for Python development. I prefer Eclipse, but it may behoove you to do your own research and find the tools which best fit your needs.

External Links:

The official Python site

The official Eclipse site

Are Python Scripts Programs?

Python scriptWe have often heard Python referred to as a scripting language; in fact, often the terms “Python program” and “Python script” used interchangeably. So is Python a true scripting language?

Python Script: What a Scripting Language Is

In order to answer this question, we must first determine what a scripting language is. A scripting language is a language that incorporates support for scripts, and by scripts we mean programs written for a run-time environment (as opposed to programs that are compiled) and that automate the execution of tasks that could otherwise be executed one-by-one by a human operator. Examples of environments that incorporate support for scripting include software applications (programs like Excel and mIRC come to mind), web pages within a web browser, the shells of operating systems, and embedded systems. A scripting language can be viewed as a domain-specific language for a specific environment. Scripting languages can also be viewed as very high-level programming language (as opposed to a low-level programming language such as Assembly), as they operate at a high level of abstraction, or as control languages, particularly for job control languages on mainframes. Thus, we have at least two somewhat distinct ideas of what a scripting language is: [1] a run-time language that only works in a specific environment, such as a software application, or [2] a “glue” or control layer used to control and direct other application components. These categories are not mutually exclusive, but hopefully, the distinction will serve its purpose. To cover the first case, Python programs can indeed serve the role of a domain-specific language, and therefore such a program could be called a Python script. Python programs can be launched from console command lines and perform such tasks as processing text files and launching other programs, which is what we would expect a shell script to be able to do. But this is just one of dozens of common Python application domains, and Python is more than just a shell-script language. To cover the second case, Python programs are indeed often used as the control layer. To test hardware devices, Python programs may call out to components that give low-level access to a device; thus, changes made to the low-level drivers need not affect the control layer. In addition, programs may run Python code at certain points to support end-user product customization. This way, software can be tailored to a customer’s needs without having to recompile and ship the entire system’s source code. However, this is just a common Python role and a subset of what Python can do. Although we could call such code a Python script, Python is more than just a control language. Another sense in which we think of a scripting language is as a simple language for quickly coding tasks. In this sense, perhaps we can think of Python as a scripting language and a program as a Python script. It allows much faster program development than compiled languages like C/C++. But again, Python is not just for simple tasks, and programs can scale up in sophistication as required. That brings us back to the question posed in the beginning of this article: is Python a scripting language? Clearly it can be used as a scripting language, and it supports a rapid and flexible mode of development, but it is much more than just a scripting language. In future articles, I will explore some of the basics of writing a Python script, while also unleashing some of the more powerful functionality of this programming language.

External Links:

Python documentation at python.org