Python Strings: Part Six

In the previous article, we began our look at indexing and slicing. In this article, we will continue our look at slicing and show some practical applications of it.

In Python 2.3 and later, there is support for a third index, used as a step. The step is added to the index of each item extracted. The three-index form of a slice is X[I:J:K], which means “extract all the items in X, from offset I through J-1, by K.” The third limit, K, defaults to 1, which is why normally all items in a slice are extracted from left to right. But if you specify an explicit value, you can use the third limit to skip items or to reverse their order.

For instance, a[1:10:2] will fetch every other item in a from offsets 1-9; that is, it will collect the items at offsets 1, 3, 5, 7 and 9. As usual, the first and second limits default to 0 and the length of the sequence, respectively, so a[::2] gets every other item from the beginning to the end of the sequence:

>>> a = 'nowisthetimeto'
>>> a[1:10:2]
'oitei'
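As a small sketch of the same idea, the script below reuses the string from the session above and also tries the a[::2] form mentioned in the text. The comparison with range() is our own illustration of how the three limits select offsets for a positive step; it is not how Python implements slicing internally.

```python
a = 'nowisthetimeto'

# Both bounds omitted: walk the whole string two offsets at a time.
every_other = a[::2]     # offsets 0, 2, 4, ..., 12

# Same step, but starting at offset 1 instead of 0.
odd_offsets = a[1::2]    # offsets 1, 3, 5, ..., 13

# For in-range bounds and a positive step, a[I:J:K] picks the same
# offsets that range(I, J, K) produces.
by_range = ''.join(a[i] for i in range(1, 10, 2))

print(every_other)           # nwshtmt
print(odd_offsets)           # oiteieo
print(by_range == a[1:10:2]) # True
```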

You can also use a negative stride. For example, the slicing expression 'every'[::-1] returns the new string 'yreve'. The first two bounds are omitted, so the slice covers the whole sequence, and a stride of -1 indicates that the slice should go from right to left instead of the usual left to right. The effect is to reverse the sequence:

>>> a = 'every'
>>> a[::-1]
'yreve'
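One common use of this reversal is a palindrome test. The sketch below is only an illustration; is_palindrome is a hypothetical helper name of our own, not something from the standard library. It also shows that slicing returns a new string and leaves the original alone.

```python
def is_palindrome(text):
    """Hypothetical helper: True if text reads the same in both directions."""
    return text == text[::-1]

a = 'every'
b = a[::-1]   # a new, reversed string; a itself is unchanged

print(a, b)                    # every yreve
print(is_palindrome('level'))  # True
print(is_palindrome(a))        # False
```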

With a negative stride, the meanings of the first two bounds are essentially reversed. That is, the slice a[5:1:-1] fetches the items from 2 to 5, in reverse order (the result contains items from offsets 5, 4, 3, and 2):

>>> a = 'thequick'
>>> a[5:1:-1]
'iuqe'
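To make the reversed bounds a little more concrete, here is a short sketch that compares the negative-stride slice with a forward slice that is reversed afterwards. The variable names are ours and chosen only for illustration.

```python
a = 'thequick'

# Start at offset 5 and walk left, stopping before offset 1.
backward = a[5:1:-1]           # offsets 5, 4, 3, 2

# The same items, taken left to right and then reversed.
forward_then_reversed = a[2:6][::-1]

# With a negative stride, an omitted second bound runs to the start
# of the string, and an omitted first bound starts at the end.
to_the_start = a[5::-1]        # offsets 5 down to 0
from_the_end = a[:1:-1]        # offsets 7 down to 2

print(backward)                # iuqe
print(forward_then_reversed)   # iuqe
print(to_the_start)            # iuqeht
print(from_the_end)            # kciuqe
```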

Skipping and reversing like this are the most common use cases for three-limit slices, but see Python’s standard library manual for more details.

Slices have many applications. For example, argument words listed on a system command line are made available in the argv attribute of the built-in sys module:

#File command.py - echo command line args
import sys
print(sys.argv)

% python command.py -1 -2 -3
['command.py', '-1', '-2', '-3']

Usually, however, you’re only interested in inspecting the arguments that follow the program name. This leads to a typical application of slices: a single slice expression can be used to return all but the first item of a list. Here, sys.argv[1:] returns the desired list, ['-1', '-2', '-3']. You can then process this list without having to accommodate the program name at the front.
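The script below sketches this pattern. The function name get_args and the sample list are assumptions made for illustration, so that the slice can be tried without actually passing arguments on a command line.

```python
import sys

def get_args(argv):
    """Hypothetical helper: return every word after the program name."""
    return argv[1:]

# A simulated argument list, shaped like the sys.argv shown above.
sample = ['command.py', '-1', '-2', '-3']
args = get_args(sample)
print(args)                  # ['-1', '-2', '-3']

# Slicing never fails on a short list: with no arguments we get [].
print(get_args(['command.py']))

# In a real script, pass the actual command line instead.
print(get_args(sys.argv))
```

Because a slice past the end of a list simply returns an empty list, this works even when the script is run with no arguments at all, whereas indexing sys.argv[1] directly would raise an IndexError.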

External Links:

Strings at docs.python.org

Python Strings at Google for Developers

Python strings tutorial at afterhoursprogramming.com

Python Strings: Part Five

Because strings are defined as ordered collections of characters, we can access their components by position. In Python, characters in a string are fetched by indexing: providing the numeric offset of the desired component in square brackets after the string. When you specify an index, you get back a one-character string at the specified position.

Strings in Python are similar to strings in the C/C++ language in that Python offsets start at 0 and end at one less than the length of the string. Unlike C, however, Python also lets you fetch items from sequences such as strings using negative offsets. Technically, a negative offset is added to the length of a string to derive a positive offset. You can also think of negative offsets as counting backward from the end. For example:

>>> a = 'party'
>>> a[0], a[-2]
('p', 't')
>>> a[1:3], a[1:], a[:-1]
('ar', 'arty', 'part')

The first line defines a five-character string and assigns it the name a. The next line indexes it in two ways: a[0] gets the item at offset 0 from the left (the one-character string ‘p’), and a[-2] gets the item at offset 2 back from the end.

The last line in the preceding example demonstrates slicing, a generalized form of indexing that returns an entire section, not a single item. Probably the best way to think of slicing is as a type of parsing, especially when applied to strings. It allows us to extract an entire section in a single step. Slices can be used to extract columns of data, chop off leading and trailing text, and more.
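As a rough sketch of slicing used as parsing, the script below pulls fixed-width columns out of a line of text and chops off a trailing newline. The log line and its column layout are invented for this illustration.

```python
# A hypothetical fixed-width record: date in columns 0-9,
# level in columns 11-15, and the message from column 17 on.
line = '2024-01-15 ERROR disk full\n'

stripped = line[:-1]       # chop off the trailing newline character

date = stripped[:10]       # columns 0 through 9
level = stripped[11:16]    # columns 11 through 15
message = stripped[17:]    # everything from column 17 to the end

print(date)     # 2024-01-15
print(level)    # ERROR
print(message)  # disk full
```

Note that line[:-1] removes the last character whatever it is, so it is only safe here because we know the line ends with a newline.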

The basics of using slicing are fairly simple. When you index a sequence object such as a string on a pair of offsets separated by a colon, Python returns a new object containing the contiguous section identified by the offset pair. The left offset is taken to be the lower bound (which is inclusive), and the right is the upper bound (which is noninclusive). That is, Python fetches all items from the lower bound up to but not including the upper bound, and returns a new object containing the fetched items. If omitted, the left and right bounds default to 0 and the length of the object you are slicing, respectively.

For instance, in the 'party' example above, a[1:3] extracts the items at offsets 1 and 2. It grabs the second and third items, and stops before the fourth item at offset 3. Next, a[1:] gets all the items beyond the first. The upper bound, which is not specified, defaults to the length of the string. Finally, a[:-1] fetches all but the last item. The lower bound defaults to 0, and -1 refers to the last item (noninclusive).
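The defaults and the negative offsets can be checked by spelling every bound out explicitly. This is a small sketch with variable names of our own choosing.

```python
a = 'party'
n = len(a)                      # 5

middle = a[1:3]                 # offsets 1 and 2
rest = a[1:]                    # upper bound defaults to len(a)
all_but_last = a[:-1]           # lower bound defaults to 0

# The same slices with the defaults written out.
explicit_rest = a[1:n]
explicit_all_but_last = a[0:n - 1]

# A negative offset is added to the length to get a positive one.
last = a[-1]
explicit_last = a[n - 1]

# Because the upper bound is noninclusive, two slices that share
# a boundary always rebuild the original string.
rejoined = a[:2] + a[2:]

print(middle, rest, all_but_last)   # ar arty part
print(last == explicit_last)        # True
print(rejoined == a)                # True
```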

Indexing and slicing are powerful tools, and if you’re not sure about the effects of a slice, you can always try it out at the Python interactive prompt. You can even change an entire section of another object in one step by assigning to a slice, though not for immutables like strings.
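Here is a brief sketch of slice assignment on a mutable list, followed by what happens when the same thing is tried on a string. The variable names are ours; the usual workaround for strings is to build a new one from slices.

```python
# Lists are mutable, so a slice can be replaced in place.
# The replacement does not need to be the same length.
letters = list('party')
letters[:1] = ['s', 'm']
joined = ''.join(letters)
print(joined)                # smarty

# Strings are immutable, so the same assignment raises TypeError.
word = 'party'
try:
    word[:1] = 'sm'
    string_assignment_failed = False
except TypeError:
    string_assignment_failed = True
print(string_assignment_failed)  # True

# Instead, build a new string out of slices of the old one.
new_word = 'sm' + word[1:]
print(new_word)              # smarty
```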
