Because strings are defined as ordered collections of characters, we can access their components by position. In Python, characters in a string are fetched by indexing – providing the numeric offset of the desired component in square brackets after the string. When you specify an index, you get back a one-character string at the specified position.
Strings in Python are similar to strings in the C/C++ language in that Python offsets start at 0 and end at one less than the length of the string. Unlike C, however, Python also lets you fetch items from sequences such as strings using negative offsets. Technically, a negative offset is added to the length of a string to derive a positive offset. You can also think of negative offsets as counting backward from the end. For example:
>>> a = 'party' >>> a[0], a[-2] >>> ('p', 't') >>> a[1:3], a[1:], a[:-1] ('ar', 'arty', 'part')
The first line defines a five-character string and assigns it the name a. The next line indexes it in two ways: a[0] gets the item at offset 0 from the left (the one-character string ‘p’), and a[-2] gets the item at offset 2 back from the end.
The last line in the preceding example demonstrates slicing, a generalized form of indexing that returns an entire section, not a single item. Most likely the best way to think of slicing is that it is a type of parsing, especially when applied to strings. It allows us to extract an entire section in a single step. Slices can be used to extract columns of data, chop off leading and trailing text, and more.
The basics of using slicing are fairly simple. When you index a sequence object such as a string on a pair of offsets separated by a colon, Python returns a new object containing the contiguous section identified by the offset pair. The left offset is taken to be the lower bound (which is inclusive), and the right is the upper bound (which is noninclusive). That is, Python fetches all items from the lower bound up to but not including the upper bound, and returns a new object containing the fetched items. If omitted, the left and right bounds default to 0 and the length of the object your are slicing, respectively.
For instance, in the example above, a[1:3] extracts the items at offsets 1 and 2. It grabs the second and third items, and strops before the fourth item and offset 3. Next, a[1:] gets tall the items beyond the first. The upper bound, which is not specified, defaults to the length of the string. Finally, a[:-1] fetches all but the last item. The lower bound defaults to 0, and -1 refers to the last item (noninclusive).
Indexing and slicing are powerful tools, and if you’re not sure about the effects of a slice, you can always try it out at the Python interactive prompt. You can even change an entire section of another object in one step by assigning to a slice, though not for immutables like strings.
Recent Comments