Python Object Types - Numbers, Strings, and None
In Python, data takes the form of objects: either built-in objects that Python provides, or objects we create using Python or external language tools such as C extension libraries. Because objects are the most fundamental notion in Python programming, we'll start with built-in object types.
Object Type | Example literals/creation |
---|---|
Numbers | 1234, 3.1415, 3+4j, Decimal, Fraction |
Strings | 'python', "Jupiter's", b'a\x01c' |
Lists | [1, [2, 'three'], 4] |
Dictionaries | {'Apple': 'iPhone', 'Google': 'Android'} |
Tuples | (1, 'php', 3, 'Y') |
Files | myFile = open('java', 'r') |
Sets | set('xyz'), {'x', 'y', 'z'} |
Other core types | Booleans, types, None |
Program unit types | Functions, modules, classes |
Implementation related types | Compiled code, stack tracebacks |
There are no type declarations in Python. The syntax of the expressions we run determines the types of objects we create and use. In Python, every value has a datatype, but we don't need to declare the datatype of variables. How does that work? Based on each variable's original assignment, Python figures out what type it is and keeps track of that internally.
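A minimal sketch of this behavior follows. The name `value` is only an illustration; the point is that the type belongs to the object a name refers to, not to the name itself:

```python
# A name carries no declared type; the object it refers to does.
value = 42
first_type = type(value)      # int

value = 'forty-two'           # rebinding the same name to a string is allowed
second_type = type(value)     # str

print(first_type.__name__, second_type.__name__)   # int str
```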
Of course, there are more types than those in the table above. Everything is an object in Python, so there are types like module, function, class, method, file, and even compiled code.
Once we create an object, we bind its operation set for all time. We can perform only string operations on a string and list operations on a list. Python is dynamically typed, but it is also strongly typed: we can perform on an object only the operations that are valid for its type.
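As a small illustration of strong typing (the variable names here are hypothetical), adding a string and an integer raises an error instead of converting one to the other silently:

```python
# Mixing a string and an integer with + is rejected, not silently converted.
try:
    result = 'abc' + 1
except TypeError:
    result = None             # the operation is not valid for these types

# An explicit conversion states the intent and succeeds.
joined = 'abc' + str(1)
print(joined)                 # abc1
```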
Everything in Python is an object, and everything can have attributes and methods. For example, all functions have a built-in attribute __doc__, which returns the docstring defined in the function's source code, and the sys module is an object that has an attribute called path.
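A short sketch of both attributes; the function `greet` is a made-up example, while `__doc__` and `sys.path` are the attributes named above:

```python
import sys

def greet(name):
    """Return a short greeting for name."""
    return 'Hello, ' + name

doc = greet.__doc__            # the docstring written in the function body
path_type = type(sys.path)     # sys.path is a list of directory strings

print(doc)                     # Return a short greeting for name.
print(path_type.__name__)      # list
```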
What is an object?
Different programming languages define "object" in different ways. In some, it means that all objects must have attributes and methods; in others, it means that all objects are subclassable. In Python, the definition is looser. Some objects have neither attributes nor methods, but they could. Not all objects are subclassable. But everything is an object in the sense that it can be assigned to a variable or passed as an argument to a function.
You may have heard the term first-class object in other programming contexts. In Python, functions are first-class objects. You can pass a function as an argument to another function. Modules are first-class objects. You can pass an entire module as an argument to a function. Classes are first-class objects, and individual instances of a class are also first-class objects.
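To make this concrete, here is a small sketch in which a function and a module are passed as ordinary arguments. The helper names `apply` and `describe` are hypothetical:

```python
import math

def apply(func, arg):
    """Call func with arg and return the result."""
    return func(arg)

def describe(module):
    """Return the name of the module object passed in."""
    return module.__name__

root = apply(math.sqrt, 16)    # a function passed as an argument
name = describe(math)          # a whole module passed as an argument

print(root, name)              # 4.0 math
```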
This is important, so I'm going to repeat it in case you missed it the first few times: everything in Python is an object. Strings are objects. Lists are objects. Functions are objects. Classes are objects. Class instances are objects. Even modules are objects.
This section on objects is from "Dive Into Python 3" by Mark Pilgrim.
Python 3.0's integer type automatically provides extra precision for large numbers as shown below.
    >>> 2 ** 100
    1267650600228229401496703205376
    >>> # How many digits?
    >>> len(str(2 ** 100))
    31
Let's move on to floating numbers.
IDLE 2.6.2
    >>> # repr: as code
    >>> 3.1415 * 2
    6.2830000000000004
    >>> # str: user-friendly
    >>> print(3.1415 * 2)
    6.283
There are two ways to print every object: with full precision and in a user-friendly form.
The first form is known as an object's as-code repr, and the second is its user-friendly str. In Python 3.2, however, both forms show the same output for this value:
IDLE 3.2.a3
    >>> 3.1415 * 2
    6.283
    >>> print(3.1415 * 2)
    6.283
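Even when floats print identically in both forms, the two forms still differ for other objects. The sketch below calls the built-in repr and str functions directly on a Fraction and on a string; the variable names are illustrative only:

```python
from fractions import Fraction

third = Fraction(1, 3)
as_code = repr(third)          # 'Fraction(1, 3)'
friendly = str(third)          # '1/3'

text = 'spam'
text_as_code = repr(text)      # includes the quotes: "'spam'"
text_friendly = str(text)      # just the characters: 'spam'

print(as_code, friendly, text_as_code, text_friendly)
```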
There are useful numeric modules that ship with Python:
    >>> import math
    >>> math.pi
    3.141592653589793
    >>> math.sqrt(1000)
    31.622776601683793
    >>> import random
    >>> random.random()
    0.4245390260050892
    >>> random.choice([1,2,3,4,5])
    3
The math module contains advanced numeric tools as functions, while the random module performs random number generation and random selections.
A sequence is an ordered collection of objects. Sequences maintain a left-to-right order among the items. Their items are stored and fetched by their relative position. Actually, strings are sequences of one-character strings. Other types of sequences include lists and tuples.
As sequences, strings support operations that assume a positional ordering among items. We can verify a string's length with the built-in len function and fetch its components with indexing expressions:
    >>> S = 'Picasso'
    >>> len(S)
    7
    >>> S[0]
    'P'
    >>> S[1]
    'i'
We can index backward, from the end. Positive indexes count from the left, and negative indexes count back from the right:
    >>> S[-1]
    'o'
    >>> S[-2]
    's'
    >>> S[len(S)-1]
    'o'
Actually, a negative index is simply added to the string's size, so S[-1] is the same as S[len(S)-1].
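The rule can be checked for every position in the string. This is only a sketch; `pairs` and `all_match` are hypothetical names:

```python
S = 'Picasso'
size = len(S)

# For each negative offset, compute the equivalent positive offset.
pairs = [(i, size + i, S[i]) for i in range(-1, -size - 1, -1)]

# Both forms of index select the same character.
all_match = all(S[neg] == S[pos] for neg, pos, _ in pairs)

print(pairs[0])     # (-1, 6, 'o')
print(all_match)    # True
```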
Sequences also support a more general form of indexing known as slicing. It is a way to extract an entire section (slice) in a single step:
    >>> S
    'Picasso'
    >>> S[1:4]
    'ica'
The general form, X[I:J], means "give me everything in X from offset I up to but not including offset J." The result is returned in a new object. The slice above gives us all the characters in string S from offsets 1 through 3 (which is 4-1) as a new string. The effect is to slice or parse out the three characters in the middle.
In a slice, the left bound defaults to zero, and the right bound defaults to the length of the sequence being sliced. This leads to some common usage variations:
    >>> # Everything past the first (1:len(S))
    >>> S[1:]
    'icasso'
    >>> # S itself hasn't changed
    >>> S
    'Picasso'
    >>> # Everything but the last
    >>> S[0:6]
    'Picass'
    >>> # Same as S[0:6]
    >>> S[:6]
    'Picass'
    >>> # Everything but the last again, but simpler (0:-1)
    >>> S[:-1]
    'Picass'
    >>> # All of S as a top-level copy (0:len(S))
    >>> S[:]
    'Picasso'
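The default bounds can be spelled out explicitly to confirm that the short and long forms agree. A brief sketch with illustrative variable names:

```python
S = 'Picasso'

# Omitted bounds fall back to 0 on the left and len(S) on the right.
tail = S[1:]
tail_explicit = S[1:len(S)]

head = S[:-1]
head_explicit = S[0:len(S) - 1]

copy = S[:]
print(tail, head, copy)   # icasso Picass Picasso
```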
As sequences, strings also support concatenation with the plus sign (joining two strings into a new string) and repetition (making a new string by repeating another):
    >>> S
    'Picasso'
    >>> S + "'s painting"
    "Picasso's painting"
    >>> # S is not changed
    >>> S
    'Picasso'
    >>> # Repetition
    >>> S * 3
    'PicassoPicassoPicasso'
Every string operation produces a new string as its result. This is because strings are immutable in Python. They cannot be changed in-place after they are created. For instance, we can't change a string by assigning to one of its positions, but we can always build a new one and assign it to the same name. Because Python cleans up old objects as we go, this isn't as inefficient as it may sound:
    >>> S
    'Picasso'
    >>> # Immutable objects cannot be changed
    >>> S[0]='X'
    Traceback (most recent call last):
      File "", line 1, in <module>
        S[0]='X'
    TypeError: 'str' object does not support item assignment
    >>> # But we can run expressions to make new objects
    >>> S = 'X' + S[1:]
    >>> S
    'Xicasso'
Every object in Python is classified as either immutable or not. In terms of the core types, numbers, strings, and tuples are immutable; lists and dictionaries are not. Among other things, immutability can be used to guarantee that an object remains constant throughout our program.
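The contrast between a mutable list and an immutable tuple can be sketched as follows; the names `numbers`, `point`, and `changed` are illustrative:

```python
numbers = [1, 2, 3]
numbers[0] = 99               # lists can be changed in place

point = (1, 2, 3)
try:
    point[0] = 99             # tuples, like strings, cannot
    changed = True
except TypeError:
    changed = False

print(numbers, point, changed)   # [99, 2, 3] (1, 2, 3) False
```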
The string find method is the basic substring search operation, and the string replace method performs global searches and replacements:
    >>> S = 'Picasso'
    >>> S
    'Picasso'
    >>> # Find the offset of a substring
    >>> S.find('ss')
    4
    >>> S
    'Picasso'
    >>> # Replace occurrences of a substring with another
    >>> S.replace('ss','tt')
    'Picatto'
    >>> S
    'Picasso'
Despite the names of these string methods, we are not changing the original strings here, but creating new strings as the results, since strings are immutable. String methods are the first line of text-processing tools in Python. Other methods split a string into substrings on a delimiter, perform case conversions, test the content of the string, and strip whitespace characters off the ends of the string:
    >>> line ='aaa,bbb,ccccc,dd'
    >>> # Split on a delimiter into a list of substrings
    >>> line.split(',')
    ['aaa', 'bbb', 'ccccc', 'dd']
    >>> S ='Picasso'
    >>> # Upper- and lowercase conversion
    >>> S.upper()
    'PICASSO'
    >>> # Content tests: isalpha, isdigit, etc.
    >>> S.isalpha()
    True
    >>> line = 'aaa,bbb,ccccc,dd\n'
    >>> # Remove whitespace characters on the right side
    >>> line = line.rstrip()
    >>> line
    'aaa,bbb,ccccc,dd'
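Because each method returns a new string (or a list of strings), calls can be chained. A rough sketch of cleaning and splitting a line of text, using hypothetical names `fields` and `joined`:

```python
line = '  aaa,bbb,ccccc,dd\n'

# strip() removes whitespace at both ends; split(',') then cuts on commas.
fields = line.strip().split(',')

# join() is the inverse of split: it glues the pieces back with a separator.
joined = '-'.join(field.upper() for field in fields)

print(fields)   # ['aaa', 'bbb', 'ccccc', 'dd']
print(joined)   # AAA-BBB-CCCCC-DD
```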
Strings also support an advanced substitution operation known as formatting, available as both an expression and a string method call:
    >>> # Formatting expression
    >>> '%s, Spain, and %s' % ('Picasso','Painting!')
    'Picasso, Spain, and Painting!'
    >>> # Formatting method
    >>> '{0}, Spain, and {1}'.format('Picasso', 'Painting!')
    'Picasso, Spain, and Painting!'
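Both forms accept more than plain substitution. The sketch below shows a few common variations (keyword fields, a fixed number of decimals, and a padded field); the variable names are illustrative:

```python
name, topic = 'Picasso', 'Painting!'

by_expr = '%s, Spain, and %s' % (name, topic)
by_method = '{0}, Spain, and {1}'.format(name, topic)
by_keyword = '{who}, Spain, and {what}'.format(who=name, what=topic)

price = '%.2f' % 3.14159          # two digits after the decimal point
padded = '{0:>8}'.format('ab')    # right-aligned in a field of width 8

print(by_keyword)   # Picasso, Spain, and Painting!
print(price)        # 3.14
```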
Note that although sequence operations are generic, methods are not. Although some types share some method names, string method operations generally work only on strings, and nothing else. As a rule of thumb, Python's toolset is layered: generic operations that span multiple types show up as built-in functions or expressions (e.g., len(X), X[0]), but type-specific operations are method calls (e.g., aString.upper()).
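A small sketch of this layering, comparing a string with a list of the same characters (the names are hypothetical):

```python
text = 'spam'
items = ['s', 'p', 'a', 'm']

# Generic operations work on both sequence types.
lengths = (len(text), len(items))          # (4, 4)
firsts = (text[0], items[0])               # ('s', 's')

# Type-specific methods do not: lists have no upper method.
upper_text = text.upper()                  # 'SPAM'
list_has_upper = hasattr(items, 'upper')   # False

print(lengths, firsts, upper_text, list_has_upper)
```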
To get more information about object methods, we can always call the built-in dir function. It returns a list of all the attributes available for a given object. Because methods are function attributes, they will show up in this list. Assuming S is still the string, here are its attributes on Python 3.2:
    >>> S = 'Picasso'
    >>> dir(S)
    ['__add__', '__class__', '__contains__', '__delattr__', '__doc__',
     '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__',
     '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__',
     '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__',
     '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__',
     '__rmul__', '__setattr__', '__sizeof__', '__str__',
     '__subclasshook__', '_formatter_field_name_split',
     '_formatter_parser', 'capitalize', 'center', 'count', 'encode',
     'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum',
     'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower',
     'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join',
     'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace',
     'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip',
     'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title',
     'translate', 'upper', 'zfill']
Here, the names with underscores in the list represent the implementation of the string object and are available to support customization. In general, leading and trailing double underscores are the naming pattern Python uses for implementation details. The names without the underscores in the list are the callable methods on string objects.
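To see only the ordinary methods, the list from dir can be filtered by name. This is one possible sketch; `public` and `callables` are hypothetical names, and the exact contents vary by Python version:

```python
S = 'Picasso'

# Keep only the names that do not start with an underscore.
public = [name for name in dir(S) if not name.startswith('_')]

# Of those, keep the ones that can actually be called.
callables = [name for name in public if callable(getattr(S, name))]

print(public[:3])
```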
The dir function simply gives the methods' names. To get the information about what they do, we can pass them to the help function:
    >>> help(S.replace)
    Help on built-in function replace:
    replace(...)
        S.replace(old, new[, count]) -> str
        Return a copy of S with all occurrences of substring
        old replaced by new. If the optional argument count is
        given, only the first count occurrences are replaced.
help is one of the interfaces to a system of code that ships with Python known as PyDoc, a tool for extracting documentation from objects.
Python provides several ways for us to code strings. For example, special characters can be represented as backslash escape sequences:
    >>> # \n is end-of-line, \t is tab
    >>> S = 'A\nB\tC'
    >>> # Each stands for just one character
    >>> len(S)
    5
    >>> # \n is a byte with the binary value 10 in ASCII
    >>> ord('\n')
    10
    >>> # \0, a binary zero byte, does not terminate string
    >>> S = 'A\0B\0C'
    >>> len(S)
    5
Python allows strings to be enclosed in single or double quote characters. It also allows multiline string literals enclosed in triple quotes. When this form is used, all the lines are concatenated together and end-of-line characters are added where line breaks appear. This is useful for embedding things like HTML and XML code in a Python script:
    >>> msg = """abc
    ... def'''ghi""jkl'mn
    ... opqrst"""
    >>> msg
    'abc\ndef\'\'\'ghi""jkl\'mn\nopqrst'
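As an illustration of the HTML use case, a triple-quoted literal keeps its line breaks as newline characters. The markup and names below are made up for the example:

```python
html = """<html>
<body>
<p>Hello</p>
</body>
</html>"""

line_count = len(html.splitlines())    # 5 lines of markup
newline_count = html.count('\n')       # 4 line breaks between them

print(line_count, newline_count)       # 5 4
```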
None is a special constant in Python. It is a null value. None is not the same as False. None is not 0. None is not an empty string. Comparing None to anything other than None will always return False.
None is the only null value. It has its own datatype (NoneType). We can assign None to any variable, but we cannot create other NoneType objects. All variables whose value is None are equal to each other.
    >>> type(None)
    <class 'NoneType'>
    >>> None == 0
    False
    >>> None == ''
    False
    >>> None == False
    False
    >>> None == None
    True
    >>> a = None
    >>> a == None
    True
    >>> b = None
    >>> a == b
    True
Another thing to note is when we use None in a boolean context:
    >>> def None_in_a_boolean_context(None_Input):
    ...     if None_Input:
    ...         print("It is True")
    ...     else:
    ...         print("It is False")
    ...
    >>> None_in_a_boolean_context(None)
    It is False
    >>> None_in_a_boolean_context(not None)
    It is True
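Because None is false in a boolean context but is not the same as 0 or an empty string, code that needs to tell them apart usually tests with `is None`. A minimal sketch, with a hypothetical helper named `describe`:

```python
def describe(value):
    """Classify value, treating None separately from other false values."""
    if value is None:
        return 'missing'
    if not value:
        return 'empty or zero'
    return 'present'

results = [describe(v) for v in (None, 0, '', [], 'text', 42)]
print(results)
```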