Introduction
Creating Functions
>>> def double(x):
...     return x * 2
...
>>> print double(5)
10
>>> print double(["basalt", "granite"])
['basalt', 'granite', 'basalt', 'granite']
Returning Stuff
>>> def sign(x):
...     if x < 0:
...         return -1
...     if x == 0:
...         return 0
...     return 1
Overuse of multiple returns can make your code less readable though. In general, it is best to
- use early returns at the beginning of the function to handle special cases
- add one at the end to handle the general case
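As a minimal sketch of that pattern, the hypothetical function below (average is not part of this lesson's code) handles two special cases with early returns and leaves the general case for the final return. print is written with parentheses and a single argument so that the example behaves the same under Python 2 and Python 3.

```python
def average(values):
    # Special case: nothing to average (returning 0.0 is an assumption
    # made for this sketch, not a rule).
    if not values:
        return 0.0
    # Special case: a single value is its own average.
    if len(values) == 1:
        return float(values[0])
    # General case.
    return sum(values) / float(len(values))

print(average([]))
print(average([4]))
print(average([1, 2, 3, 6]))
```

The special cases are visible at a glance at the top, and a reader looking for the main calculation only has to check the last line.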
>>> def hello():
...     print "HELLO"
...
>>> def world():
...     print "WORLD"
...     return
...
>>> print hello()
HELLO
None
>>> print world()
WORLD
None
One final note on returning values from functions: be consistent about what a function returns. If a function can return None, a list or a string, the caller will have to use an extra if statement to handle the returned value properly.
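To illustrate, here is a small sketch of a function that always returns a list, even when it finds nothing. The function name rocks_of_type and the sample data are made up for this example.

```python
def rocks_of_type(samples, wanted):
    # Always return a list, even when nothing matches.
    result = []
    for (name, kind) in samples:
        if kind == wanted:
            result.append(name)
    return result

samples = [("basalt", "igneous"), ("shale", "sedimentary"), ("granite", "igneous")]

# The caller can loop over the result without first checking for None.
for name in rocks_of_type(samples, "metamorphic"):
    print(name)
print(len(rocks_of_type(samples, "igneous")))
```

Because an empty list is still a list, the for loop simply does nothing when there are no matches.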
Scope
# Global variable.
rock_type = 'unknown'
# Function that creates local variable.
def classify(rock_name):
    if rock_name in ['basalt', 'granite']:
        rock_type = 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        rock_type = 'sedimentary'
    else:
        rock_type = 'metamorphic'
    print 'in function, rock_type is', rock_type
# Call the function to prove that it uses its local 'rock_type'.
print "before function, rock_type is", rock_type
classify('sandstone')
print "after function, rock_type is", rock_type
When Python accesses a variable, it checks the top stack frame to see if it knows about that variable. If it does then it uses that value. If not, then it checks whether a global variable with that name exists. The result of our script is then
before function, rock_type is unknown
in function, rock_type is sedimentary
after function, rock_type is unknown
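The other half of the lookup rule can be sketched the same way. In this hypothetical variation (the function report is not part of the script above), the function never assigns to rock_type, so no local variable exists and Python falls back to the global one.

```python
# Global variable.
rock_type = 'unknown'

def report():
    # No assignment to rock_type happens here, so there is no local
    # variable of that name and the lookup finds the global instead.
    return 'in function, rock_type is ' + rock_type

print(report())
```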
Passing Parameters
As discussed in the last lesson, slicing a list gives you a new list containing the sliced elements of the original list. So to create a copy of a list, slice out all of its values: use the slice notation without a beginning or ending index, [:]. This is the equivalent of [0:len(list)]. This script helps illustrate these two points.
def add_salt(first, second):
    first += "salt"
    second += ["salt"]
str = "rock"
seq = ["gneiss", "shale"]
print "before"
print "str is:", str
print "seq is:", seq
add_salt(str, seq[:])
print "after"
print "str is:", str
print "seq is:", seq
Because of the slice when calling add_salt, seq still only has gneiss and shale. Had we just passed in the variable seq to the function it would have had gneiss, shale and salt.
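For comparison, this sketch repeats the call without the slice. The variable name text is used here instead of str purely to avoid hiding the built-in str type; otherwise the function is the one shown above.

```python
def add_salt(first, second):
    first += "salt"     # strings are immutable: this rebinds the local name only
    second += ["salt"]  # lists are mutable: this changes the caller's list

text = "rock"
seq = ["gneiss", "shale"]
add_salt(text, seq)     # no slice, so the function receives seq itself
print(text)
print(seq)
```

The string is unchanged after the call, while the list now ends with "salt".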
Default Parameter Values
To use parameter defaults, you assign a value to the parameter in the function definition. Here we set defaults for the start and end parameters of our function:
def total(values, start=0, end=None):
    # If no values given, total is zero.
    if not values:
        return 0
    # If no end specified, use the entire sequence.
    if end is None:
        end = len(values)
    # Calculate.
    result = 0
    for i in range(start, end):
        result += values[i]
    return result
Now we can call the function three different ways:
- Just values
60
- values and start
30
- values, start and end
60
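The calls themselves are not shown above, so the following is only a sketch of what the three call styles might look like. The list numbers and the argument values are assumptions, chosen so that the printed totals match the three outputs listed.

```python
def total(values, start=0, end=None):
    # If no values given, total is zero.
    if not values:
        return 0
    # If no end specified, use the entire sequence.
    if end is None:
        end = len(values)
    result = 0
    for i in range(start, end):
        result += values[i]
    return result

numbers = [10, 20, 30]       # hypothetical data
print(total(numbers))        # just values
print(total(numbers, 2))     # values and start
print(total(numbers, 0, 3))  # values, start and end
```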
One thing to remember when using defaults is that they have to come after all the parameters that do not have defaults. If they were intermixed, Python would not be able to figure out whether a value passed in was meant to go to the parameter with the default or to the one without.
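Python enforces this ordering when it reads the function definition. The sketch below uses the built-in compile function to check a deliberately mis-ordered definition held in a string, so that the example itself remains valid code. The string contents are made up for this illustration.

```python
# A definition with a defaulted parameter before a non-defaulted one.
bad_definition = "def total(start=0, values):\n    return 0\n"

try:
    compile(bad_definition, "<example>", "exec")
    outcome = "accepted"
except SyntaxError:
    outcome = "rejected"

print(outcome)
```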
Functions are Objects
- redefine functions (just as you can reassign values to variables)
>>> def circumference(r):
...     return 2 * 3.14 * r
...
>>> def circumference(r):
...     return 2 * 3.14159 * r
...
>>> print circumference(1.0)
6.28318
- create aliases for functions
>>> circ = circumference
>>> print circ(1.0)
6.28318
- pass functions as parameters
>>> def apply_to_list(function, values):
...     result = []
...     for v in values:
...         temp = function(v)
...         result.append(temp)
...     return result
...
>>> radii = [0.1, 1.0, 10.0]
>>> print apply_to_list(circ, radii)
[0.62831800000000004, 6.2831799999999998, 62.831800000000001]
- store functions in lists
>>> def area(r):
...     return 3.14159 * r * r
...
>>> def color(r):
...     return "unknown"
...
>>> def apply_each(functions, value):
...     result = []
...     for f in functions:
...         temp = f(value)
...         result.append(temp)
...     return result
...
>>> functions = [circumference, area, color]
>>> print apply_each(functions, 1.0)
[6.2831799999999998, 3.1415899999999999, 'unknown']
Built-in object types (like functions) have different attributes by default. Every function has a __name__ attribute which contains the name it was originally defined as.
>>> print circ.__name__
circumference
The double underscores indicate that this is a reserved name (which will come up again in a later lesson).
Creating Modules
Here are the contents of geology.py:
def rock_type(rock_name):
    if rock_name in ['basalt', 'granite']:
        return 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        return 'sedimentary'
    else:
        return 'metamorphic'
You refer to the contents of a module as module.thing, just like the attributes of any other object. In this particular case, that is geology.rock_type:
>>> import geology
>>> for r in ['granite', 'gneiss']:
...     print r, 'is', geology.rock_type(r)
...
granite is igneous
gneiss is metamorphic
Module Scope
Here are two modules, outer and inner, that illustrate how scope is resolved within modules.
outer.py:
manager = "Albus Dumbledore"
import inner
print "outer:", manager
print "inner:", inner.get_manager()
inner.py:
manager = "Lucius Malfoy"
def get_manager():
    return manager
output:
outer: Albus Dumbledore
inner: Lucius Malfoy
Importing
Statements in a module are executed as it is loaded. These include assignments, function definitions and even other imports. This modified version of geology.py (from above) shows this execution:
print 'loading geology module'
def rock_type(rock_name):
    if rock_name in ['basalt', 'granite']:
        return 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        return 'sedimentary'
    else:
        return 'metamorphic'
print 'geology module loaded'
Now when we make use of this module the output has some extra lines
>>> import geology
loading geology module
geology module loaded
>>> for r in ['granite', 'gneiss']:
...     print r, 'is', geology.rock_type(r)
...
granite is igneous
gneiss is metamorphic
It is considered bad form for a module to produce output when it is imported. But what about self-tests, or the ability to run a module standalone? Just as functions have a __name__ attribute, so do modules (one of the benefits of being an object). It is set to one of two things:
- The module's name if it has been imported
- The string __main__ if it is the main program
self_test.py:
def is_rock(name):
    return name in ['basalt', 'granite', 'sandstone', 'shale']
if __name__ == '__main__':
    tests = [['basalt', True], ['gingerale', False],
             [12345678, False], ['sandstone', True]]
    for (value, expected) in tests:
        actual = is_rock(value)
        if actual == expected:
            print 'pass'
        else:
            print 'fail'
standalone:
$ python self_test.py
pass
pass
pass
pass
as a module:
>>> import self_test
>>> self_test.is_rock('sugar')
False
Finally, there are a couple of variations on the regular import syntax that you might encounter:
- import geology as g
  - imports the module geology using the alias g
  - usage: g.print_version()
- from geology import print_version
  - imports the print_version function from the geology module into the default namespace
  - usage: print_version()
- from geology import *
  - imports everything into the default namespace
  - almost always a bad idea
  - the next version of the module might add something (variable, function, etc.) named the same as something you are using
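The first two forms can be tried with any module. The geology module shown in this lesson does not define print_version, so this sketch uses the standard library's math module instead. The alias m is an arbitrary choice.

```python
import math as m        # module imported under an alias
from math import sqrt   # one function imported into the current namespace

print(m.sqrt(16.0))
print(sqrt(16.0))

# Both names refer to the same function object.
print(m.sqrt is sqrt)
```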
The System (sys) Module
Command Line Arguments
sys.argv contains the program's command-line arguments with the program's name always as index 0.
command_line.py:
import sys
for i in range(len(sys.argv)):
    print i, sys.argv[i]
no arguments:
$ python command_line.py
0 command_line.py
two arguments:
$ python command_line.py first second
0 command_line.py
1 first
2 second
Standard I/O
sys.stdin and sys.stdout are standard input and standard output. They are normally connected to the keyboard and the display, but if you redirect input or output, or use a pipe, when running your program, they are attached to the file or pipe instead.
standard_io.py:
import sys
count = 0
for line in sys.stdin.readlines():
    count += 1
sys.stdout.write('read ' + str(count) + ' lines')
using standard_io.py:
$ python standard_io.py < standard_io.py
read 7 lines
sys.stderr is connected to standard error in the same way; it is normally the display as well.
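A common use of the separate error stream is to keep diagnostics out of a program's real output. This is a minimal sketch. The function report and its warning text are invented for the example.

```python
import sys

def report(count):
    # Results go to standard output.
    sys.stdout.write('read ' + str(count) + ' lines\n')
    # Diagnostics go to standard error, so they still reach the display
    # when standard output is redirected to a file or a pipe.
    if count == 0:
        sys.stderr.write('warning: input was empty\n')

report(0)
```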
The Python Search Path
sys.path is the list of places Python will look to find a module for import.
- initialized from the PYTHONPATH environment variable
- the directory containing the program being run is automatically at the start of the list
- because it is a list, you can add or remove values from it during program execution as necessary
- ./geology.py
- /home/swc/lib/geology.py
- /Python25/lib/geology.py
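As a sketch of changing the search path at run time, the snippet below appends a directory to sys.path and then prints the resulting list. The directory /home/swc/lib is taken from the example paths above and need not exist on your machine.

```python
import sys

# Assumed location of extra modules; adjust for your own system.
extra_dir = '/home/swc/lib'

# Avoid adding the same directory twice.
if extra_dir not in sys.path:
    sys.path.append(extra_dir)

# Python searches these places in order when importing a module.
for place in sys.path:
    print(place)
```

Appending puts the directory at the end of the list, so modules found in earlier entries still take precedence.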
Exiting
sys.exit terminates the program and returns a status code to the operating system
- 0 indicates success (0 errors)
- non-zero is an error code
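One way to use this is to exit with a non-zero status when the program is called incorrectly. In this sketch, the function require_filename and its usage message are hypothetical. A real script would typically call it as require_filename(sys.argv).

```python
import sys

def require_filename(args):
    # args is expected to look like sys.argv: program name, then arguments.
    if len(args) < 2:
        sys.stderr.write('usage: ' + args[0] + ' filename\n')
        sys.exit(1)   # non-zero status tells the operating system it failed
    return args[1]

print(require_filename(['count_rocks.py', 'samples.txt']))
```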
The Operating System (os) Module
Working with the File System
In order to make your Python program as platform neutral as possible, the os module provides a number of constants for values that are platform specific. Python is smart enough to know which version to use at runtime.
- curdir
  - the current directory
  - . in Linux and Windows
- pardir
  - the parent directory
  - .. in Linux and Windows
- sep
  - the separator character used in paths
  - / in Linux, \ in Windows
- linesep
  - the end-of-line marker used in text files
  - \n in Linux, \r\n in Windows
- listdir(some_path)
  - lists the contents of a provided path excluding . and ..
- mkdir(some_path)
  - creates the specified directory
- remove(some_file)
  - deletes the file (if the user running Python has the appropriate permissions)
- rename(old_name, new_name)
  - renames (or moves) the file at old_name to new_name
- rmdir(some_path)
  - deletes a directory
  - be very careful, this cannot be undone
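The following sketch exercises several of these names together. To stay safe it works inside a scratch directory created with tempfile.mkdtemp, which is part of the standard library but not covered in this lesson. The directory and file names are made up.

```python
import os
import tempfile

# Work in a throwaway directory so nothing important is touched.
scratch = tempfile.mkdtemp()

# Build paths with os.sep rather than hard-coding / or \.
rocks_dir = scratch + os.sep + 'rocks'
os.mkdir(rocks_dir)

old_name = rocks_dir + os.sep + 'basalt.txt'
new_name = rocks_dir + os.sep + 'granite.txt'

# Create a small file, then rename it.
handle = open(old_name, 'w')
handle.write('igneous\n')
handle.close()
os.rename(old_name, new_name)

contents = os.listdir(rocks_dir)
print(contents)

# Clean up: remove the file first, then the (now empty) directories.
os.remove(new_name)
os.rmdir(rocks_dir)
os.rmdir(scratch)
```

Note that rmdir only succeeds on an empty directory, which is why the file is removed before the directories.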
Summary
Before showing you the last 20% of the basic Python language, the next two lessons deal further with the notion of Style and introduce ideas around Quality.