Quality & Testing

Introduction

The more you invest in quality, the less time it takes to develop working software. Quality is not just testing. Quality is
  • designed in
  • monitored and maintained through the whole software lifecycle
This lesson looks at the basic things every developer can do to maintain quality.

Testing Terminology

This lesson uses the following terms, so they are defined before going any further:
  • unit test - a developer-oriented test that exercises one component in isolation
  • integration test - a user-oriented test that exercises the whole system to determine its overall behavior
  • regression test
    • a test which checks that code which used to work still continues to work
    • regression tests greatly reduce the chance of accidental breakage during refactoring or maintenance
  • test result - a test can have exactly one of three outcomes:
    • pass - the actual outcome matches the expected outcome
    • fail - the actual outcome is different from what was expected
    • error - something went wrong in the test (i.e., the test contains a bug)
  • oracle - an oracle tells you how to classify a test's result
    • tests without an oracle are just blind guesses as to what the outcome should be
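
The three outcomes can be made concrete with a small helper. This is only a minimal sketch and not part of the lesson's own code: the name classify is hypothetical, the expected value passed in plays the role of the oracle, and any exception raised while running the check is reported as an error.

```python
def classify(func, args, expected):
    """Run func(*args) and classify the outcome as 'pass', 'fail' or 'error'.

    The expected value acts as the oracle. The name classify is
    illustrative only.
    """
    try:
        actual = func(*args)
    except Exception:
        # Something went wrong while running the test itself.
        return 'error'
    return 'pass' if actual == expected else 'fail'


print(classify(abs, (-3,), 3))     # actual outcome matches the oracle
print(classify(abs, (-3,), -3))    # actual outcome differs from the oracle
print(classify(abs, ('x',), 3))    # abs('x') raises TypeError
```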

Structuring Tests

To make your tests as useful, effective and maintainable as possible, each test should have the following properties:
  • independence - the outcome of a test does not depend on the outcome of another; otherwise, faults in early tests can distort the results of later ones
  • test only one thing - when a test checks more than one item, you can end up without a clear pass or fail, in the realm of "maybe" and "sort of"
  • self-contained - a test both sets up the environment it needs and cleans up after itself
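
One way to get these properties is sketched below using Python's standard unittest module, which the lesson itself does not use. The class name, file name and file contents are made up for illustration. Each test gets a fresh temporary directory from setUp and has it removed by tearDown, so neither test depends on the other and each checks a single thing.

```python
import os
import shutil
import tempfile
import unittest


class TestLineCount(unittest.TestCase):
    def setUp(self):
        # Build a private environment for every single test.
        self.workdir = tempfile.mkdtemp()
        self.path = os.path.join(self.workdir, 'data.txt')
        with open(self.path, 'w') as handle:
            handle.write('one\ntwo\n')

    def tearDown(self):
        # Clean up, whether the test passed, failed or raised an error.
        shutil.rmtree(self.workdir)

    def test_two_lines(self):
        with open(self.path) as handle:
            self.assertEqual(len(handle.readlines()), 2)

    def test_first_line(self):
        with open(self.path) as handle:
            self.assertEqual(handle.readline(), 'one\n')


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLineCount)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```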

A Simple Example

Here is a simple example which shows how easy it is to create a test. In this example the string method startswith is tested.

Tests = [
    # String  Prefix  Expected
    ['a',     'a',    True],
    ['a',     'b',    False],
    ['abc',   'a',    True],
    ['abc',   'ab',   True],
    ['abc',   'abc',  True],
    ['abc',   'abcd', False],
    ['abc',   '',     True]
]

passes = 0
failures = 0
for (s, p, expected) in Tests:
    actual = s.startswith(p)
    if actual == expected:
        passes += 1
    else:
        failures += 1
print('passed', passes, 'out of', passes + failures, 'tests')

When this is run, the output is

passed 7 out of 7 tests

Adding new tests to this is as easy as adding another row to the Tests list.

Catching Errors

In the above example, the tests all executed smoothly. But what would have happened if there was a problem during the actual test?

Python uses exceptions for error handling which separate normal operation from error handling. This makes the code easier to read in both situations.

Exception handling blocks are structured just like if/else ones: the healthy case goes in the try block, and the error case goes in the except block. When something goes wrong in the try block, Python raises an exception, which is then handled by the matching except block.

There is also an optional else block which is executed when nothing goes wrong in the try.

Here is an example which uses exception handling to test Python's handling of dividing by zero.

for num in [-1, 0, 1]:
    try:
        inverse = 1 / num
    except:
        print('inverting', num, 'caused error')
    else:
        print('inverse of', num, 'is', inverse)

Tracing what was executed: for -1 and 1 the normal path (try, then else) was followed, but for 0 the error path (try, then except) was followed.

Exception Objects

When Python raises an exception, it creates an object to hold information about what went wrong. This exception object typically includes a human-friendly message along with the exception data.

Above we were handling all exceptions in the except statement. By specifying an exception type in the except you can treat different types of exception differently. The following example shows how to handle two exception types (ZeroDivisionError and IndexError) as well as a default handler.

# Note: mix of numeric and non-numeric values.
values = [1, 0, 'momentum']

# Note: top index will be out of bounds.
for i in range(4):
    try:
        print('dividing by value', i)
        x = 1.0 / values[i]
        print('result is', x)
    except ZeroDivisionError as e:
        print('divide by zero:', e)
    except IndexError as e:
        print('index error:', e)
    except:
        print('some other error')

except blocks are tested in order, which results in this output:

dividing by value 0
result is 1.0
dividing by value 1
divide by zero: float division by zero
dividing by value 2
some other error
dividing by value 3
index error: list index out of range

Exception Hierarchy

Because exceptions are objects, their classes are organized in a hierarchy. The exception hierarchy is available in the Python documentation. Being familiar with it is useful for handling broad categories of exceptions, since a general exception class can handle the more specific ones below it.

Rather than handling FloatingPointError, OverflowError and ZeroDivisionError separately with duplicated code, you could handle the more general ArithmeticError.
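
As a minimal sketch of this idea (the helper name safe_compute is hypothetical), a single except ArithmeticError clause handles both a division by zero and a floating-point overflow:

```python
import math


def safe_compute(operation):
    """Call operation() and report any arithmetic problem in one handler."""
    try:
        return operation()
    except ArithmeticError as e:
        # ZeroDivisionError and OverflowError are both subclasses of
        # ArithmeticError, so this one clause handles either.
        return 'arithmetic error: %s' % type(e).__name__


print(safe_compute(lambda: 1 / 0))           # raises ZeroDivisionError
print(safe_compute(lambda: math.exp(1000)))  # raises OverflowError
print(safe_compute(lambda: 6 / 3))           # no exception
```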

A naked except

except:

catches every exception, exactly as if it named the class at the top of the hierarchy (BaseException in current versions of Python). As a result, any naked except, or one that handles Exception, must come last among the except clauses of a try statement. A better way to write this default handler is

except Exception as e:

which will give you an exception object.
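
The ordering rule can be sketched as follows. The function name describe is hypothetical. The specific handler comes first, and the general handler at the end uses the exception object to report the type and message of anything else:

```python
def describe(value):
    """Divide 1.0 by value, reporting problems from most to least specific."""
    try:
        return 'result is %s' % (1.0 / value)
    except ZeroDivisionError as e:
        # Specific handler: must appear before the general one.
        return 'divide by zero: %s' % e
    except Exception as e:
        # Default handler: the exception object says what went wrong.
        return 'some other error: %s: %s' % (type(e).__name__, e)


for value in [4, 0, 'momentum']:
    print(describe(value))
```

If the two except clauses were swapped, the ZeroDivisionError handler would never run, because Exception already matches it.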

Raising Exceptions

To trigger an exception, you raise it. When you raise one you also specify what type of exception you are raising and optionally an informative message which will help someone debug the reason for the exception.

for i in range(4):
    try:
        if (i % 2) == 1:
            raise ValueError('index is odd')
        else:
            print('not raising exception for %d' % i)
    except ValueError as e:
        print('caught exception for %d' % i, e)

Exceptional Style

  • Always use exceptions to report errors instead of returning None, -1, False, or some other value
    • Allows callers to separate normal code from error handling
    • Sooner or later, your function will probably actually want to return that "special" value
  • Throw low, catch high
    • I.e., throw lots of very specific exceptions, but only catch them where you can actually take corrective action
    • Every application handles errors differently: if someone is using your library in a GUI, you don't want to be printing to stderr
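
Both guidelines can be sketched in a few lines. The functions find_index and report_position are hypothetical. Returning -1 for "not found" would be a poor choice here because -1 is itself a valid list index in Python, so a caller that forgot to check would silently get the last item.

```python
def find_index(items, target):
    """Low-level code: raise a specific exception instead of returning -1."""
    for i, item in enumerate(items):
        if item == target:
            return i
    raise LookupError('%r not found' % (target,))


def report_position(items, target):
    """High-level code: catch the error where something can be done about it."""
    try:
        return '%r is at position %d' % (target, find_index(items, target))
    except LookupError as e:
        # The caller decides how to report; the library code never prints.
        return 'could not report position: %s' % e


print(report_position(['a', 'b'], 'b'))
print(report_position(['a', 'b'], 'z'))
```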

Handling Errors in Tests

Now that we can handle errors, the startswith test from above can be made more robust:

Tests = [
    ['a',   'a', False],  # wrong expected value
    ['a',   1,   False],  # wrong type
    ['abc', 'a', True]    # everything legal
]

passes = failures = errors = 0
for (s, p, expected) in Tests:
    try:
        actual = s.startswith(p)
        if actual == expected:
            passes += 1
        else:
            failures += 1
    except Exception as e:
        errors += 1

Without the try/except block, the third test would not run, because passing a number to the startswith method in the second test raises an exception that would stop the loop.
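
To see all three outcomes side by side, the same loop can be wrapped in a function that returns the counts. This is a sketch only; the function name run_startswith_tests and the dictionary of counts are not part of the lesson's code.

```python
Tests = [
    ['a',   'a', False],  # wrong expected value
    ['a',   1,   False],  # wrong type
    ['abc', 'a', True],   # everything legal
]


def run_startswith_tests(tests):
    """Count passes, failures and errors for a table of startswith tests."""
    counts = {'pass': 0, 'fail': 0, 'error': 0}
    for s, p, expected in tests:
        try:
            actual = s.startswith(p)
        except Exception:
            counts['error'] += 1
        else:
            counts['pass' if actual == expected else 'fail'] += 1
    return counts


counts = run_startswith_tests(Tests)
print(counts)
```

With this table, each of the three outcomes occurs exactly once.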

Quality Oriented Programming Styles

Test-Driven Design

Tests are actually specifications: given these inputs, this code should behave in the following way. Since code is written against a specification, it makes sense to create the tests (the specification) first, then the actual application code. This inversion of the usual order is called Test-Driven Design (or TDD).

While it sounds backwards (and it is from a historical perspective) it has the following benefits:
  • A great way to clarify specifications
  • Gives programmers a definite goal - feature complete once all the tests pass
  • Ensures that tests actually get written - there is never enough time 'later'
  • Helps clarify the Application Programming Interface (API) - if it is hard to write tests for, then it should be refactored.
Compare these two methods of providing a specification:
  1. The function run_sum calculates a running sum of the values in a list
  2. A table of tests, plus two rules for interpreting it

Tests = [
    [[],        [],         'empty list'],
    [[1],       [1],        'single value'],
    [[1, 3],    [1, 4],     'two values'],
    [[1, 3, 7], [1, 4, 11], 'three values'],
    [[-1, 1],   [-1, 0],    'negative values'],
    [[1, 3.0],  [1, 4.0],   'mixed types'],
    ["string",  ValueError, 'non-list input'],
    [['a'],     ValueError, 'non-numeric value']
]
  • If the expected result is an exception, pass only if that exception is raised
  • If the test doesn't pass, print the comment so that the programmer knows what to look at
The first method does not specify whether to create a new list or overwrite the input, nor does it specify how to handle errors. The second method is in a format programmers are used to dealing with, and it shows some interesting behaviors that were not even mentioned in the first one.
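
Written test-first, the table above becomes the goal for an implementation. The sketch below is one possible run_sum together with a small runner that applies the two rules. Several details are assumptions, because the table does not settle them: this version returns a new list rather than overwriting its input, and the helper names is_exception_class and run_tests are made up.

```python
def run_sum(values):
    """Return a new list holding the running sum of values.

    Assumptions for this sketch: the input is left unmodified, and
    ValueError is raised for non-list input or non-numeric items.
    """
    if not isinstance(values, list):
        raise ValueError('input must be a list')
    result = []
    total = 0
    for v in values:
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise ValueError('non-numeric value: %r' % (v,))
        total += v
        result.append(total)
    return result


def is_exception_class(expected):
    """True if the expected result is an exception type, not a value."""
    return isinstance(expected, type) and issubclass(expected, Exception)


def run_tests(tests, func):
    """Return the comments of all tests that did not pass."""
    not_passed = []
    for argument, expected, comment in tests:
        try:
            actual = func(argument)
            passed = (not is_exception_class(expected)) and actual == expected
        except Exception as e:
            # Pass only if this is the exception the table asked for.
            passed = is_exception_class(expected) and isinstance(e, expected)
        if not passed:
            not_passed.append(comment)
    return not_passed


Tests = [
    [[],        [],         'empty list'],
    [[1],       [1],        'single value'],
    [[1, 3],    [1, 4],     'two values'],
    [[1, 3, 7], [1, 4, 11], 'three values'],
    [[-1, 1],   [-1, 0],    'negative values'],
    [[1, 3.0],  [1, 4.0],   'mixed types'],
    ["string",  ValueError, 'non-list input'],
    [['a'],     ValueError, 'non-numeric value'],
]

for comment in run_tests(Tests, run_sum):
    print('did not pass:', comment)
```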

Design by Contract

Design By Contract is a style of programming in which functions have defined, checkable interfaces. The principal benefits of this style of programming are:
  • Keeping specification and implementation together makes both easier to understand
  • Improves the odds that programmers will keep them in sync
A function is defined by its:
  • pre-conditions - Conditions which must be true at the start of a function or method in order for it to execute correctly
  • post-conditions - Conditions which a function or method guarantees will be true if it terminates normally
  • invariants - Conditions that are true throughout the execution of the function
Both pre- and post-conditions constrain how the function can evolve:
  • Pre-conditions can only ever be relaxed (i.e., take a wider range of input)
  • Post-conditions can only ever be tightened (i.e., produce a narrower range of output)
Tightening pre-conditions, or relaxing post-conditions, would violate the function's contract with its callers.
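
As a rough illustration of this rule, here are two versions of a made-up function, with the conditions written as Python assert statements. The function names and the specific conditions are assumptions chosen for the example. The second version accepts everything the first one did and promises everything the first one did, so existing callers keep working.

```python
def count_positive_v1(values):
    # Pre-condition: the input is a list.
    assert type(values) is list
    result = len([v for v in values if v > 0])
    # Post-condition: the count is never negative.
    assert result >= 0
    return result


def count_positive_v2(values):
    # Relaxed pre-condition: tuples are now accepted as well as lists.
    assert type(values) in (list, tuple)
    result = len([v for v in values if v > 0])
    # Tightened post-condition: the count is also bounded by the input size.
    assert 0 <= result <= len(values)
    return result


print(count_positive_v1([1, -2, 3]))
print(count_positive_v2((1, -2, 3)))
```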

Contract conditions are specified using assertions. An assertion is a statement that something is true (at that particular point in the program). If it is not true, then an AssertionError is raised.

Here is an example of a contract that is verified using assertions.

def find_range(values):
    '''Find the non-empty range of values in the input sequence.'''
    assert (type(values) is list) and (len(values) > 0)
    left = min(values)
    right = max(values)
    assert (left in values) and (right in values) and (left <= right)
    return left, right

Reading this we see that the find_range function has the following contract
  • Pre-condition: input argument is a non-empty list
  • Post-condition: two values from the list such that the first is less than or equal to the second
The post-condition in the example is not as exact as it could be, though: it does not check that left is less than or equal to all other values, or that right is greater than or equal to all other values. Also note the complexity of the code needed to verify the post-condition. One of the reasons Design By Contract is not as successful as it might be is that the code to check a condition exactly is just as likely to have a bug in it as the code it is supposed to verify.
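
For comparison, a version with the exact post-condition might look like the sketch below. It is one possible way to write the check, not the lesson's own code. Note that the extra assertion walks the whole list again, and that it is itself code that could be wrong.

```python
def find_range(values):
    '''Find the range of values in a non-empty list.'''
    # Pre-condition: a non-empty list.
    assert (type(values) is list) and (len(values) > 0)
    left = min(values)
    right = max(values)
    # Post-condition, part 1: both results come from the input.
    assert (left in values) and (right in values)
    # Post-condition, part 2: every value lies between the two results.
    assert all(left <= v <= right for v in values)
    return left, right


print(find_range([3, 1, 2]))
```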

Defensive Programming

Defensive programming is like defensive driving
  • Program as if the rest of the world is out to get you
  • "Fail early, fail often" - The less distance there is between an error and the point where you detect it, the easier it will be to find and fix
This often means the liberal use of assert even if you do not practice Design By Contract. In fact, a good habit to get into is to put in an assertion and comment each time you fix a bug. This protects you in two ways
  1. if you made the error, the right code can't be obvious
  2. lessens the chance that someone will "simplify" the bug back in
Here you can see defensive programming in action. Specifically notice the defenses in place to prevent bugs 172 and 201 from returning.

def can_transmute(element):
    '''Can this element be turned into gold?'''

    # Bug #172: make sure the input is actually an element.
    assert is_valid_element(element)

    # Gold is trivial.
    if element is Gold:
        return True

    # Trans-uranic metals and halogens are impossible.
    if (element.atomic_number > Uranium.atomic_number) or \
       (element in Halogens):
        return False

    # Look for a sequence of steps that leads to gold.
    steps = search_transmutations(element, Gold)
    if steps == []:
        return False
    else:
        # Bug #201: must be at least two elements in sequence.
        assert len(steps) >= 2
        return True
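
The example above relies on helpers that are not shown, so it cannot be run as is. The same habit is sketched below in a self-contained form. The function percent_change and both bug numbers are invented for illustration.

```python
def percent_change(old, new):
    '''Return the percentage change from old to new.'''

    # Bug #1 (hypothetical): callers passed strings read from a file.
    assert isinstance(old, (int, float)) and isinstance(new, (int, float))

    # Bug #2 (hypothetical): a zero baseline caused a division by zero
    # far away from the code that produced the bad value.
    assert old != 0, 'old value must be non-zero'

    return 100.0 * (new - old) / old


print(percent_change(50, 75))
```

Each assertion fails at the point where the bad value enters the function, which keeps the distance between the error and its detection short.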

Summary

The real goal of this lesson is not finding bugs; it is figuring out where they come from so they can be prevented. The only way anything resembling quality can be reached is by designing it in. Without a commitment to quality at a base level, no amount of testing will be able to find all the problems.
