
Testing frameworks

Why use testing frameworks?

Frameworks should simplify our lives:

It should be easy to add simple tests
It should be possible to create complex tests:
- Fixtures
- Setup/Tear down
- Parameterized tests (the same test run over several sets of inputs)
Find all our tests in a complicated code-base
Run all our tests with a quick command
Run only some tests, e.g. test --only "tests about fields"
Report failing tests
Additional goodies, such as code coverage
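
As a rough illustration of the "complex test" features listed above, here is a minimal sketch using pytest, the framework introduced later in this section. The function `mean` and the fixture `measurements` are hypothetical names chosen only for this example.

```python
import pytest


def mean(values):
    """Arithmetic mean of a non-empty sequence (hypothetical function under test)."""
    return sum(values) / len(values)


@pytest.fixture
def measurements():
    # Setup: build the data the test needs.
    data = [1.0, 2.0, 3.0, 4.0]
    yield data
    # Tear down: code after the yield runs once the test has finished.
    data.clear()


def test_mean_of_measurements(measurements):
    # pytest supplies the fixture value because the argument name matches.
    assert mean(measurements) == 2.5


@pytest.mark.parametrize(
    "values, expected",
    [([1], 1), ([1, 3], 2), ([2, 4, 6], 4)],
)
def test_mean(values, expected):
    # The same test body runs once for each (values, expected) pair.
    assert mean(values) == expected
```

Saved in a file such as test_mean.py, this would be collected as four tests: one using the fixture and three from the parameter list.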

Common testing frameworks

Language agnostic: CTest
- Test runner for executables, bash scripts, etc.
- Great for legacy code hardening
C unit-tests:
- all C++ frameworks,
- Check,
- CUnit
C++ unit-tests:
- CppTest,
- Boost::Test,
- google-test,
- Catch (best)
Python unit-tests:
- nose includes test discovery, coverage, etc. (no longer maintained)
- unittest comes with the standard Python library
- pytest, the framework used below; it can also run unittest-style tests
R unit-tests:
- RUnit,
- svUnit (works with SciViews GUI)
Fortran unit-tests:
- funit,
- pfunit (works with MPI)

pytest framework: usage

pytest is a recommended Python testing framework.

We can use its tools directly in the notebook for on-the-fly tests. This, happily, includes the negative-tests example we were looking for a moment ago.

In [1]:

def I_only_accept_positive_numbers(number):
    # Check input
    if number < 0: 
        raise ValueError("Input {} is negative".format(number))

    # Do something

In [2]:

from pytest import raises

In [3]:

with raises(ValueError):
    I_only_accept_positive_numbers(-5)
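
A negative test can also check the error message, not only the exception type. The sketch below repeats the function from the cell above so that it is self-contained, and uses the `match` argument of `raises` (a regular expression searched for in the message) and the `as` form of the context manager.

```python
import pytest


def I_only_accept_positive_numbers(number):
    # Same check as in the cell above
    if number < 0:
        raise ValueError("Input {} is negative".format(number))


# match is a regular expression searched for in the exception message.
with pytest.raises(ValueError, match="is negative"):
    I_only_accept_positive_numbers(-5)

# The context manager also exposes the caught exception for further checks.
with pytest.raises(ValueError) as excinfo:
    I_only_accept_positive_numbers(-5)

assert str(excinfo.value) == "Input -5 is negative"
```

If the call inside the `with` block does not raise the expected exception, `raises` makes the test fail.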

But the real power comes when we write a test file alongside our code files in our homemade packages:

In [4]:

%%bash
mkdir -p saskatchewan
touch saskatchewan/__init__.py

In [5]:

%%writefile saskatchewan/overlap.py
def overlap(field1, field2):
    left1, bottom1, top1, right1 = field1
    left2, bottom2, top2, right2 = field2
    
    overlap_left = max(left1, left2)
    overlap_bottom = max(bottom1, bottom2)
    overlap_right = min(right1, right2)
    overlap_top = min(top1, top2)
    # Here's our wrong code again
    overlap_height = (overlap_top - overlap_bottom)
    overlap_width = (overlap_right - overlap_left)
    
    return overlap_height * overlap_width

Writing saskatchewan/overlap.py

In [6]:

%%writefile saskatchewan/test_overlap.py
from .overlap import overlap

def test_full_overlap():
    assert overlap((1.,1.,4.,4.), (2.,2.,3.,3.)) == 1.0

def test_partial_overlap():
    assert overlap((1,1,4,4), (2,2,3,4.5)) == 2.0
                 
def test_no_overlap():
    assert overlap((1,1,4,4), (4.5,4.5,5,5)) == 0.0

Writing saskatchewan/test_overlap.py

In [7]:

%%bash --no-raise-error
cd saskatchewan
pytest

============================= test session starts ==============================
platform linux -- Python 3.8.18, pytest-7.4.3, pluggy-1.3.0
rootdir: /home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch03tests/saskatchewan
plugins: cov-4.1.0, anyio-3.7.1
collected 3 items

test_overlap.py ..F                                                      [100%]

=================================== FAILURES ===================================
_______________________________ test_no_overlap ________________________________

    def test_no_overlap():
>       assert overlap((1,1,4,4), (4.5,4.5,5,5)) == 0.0
E       assert 0.25 == 0.0
E        +  where 0.25 = overlap((1, 1, 4, 4), (4.5, 4.5, 5, 5))

test_overlap.py:10: AssertionError
=========================== short test summary info ============================
FAILED test_overlap.py::test_no_overlap - assert 0.25 == 0.0
 +  where 0.25 = overlap((1, 1, 4, 4), (4.5, 4.5, 5, 5))
========================= 1 failed, 2 passed in 0.09s ==========================

Note that it reported which test had failed, how many tests ran, and how many failed.

The symbol ..F means there were three tests, of which the third one failed.
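
The failure of test_no_overlap shows what is wrong: for fields that do not touch, the overlap height and width are both -0.5, and their product is a positive 0.25. One possible fix, sketched here as an assumption rather than as the definitive solution, is to clamp each extent at zero. The coordinate order is kept exactly as in saskatchewan/overlap.py.

```python
def overlap(field1, field2):
    # Same coordinate order as saskatchewan/overlap.py
    left1, bottom1, top1, right1 = field1
    left2, bottom2, top2, right2 = field2

    overlap_left = max(left1, left2)
    overlap_bottom = max(bottom1, bottom2)
    overlap_right = min(right1, right2)
    overlap_top = min(top1, top2)

    # Clamp at zero so that disjoint fields contribute no area
    overlap_height = max(0, overlap_top - overlap_bottom)
    overlap_width = max(0, overlap_right - overlap_left)

    return overlap_height * overlap_width


# The three cases from test_overlap.py
assert overlap((1., 1., 4., 4.), (2., 2., 3., 3.)) == 1.0
assert overlap((1, 1, 4, 4), (2, 2, 3, 4.5)) == 2.0
assert overlap((1, 1, 4, 4), (4.5, 4.5, 5, 5)) == 0.0
```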

Pytest will:

automagically find files named test_*.py (or *_test.py)
collect all functions whose names start with test_
run the tests and report the results

Some options:

help: pytest --help
run only tests for a given feature: pytest -k foo # tests with 'foo' in the test name
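
To see the effect of the -k option without leaving Python, the following sketch writes a throwaway test file and runs pytest on it in a subprocess. The file name test_demo.py and the three test names are hypothetical, and the sketch assumes pytest is installed for the interpreter running the code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Three trivial tests; two of them have "field" in their name.
TEST_SOURCE = '''
def test_field_area():
    assert 2 * 3 == 6

def test_field_perimeter():
    assert 2 * (2 + 3) == 10

def test_unrelated():
    assert "a".upper() == "A"
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "test_demo.py").write_text(TEST_SOURCE)
    # Equivalent to running `pytest -q -k field` on that directory.
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", "-k", "field", tmp],
        capture_output=True,
        text=True,
    )

# The summary should report two tests run and one deselected.
print(result.stdout)
```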

Testing with floating points

Floating points are not reals

Floating-point numbers are inexact representations of real numbers:

1.0 == 0.99999999999999999 evaluates to True, because both literals are stored as exactly the same bit pattern.

This can lead to numerical errors during calculations: $1000 (a - b) \neq 1000a - 1000b$

In [8]:

1000.0 * 1.0 - 1000.0 * 0.9999999999999998

Out[8]:

2.2737367544323206e-13

In [9]:

1000.0 * (1.0 - 0.9999999999999998)

Out[9]:

2.220446049250313e-13

Both results are wrong: 2e-13 is the correct answer.
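
One way to check the exact answer is to redo the calculation in rational arithmetic. This is a small sketch using the standard library's `fractions` module; building the fractions from strings keeps the decimal values exactly as typed.

```python
from fractions import Fraction

a = Fraction("1.0")
b = Fraction("0.9999999999999998")

# Exact rational arithmetic: no rounding at any step
exact = 1000 * (a - b)
print(exact)         # 1/5000000000000
print(float(exact))  # 2e-13

# The float literal is stored as the nearest representable binary value,
# which is not exactly the decimal number that was typed.
print(Fraction(0.9999999999999998) == b)  # False
```

The last line suggests where the discrepancy comes from: the error is already present in the stored input, before any arithmetic is done.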

The size of the error will depend on the magnitude of the floating-point numbers involved:

In [10]:

1000.0 * 1e5 - 1000.0 * 0.9999999999999998e5

Out[10]:

1.4901161193847656e-08

The result should be 2e-8.
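
The gap between neighbouring floating-point numbers grows with their magnitude, which is why the error grows too. A short sketch with `math.ulp` makes this visible; it assumes Python 3.9 or later, where that function was added (`numpy.spacing` reports the same gaps on older versions).

```python
import math

# Gap between a float and the next larger representable float
print(math.ulp(1.0))  # 2.220446049250313e-16
print(math.ulp(1e8))  # 1.4901161193847656e-08

# The two products above are neighbouring floats near 1e8,
# so their difference is exactly one such gap.
error = 1000.0 * 1e5 - 1000.0 * 0.9999999999999998e5
print(error == math.ulp(1e8))
```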

Comparing floating points

Use pytest's approx, which by default applies a relative tolerance of $10^{-6}$ (and an absolute tolerance of $10^{-12}$):

In [11]:

from pytest import approx
assert 0.7 == approx(0.7 + 1e-7)

Or be more explicit:

In [12]:

magnitude = 0.7
# Pass explicit relative and absolute tolerances
assert magnitude == approx(0.701, rel=0.1, abs=0.1)

Choosing tolerances is a big area of debate.
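
Part of the difficulty is that a relative tolerance shrinks to nothing when the expected value is zero, so only the absolute tolerance matters there. The sketch below illustrates this with approx and its default tolerances; the specific numbers are chosen only for illustration.

```python
from pytest import approx

# Near zero the relative tolerance vanishes, so only the
# absolute tolerance (1e-12 by default) applies.
assert 1e-13 == approx(0.0)
assert 1e-9 != approx(0.0)
assert 1e-9 == approx(0.0, abs=1e-8)

# For large values the relative tolerance dominates:
# here it allows a difference of 1e-6 * 1e6 = 1.0.
assert 1_000_000.5 == approx(1_000_000.0)
assert 1_000_002.0 != approx(1_000_000.0)
```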

Comparing vectors of floating points

Numerical vectors are best represented using numpy.

In [13]:

from numpy import array, pi

vector_of_reals = array([0.1, 0.2, 0.3, 0.4]) * pi

Numpy ships with a number of assertions (in numpy.testing) to make comparison easy:

In [14]:

from numpy import array, pi
from numpy.testing import assert_allclose
expected = array([0.1, 0.2, 0.3, 0.4, 1e-12]) * pi
actual = array([0.1, 0.2, 0.3, 0.4, 2e-12]) * pi
actual[:-1] += 1e-6

assert_allclose(actual, expected, rtol=1e-5, atol=1e-8)

It checks, element by element, that abs(actual - expected) is no larger than atol + rtol * abs(expected).
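
The same check can be written out by hand, which also shows why the absolute tolerance matters for the tiny last element. This is a sketch of the criterion only, not of NumPy's actual implementation.

```python
import numpy as np
from numpy.testing import assert_allclose

expected = np.array([0.1, 0.2, 0.3, 0.4, 1e-12]) * np.pi
actual = np.array([0.1, 0.2, 0.3, 0.4, 2e-12]) * np.pi
actual[:-1] += 1e-6

rtol, atol = 1e-5, 1e-8
difference = np.abs(actual - expected)
allowed = atol + rtol * np.abs(expected)
print(difference <= allowed)  # element-wise result of the check

# Without an absolute tolerance the last element fails:
# it differs from its expected value by a factor of two.
failed_without_atol = False
try:
    assert_allclose(actual, expected, rtol=rtol, atol=0)
except AssertionError:
    failed_without_atol = True
print(failed_without_atol)
```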