Frameworks should simplify our lives:
- Should be easy to add simple tests
- Should be possible to create complex tests:
  - Fixtures
  - Setup/Tear down
  - Parameterized tests (same test, mostly the same input)
- Find all our tests in a complicated code-base
- Run all our tests with a quick command
- Run only some tests, e.g. test --only "tests about fields"
- Report failing tests
- Additional goodies, such as code coverage
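To make the fixture and parameterized-test items concrete, here is a minimal sketch using pytest, one of the Python frameworks. The function area, the fixture square, and the test names are hypothetical and exist only for this illustration.

```python
import pytest


def area(width, height):
    """Hypothetical function under test."""
    return width * height


@pytest.fixture
def square():
    # Setup: build whatever the test needs.
    sides = (2.0, 2.0)
    yield sides
    # Tear down: code placed after the yield runs when the test is done.


def test_square_area(square):
    # pytest supplies the fixture value because the argument is named after it.
    width, height = square
    assert area(width, height) == 4.0


@pytest.mark.parametrize(
    "width, height, expected",
    [(1, 1, 1), (2, 3, 6), (0, 5, 0)],
)
def test_area(width, height, expected):
    # The same body runs once for each (width, height, expected) triple.
    assert area(width, height) == expected
```

Saved in a file named test_*.py, this would be reported as four tests: one using the fixture and three generated by the parameter list.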
Common testing frameworks
Language agnostic: CTest
- Test runner for executables, bash scripts, etc.
- Great for legacy code hardening
C unit-tests:
C++ unit-tests:
- CppTest
- Boost::Test
- google-test
- Catch (best)
Python unit-tests:
R unit-tests:
Fortran unit-tests:
We can use pytest's tools directly in the notebook for on-the-fly tests. This, happily, includes the negative-tests example we were looking for a moment ago.
def I_only_accept_positive_numbers(number):
    # Check input
    if number < 0:
        raise ValueError("Input {} is negative".format(number))

    # Do something

from pytest import raises

with raises(ValueError):
    I_only_accept_positive_numbers(-5)
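The raises helper can also check the error message through its match argument, which is searched as a regular expression in the string form of the exception. A minimal sketch, repeating the function so the snippet runs on its own:

```python
from pytest import raises


def I_only_accept_positive_numbers(number):
    # Same check as above, repeated so the snippet is self-contained.
    if number < 0:
        raise ValueError("Input {} is negative".format(number))


# Passes: a ValueError is raised and its message contains "is negative".
with raises(ValueError, match="is negative"):
    I_only_accept_positive_numbers(-5)

# A valid input must not raise at all.
I_only_accept_positive_numbers(5)
```

Checking the message as well as the exception type guards against the test passing because of an unrelated ValueError raised elsewhere in the function.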
But the real power comes when we write a test file alongside the code files in our homemade packages:
%%bash
mkdir -p saskatchewan
touch saskatchewan/__init__.py
%%writefile saskatchewan/overlap.py
def overlap(field1, field2):
    left1, bottom1, top1, right1 = field1
    left2, bottom2, top2, right2 = field2

    overlap_left = max(left1, left2)
    overlap_bottom = max(bottom1, bottom2)
    overlap_right = min(right1, right2)
    overlap_top = min(top1, top2)
    # Here's our wrong code again
    overlap_height = (overlap_top - overlap_bottom)
    overlap_width = (overlap_right - overlap_left)

    return overlap_height * overlap_width
%%writefile saskatchewan/test_overlap.py
from .overlap import overlap

def test_full_overlap():
    assert overlap((1.,1.,4.,4.), (2.,2.,3.,3.)) == 1.0

def test_partial_overlap():
    assert overlap((1,1,4,4), (2,2,3,4.5)) == 2.0

def test_no_overlap():
    assert overlap((1,1,4,4), (4.5,4.5,5,5)) == 0.0
%%bash --no-raise-error
cd saskatchewan
pytest
Note that it reported which test had failed, how many tests ran, and how many failed.
The symbol ..F means there were three tests, of which the third one failed.
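The failing test is test_no_overlap: for disjoint fields the overlap height and width both come out negative, so their product is a positive area. One possible repair, sketched here on the assumption that clamping each extent at zero is the intended behaviour, is:

```python
def overlap(field1, field2):
    # Fields are (left, bottom, top, right), as in saskatchewan/overlap.py.
    left1, bottom1, top1, right1 = field1
    left2, bottom2, top2, right2 = field2

    overlap_left = max(left1, left2)
    overlap_bottom = max(bottom1, bottom2)
    overlap_right = min(right1, right2)
    overlap_top = min(top1, top2)

    # Clamp at zero so disjoint fields give zero height or width
    # instead of two negative numbers whose product is positive.
    overlap_height = max(0, overlap_top - overlap_bottom)
    overlap_width = max(0, overlap_right - overlap_left)

    return overlap_height * overlap_width


# The three cases from test_overlap.py now all hold.
assert overlap((1., 1., 4., 4.), (2., 2., 3., 3.)) == 1.0
assert overlap((1, 1, 4, 4), (2, 2, 3, 4.5)) == 2.0
assert overlap((1, 1, 4, 4), (4.5, 4.5, 5, 5)) == 0.0
```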
Pytest will:
- automagically find files named test_*.py
- collect all functions called test_*
- run the tests and report the results
Some options:
- help: pytest --help
- run only tests for a given feature: pytest -k foo  # tests with 'foo' in the test name
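The -k option selects tests by name. The sketch below assumes pytest is installed for the current interpreter. It writes three placeholder tests to a temporary file (the file name test_fields.py and the trivial test bodies are made up for this example) and runs only the one whose name contains partial.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Three hypothetical tests; only one has "partial" in its name.
TEST_SOURCE = '''
def test_full_overlap():
    assert True

def test_partial_overlap():
    assert True

def test_no_overlap():
    assert True
'''

with tempfile.TemporaryDirectory() as tmp:
    test_file = Path(tmp) / "test_fields.py"
    test_file.write_text(TEST_SOURCE)
    # Equivalent to typing: pytest -q -k partial test_fields.py
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", "-k", "partial", str(test_file)],
        capture_output=True,
        text=True,
    )

# The summary line reports one test run and the other two deselected.
print(result.stdout)
```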
Real numbers are stored in floating point with finite precision. This can lead to numerical errors during calculations: $1000 (a - b) \neq 1000a - 1000b$
1000.0 * 1.0 - 1000.0 * 0.9999999999999998
1000.0 * (1.0 - 0.9999999999999998)
Both results are wrong: 2e-13 is the correct answer.
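The correct answer can be checked in exact rational arithmetic. This sketch uses the standard-library fractions module; the decimal strings stand for the real numbers as written, not for their floating-point approximations, and the variable names are chosen for this example.

```python
from fractions import Fraction

a = Fraction("1.0")
b = Fraction("0.9999999999999998")

# Exact rational result: 1000 * 2e-16 = 2e-13.
exact = 1000 * (a - b)

# The two floating-point evaluations from above.
expanded = 1000.0 * 1.0 - 1000.0 * 0.9999999999999998
factored = 1000.0 * (1.0 - 0.9999999999999998)

print(float(exact))
print(expanded - float(exact))  # error of the expanded form
print(factored - float(exact))  # error of the factored form
```

Neither error is zero, and the two forms do not agree with each other either.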
The size of the error depends on the magnitude of the floating-point numbers involved:
1000.0 * 1e5 - 1000.0 * 0.9999999999999998e5
The result should be 2e-8.
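The growth of the error with magnitude can be made concrete with math.ulp (Python 3.9 or later), which returns the gap between a float and the next representable one. A minimal sketch, with variable names chosen for this example:

```python
import math

small = 1000.0 * 1.0 - 1000.0 * 0.9999999999999998
large = 1000.0 * 1e5 - 1000.0 * 0.9999999999999998e5

# Spacing between neighbouring floats at the size of each intermediate value.
print(math.ulp(1000.0))  # spacing near 1e3
print(math.ulp(1e8))     # spacing near 1e8, much coarser

# Each result is off from the exact answer by less than one such spacing.
print(abs(small - 2e-13) < math.ulp(1000.0))
print(abs(large - 2e-8) < math.ulp(1e8))
```

The subtraction cannot resolve differences finer than the spacing of the numbers being subtracted, so larger operands give a larger absolute error.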
Comparing floating points
Use pytest's approx, which defaults to a relative tolerance of $10^{-6}$:
from pytest import approx
assert 0.7 == approx(0.7 + 1e-7)
Or be more explicit:
# Explicit tolerances: the comparison passes if either one is met
assert 0.7 == approx(0.701, rel=0.1, abs=0.1)
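A relative tolerance scales with the expected value, so it shrinks to nothing when the expected value is zero. That is where the absolute tolerance, 1e-12 by default in approx, takes over. A small sketch of that behaviour:

```python
from pytest import approx

# Relative tolerance scales with the expected value:
# a difference of 0.5 is within 1e-6 of one million.
assert 1_000_000.0 == approx(1_000_000.5)

# Around zero only the absolute tolerance (1e-12 by default) applies.
assert 1e-13 == approx(0.0)
assert not (1e-9 == approx(0.0))

# A looser absolute tolerance accepts the same value.
assert 1e-9 == approx(0.0, abs=1e-8)
```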
Choosing tolerances is a big area of debate.
from numpy import array, pi
vector_of_reals = array([0.1, 0.2, 0.3, 0.4]) * pi
NumPy ships with a number of assertions (in numpy.testing) to make comparison easy:
from numpy import array, pi
from numpy.testing import assert_allclose
expected = array([0.1, 0.2, 0.3, 0.4, 1e-12]) * pi
actual = array([0.1, 0.2, 0.3, 0.4, 2e-12]) * pi
actual[:-1] += 1e-6
assert_allclose(actual, expected, rtol=1e-5, atol=1e-8)
It compares the difference between actual and expected to atol + rtol * abs(expected).
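That rule can be written out element by element. The sketch below reuses the arrays from the example and also shows why atol matters for the last entry, which is close to zero. The name within_tolerance is chosen for this illustration.

```python
import numpy as np
from numpy.testing import assert_allclose

expected = np.array([0.1, 0.2, 0.3, 0.4, 1e-12]) * np.pi
actual = np.array([0.1, 0.2, 0.3, 0.4, 2e-12]) * np.pi
actual[:-1] += 1e-6
rtol, atol = 1e-5, 1e-8

# Element-wise version of the check described above.
within_tolerance = np.abs(actual - expected) <= atol + rtol * np.abs(expected)
print(within_tolerance)

# With atol=0 the last element is judged on relative error alone and fails,
# because it differs from the expected value by a factor of two.
try:
    assert_allclose(actual, expected, rtol=rtol, atol=0)
except AssertionError:
    print("last element is outside the relative tolerance")
```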