MPHY0021: Research Software Engineering With Python

Packaging

Once we've made a working program, we'd like to be able to share it with others.

A good cross-platform build tool is the most important thing: you can always have collaborators build from source.

Distribution tools

Distribution tools allow one to obtain a working copy of someone else's package.

  • Language-specific tools:

    • python: PyPI,
    • ruby: Ruby Gems,
    • perl: CPAN,
    • R: CRAN
  • Platform-specific packagers, e.g.:

    • brew for MacOS,
    • apt/dnf/pacman for Linux or
    • choco for Windows.

Laying out a project

When planning to package a project for distribution, defining a suitable project layout is essential. A typical layout might look like this:

repository_name
|-- module_name
|   |-- __init__.py
|   |-- python_file.py
|   |-- another_python_file.py
|   `-- test
|       |-- fixtures
|       |   `-- fixture_file.yaml
|       |-- __init__.py
|       `-- test_python_file.py
|-- LICENSE.md
|-- CITATION.md
|-- README.md
`-- setup.py

To achieve this for our greetings.py file from the previous session, we can use the commands shown below. We can start by making our directory structure. You can create many nested directories at once using the -p switch on mkdir.

In [1]:
%%bash
mkdir -p greetings_repo/greetings/test/fixtures
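
The same nested layout can also be created from Python, which may be handy on systems without a Unix shell. This is a minimal sketch: make_layout is a hypothetical helper name, and it works inside a temporary directory so that it does not interfere with the real greetings_repo.

```python
import tempfile
from pathlib import Path


def make_layout(root):
    """Create the nested package directories under `root`; return the deepest one."""
    fixtures = Path(root) / "greetings_repo" / "greetings" / "test" / "fixtures"
    # parents=True creates the intermediate directories, like `mkdir -p`;
    # exist_ok=True makes the call safe to repeat.
    fixtures.mkdir(parents=True, exist_ok=True)
    return fixtures


# Work in a throwaway directory so the sketch does not touch the real project.
with tempfile.TemporaryDirectory() as scratch:
    created = make_layout(scratch)
    print(created.relative_to(scratch))
```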

For this notebook, since we are going to be modifying the files bit by bit, we are going to use the autoreload ipython magic so that we don't need to restart the kernel.

In [2]:
%load_ext autoreload
%autoreload 2

Using setuptools

To make python code into a package, we need to write a setup.py file. For now we are adding only the name of the package and its version number.

In [3]:
%%writefile greetings_repo/setup.py

from setuptools import setup, find_packages

setup(
    name="Greetings",
    version="0.1.0",
    packages=find_packages(),
)
Writing greetings_repo/setup.py

We can now install this "package" with pip:

In [4]:
%%bash
cd greetings_repo
pip install .
Processing /home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: Greetings
  Building wheel for Greetings (setup.py): started
  Building wheel for Greetings (setup.py): finished with status 'done'
  Created wheel for Greetings: filename=Greetings-0.1.0-py3-none-any.whl size=996 sha256=e1a8e4b857173449e7da264c537e9e35876fcf0c4133944a3301fec1e8cf6406
  Stored in directory: /tmp/pip-ephem-wheel-cache-d171kt1a/wheels/72/c9/16/35fe5e911cb17283cd810e889cbfc87301d515c880058a12cf
Successfully built Greetings
Installing collected packages: Greetings
  Attempting uninstall: Greetings
    Found existing installation: Greetings 0.1.0
    Uninstalling Greetings-0.1.0:
      Successfully uninstalled Greetings-0.1.0
Successfully installed Greetings-0.1.0

The package is then available to use everywhere on the system. But so far this package doesn't contain anything and there's nothing we can run! We need to add some files first.

Historically, a regular package needed an __init__.py file in each subdirectory that we want to be able to import. Since Python 3.3 and the introduction of implicit namespace packages, this is no longer strictly required. However, if you want to use relative imports and pytest, then you still need these files.

The __init__.py files can contain any initialisation code you want to run when the (sub)module is imported.
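
As an illustration of initialisation code, an __init__.py could re-export a function so that users can write package.greet instead of package.greeter.greet. The sketch below is not part of the course package: it builds a throwaway package called greetings_demo (a hypothetical name, chosen to avoid clashing with greetings) in a temporary directory and imports it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A cut-down greeter module, just enough for the demonstration.
GREETER_SOURCE = '''
def greet(personal, family):
    return f"Hey, {personal} {family}."
'''

# Hypothetical __init__.py contents: re-export greet at package level.
INIT_SOURCE = "from .greeter import greet\n"


def build_demo_package(root, name="greetings_demo"):
    """Write a two-file package under `root` and return its import name."""
    package_dir = Path(root) / name
    package_dir.mkdir()
    (package_dir / "greeter.py").write_text(GREETER_SOURCE)
    (package_dir / "__init__.py").write_text(INIT_SOURCE)
    return name


with tempfile.TemporaryDirectory() as scratch:
    package_name = build_demo_package(scratch)
    sys.path.insert(0, scratch)          # make the throwaway package importable
    try:
        demo = importlib.import_module(package_name)
        # greet is reachable directly on the package because __init__.py ran.
        message = demo.greet("Terry", "Gilliam")
    finally:
        sys.path.remove(scratch)

print(message)  # prints: Hey, Terry Gilliam.
```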

For this example, and because we are using relative imports in the tests, we are creating the needed __init__.py files.

In [5]:
%%bash

touch greetings_repo/greetings/__init__.py

Now we can copy the greet function from the previous section into the greeter.py file.

In [6]:
%%writefile greetings_repo/greetings/greeter.py

def greet(personal, family, title="", polite=False):
    greeting = "How do you do, " if polite else "Hey, "
    if title:
        greeting += f"{title} "

    greeting += f"{personal} {family}."
    return greeting
Writing greetings_repo/greetings/greeter.py

For the changes to take effect, we need to reinstall the library:

In [7]:
%%bash
cd greetings_repo
pip install .
Processing /home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: Greetings
  Building wheel for Greetings (setup.py): started
  Building wheel for Greetings (setup.py): finished with status 'done'
  Created wheel for Greetings: filename=Greetings-0.1.0-py3-none-any.whl size=1465 sha256=b14e84077d78fb2b5845b1d846658ce3877f45289b8cdcf80b063d3ad1423c15
  Stored in directory: /tmp/pip-ephem-wheel-cache-zjr_1gc9/wheels/72/c9/16/35fe5e911cb17283cd810e889cbfc87301d515c880058a12cf
Successfully built Greetings
Installing collected packages: Greetings
  Attempting uninstall: Greetings
    Found existing installation: Greetings 0.1.0
    Uninstalling Greetings-0.1.0:
      Successfully uninstalled Greetings-0.1.0
Successfully installed Greetings-0.1.0

And now we are able to import it and use it:

In [8]:
from greetings.greeter import greet
greet("Terry","Gilliam")
Out[8]:
'Hey, Terry Gilliam.'

Convert the script to a module

Of course, there's more to do when taking code from a quick script and turning it into a proper module:

We need to add docstrings to our functions, so people can know how to use them.

In [9]:
%%writefile greetings_repo/greetings/greeter.py

def greet(personal, family, title="", polite=False):
    """ Generate a greeting string for a person.
    Parameters
    ----------
    personal: str
        A given name, such as Will or Jean-Luc
    family: str
        A family name, such as Riker or Picard
    title: str
        An optional title, such as Captain or Reverend
    polite: bool
        True for a formal greeting, False for informal.
    Returns
    -------
    string
        An appropriate greeting
    Examples
    --------
    >>> from greetings.greeter import greet
    >>> greet("Terry", "Jones")
    'Hey, Terry Jones.
    """

    greeting = "How do you do, " if polite else "Hey, "
    if title:
        greeting += f"{title} "

    greeting += f"{personal} {family}."
    return greeting
Overwriting greetings_repo/greetings/greeter.py

We can see the documentation using help.

In [10]:
help(greet)
Help on function greet in module greetings.greeter:

greet(personal, family, title='', polite=False)

The documentation string explains how to use the function; don't worry about this for now, we'll consider it in the next section (notebook version).

Write an executable script

We can create an executable script, command.py, that uses our greeting functionality and the process function we created in the previous section.

Note how we are importing greet using relative imports, where .greeter means to look for a greeter module within the same directory.

In [11]:
%%writefile greetings_repo/greetings/command.py

from argparse import ArgumentParser

from .greeter import greet


def process():
    parser = ArgumentParser(description="Generate appropriate greetings")

    parser.add_argument('--title', '-t')
    parser.add_argument('--polite', '-p', action="store_true")
    parser.add_argument('personal')
    parser.add_argument('family')

    arguments = parser.parse_args()

    print(greet(arguments.personal, arguments.family,
                arguments.title, arguments.polite))


if __name__ == "__main__":
    process()
Writing greetings_repo/greetings/command.py
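
One possible variation, sketched here as an assumption rather than as the course's design, is to build the parser in its own function so that the argument handling can be exercised without touching sys.argv. build_parser is a hypothetical name; the arguments are the same as in command.py above.

```python
from argparse import ArgumentParser


def build_parser():
    """Return a parser with the same arguments as command.py."""
    parser = ArgumentParser(description="Generate appropriate greetings")
    parser.add_argument('--title', '-t')
    parser.add_argument('--polite', '-p', action="store_true")
    parser.add_argument('personal')
    parser.add_argument('family')
    return parser


# Passing an explicit list to parse_args bypasses sys.argv,
# which makes the parsing easy to check from a test or a notebook.
arguments = build_parser().parse_args(["--polite", "Terry", "Gilliam", "-t", "Sir"])
print(arguments.personal, arguments.family, arguments.title, arguments.polite)
# prints: Terry Gilliam Sir True
```

When --title is not given, arguments.title is None, which greet treats the same as an empty title.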

Specify entry point

An entry point allows us to create a command that executes part of our library. In this case, when we execute greet on the terminal, we will be calling the process function in greetings/command.py.

In [12]:
%%writefile greetings_repo/setup.py

from setuptools import setup, find_packages

setup(
    name="Greetings",
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            'greet = greetings.command:process'
        ]})
Overwriting greetings_repo/setup.py
In [13]:
%%bash
cd greetings_repo
pip install -e .
Obtaining file:///home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Installing collected packages: Greetings
  Attempting uninstall: Greetings
    Found existing installation: Greetings 0.1.0
    Uninstalling Greetings-0.1.0:
      Successfully uninstalled Greetings-0.1.0
  Running setup.py develop for Greetings
Successfully installed Greetings-0.1.0

The script is now available from the command line, so the following commands can be run:

In [14]:
%%bash
greet --help
usage: greet [-h] [--title TITLE] [--polite] personal family

Generate appropriate greetings

positional arguments:
  personal
  family

optional arguments:
  -h, --help            show this help message and exit
  --title TITLE, -t TITLE
  --polite, -p
In [15]:
%%bash
greet Terry Gilliam
greet --polite Terry Gilliam
greet Terry Gilliam --title Cartoonist
Hey, Terry Gilliam.
How do you do, Terry Gilliam.
Hey, Cartoonist Terry Gilliam.
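
To get a feel for what the string 'greet = greetings.command:process' means, the sketch below resolves a string of the same shape by hand. It is a simplified picture: a real install generates a small wrapper script rather than calling a helper like this. load_entry_point is a hypothetical name, and json:dumps stands in for greetings.command:process so that the example runs without the package installed.

```python
import importlib


def load_entry_point(spec):
    """Resolve a 'name = package.module:function' string to (name, callable)."""
    # Split "name = target" and strip the whitespace around each part.
    name, _, target = (part.strip() for part in spec.partition("="))
    # Split "package.module:function" into the module path and attribute name.
    module_name, _, attribute = target.partition(":")
    module = importlib.import_module(module_name)
    return name, getattr(module, attribute)


# 'to-json' is a made-up command name used only for this illustration.
name, function = load_entry_point("to-json = json:dumps")
print(name, function({"polite": True}))  # prints: to-json {"polite": true}
```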

Specify dependencies

Let's give some life to our output using ASCII art.

In [16]:
%%writefile greetings_repo/greetings/command.py

from argparse import ArgumentParser

from art import art

from .greeter import greet


def process():
    parser = ArgumentParser(description="Generate appropriate greetings")

    parser.add_argument('--title', '-t')
    parser.add_argument('--polite', '-p', action="store_true")
    parser.add_argument('personal')
    parser.add_argument('family')

    arguments = parser.parse_args()

    message = greet(arguments.personal, arguments.family,
                    arguments.title, arguments.polite)
    print(art("cute face"), message)

if __name__ == "__main__":
    process()
Overwriting greetings_repo/greetings/command.py

We use the setup.py file to specify the packages we depend on using install_requires:

In [17]:
%%writefile greetings_repo/setup.py

from setuptools import setup, find_packages

setup(
    name="Greetings",
    version="0.1.0",
    packages=find_packages(),
    install_requires=['art', 'pyyaml'],
    entry_points={
        'console_scripts': [
            'greet = greetings.command:process'
        ]}    
    )
Overwriting greetings_repo/setup.py

When installing the package now, pip will also install the dependencies automatically.

In [18]:
%%bash
cd greetings_repo
pip install -e .
Obtaining file:///home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting art
  Downloading art-5.3-py2.py3-none-any.whl (574 kB)
Requirement already satisfied: pyyaml in /opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages (from Greetings==0.1.0) (6.0)
Installing collected packages: art, Greetings
  Attempting uninstall: Greetings
    Found existing installation: Greetings 0.1.0
    Uninstalling Greetings-0.1.0:
      Successfully uninstalled Greetings-0.1.0
  Running setup.py develop for Greetings
Successfully installed Greetings-0.1.0 art-5.3
In [19]:
%%bash
greet Terry Gilliam
(。◕‿◕。)  Hey, Terry Gilliam.
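
To check from Python whether the distributions named in install_requires are present in the current environment, the standard library's importlib.metadata (Python 3.8 and later) can be queried. This is a minimal sketch; installed_version is a hypothetical helper name, and what it prints depends on your environment.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# The names are those listed in install_requires above.
for requirement in ("art", "pyyaml"):
    print(requirement, installed_version(requirement))
```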

Installing from GitHub

We could now upload "greeter" to PyPI, so that anyone could pip install it.

However, when using git, we don't even need to do that: we can install directly from any git URL:

pip install git+https://github.com/ucl-rits/greeter
$ greet Lancelot the-Brave --title Sir
Hey, Sir Lancelot the-Brave.


There are a few additional text files that are important to add to a package: a readme file, a licence file and a citation file.

Write a readme file

The readme file might look like this:

In [20]:
%%writefile greetings_repo/README.md

# Greetings!

This is a very simple example package used as part of the UCL
[Research Software Engineering with Python](development.rc.ucl.ac.uk/training/engineering) course.

## Installation

```bash
pip install git+https://github.com/ucl-rits/greeter
```

## Usage
    
Invoke the tool with `greet <FirstName> <SecondName>` or use it in your own library:

```python
from greetings import greeter

greeter.greet(user.name, user.lastname)
```
Writing greetings_repo/README.md

Write a license file

We will discuss licensing in more detail in a later section. For now let's assume we want to release this package into the public domain:

In [21]:
%%writefile greetings_repo/LICENSE.md

(C) University College London 2014

This "greetings" example package is granted into the public domain.
Writing greetings_repo/LICENSE.md

Write a citation file

A citation file will inform our users how we would like to be cited when referring to our software:

In [22]:
%%writefile greetings_repo/CITATION.md

If you wish to refer to this course, please cite the URL
http://github-pages.ucl.ac.uk/rsd-engineeringcourse/

Portions of the material are taken from [Software Carpentry](http://software-carpentry.org/)
Writing greetings_repo/CITATION.md

You may well want to formalise this using the codemeta.json standard or the citation file format. These don't have wide adoption yet, but we recommend them.

Define packages and executables

We need to create __init__.py files for the source and the tests.

touch greetings_repo/greetings/test/__init__.py
touch greetings_repo/greetings/__init__.py

Write some unit tests

We can now write some tests for our library.

Remember that we need to create the empty __init__.py files so that pytest can follow the relative imports.

In [23]:
%%bash
touch greetings_repo/greetings/test/__init__.py

Separating the command-line script from the library module is what makes it possible to test greet directly.

In [24]:
%%writefile greetings_repo/greetings/test/test_greeter.py

import os

import yaml

from ..greeter import greet

def test_greet():
    with open(os.path.join(os.path.dirname(__file__),
                           'fixtures',
                           'samples.yaml')) as fixtures_file:
        fixtures = yaml.safe_load(fixtures_file)
        for fixture in fixtures:
            answer = fixture.pop('answer')
            assert greet(**fixture) == answer
Writing greetings_repo/greetings/test/test_greeter.py

Add a fixtures file:

In [25]:
%%writefile greetings_repo/greetings/test/fixtures/samples.yaml

- personal: Eric
  family: Idle
  answer: "Hey, Eric Idle."
- personal: Graham
  family: Chapman
  polite: True
  answer: "How do you do, Graahm Chapman."
- personal: Michael
  family: Palin
  title: CBE
  answer: "Hey, CBE Mike Palin."  
Writing greetings_repo/greetings/test/fixtures/samples.yaml

We can now run pytest

In [26]:
%%bash --no-raise-error

cd greetings_repo
pytest
============================= test session starts ==============================
platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo
plugins: cov-3.0.0
collected 1 item

greetings/test/test_greeter.py F                                         [100%]

=================================== FAILURES ===================================
__________________________________ test_greet __________________________________

    def test_greet():
        with open(os.path.join(os.path.dirname(__file__),
                               'fixtures',
                               'samples.yaml')) as fixtures_file:
            fixtures = yaml.safe_load(fixtures_file)
            for fixture in fixtures:
                answer = fixture.pop('answer')
>               assert greet(**fixture) == answer
E               AssertionError: assert 'How do you d...aham Chapman.' == 'How do you d...aahm Chapman.'
E                 - How do you do, Graahm Chapman.
E                 ?                    -
E                 + How do you do, Graham Chapman.
E                 ?                   +

greetings/test/test_greeter.py:15: AssertionError
=========================== short test summary info ============================
FAILED greetings/test/test_greeter.py::test_greet - AssertionError: assert 'H...
============================== 1 failed in 0.07s ===============================

However, this hasn't told us that the third test is wrong too, because the loop stops at the first failing assertion! A better approach is to parametrize the test file greetings_repo/greetings/test/test_greeter.py as follows:

In [27]:
%%writefile greetings_repo/greetings/test/test_greeter.py

import os

import pytest
import yaml

from ..greeter import greet

def read_fixture():
    with open(os.path.join(os.path.dirname(__file__),
                           'fixtures',
                           'samples.yaml')) as fixtures_file:
        fixtures = yaml.safe_load(fixtures_file)
    return fixtures

@pytest.mark.parametrize("fixture", read_fixture())
def test_greeter(fixture):
    answer = fixture.pop('answer')
    assert greet(**fixture) == answer
Overwriting greetings_repo/greetings/test/test_greeter.py

Now when we run pytest, each element in our fixture is reported as a separate test, so we can see every case that fails.

In [28]:
%%bash --no-raise-error

cd greetings_repo
pytest
============================= test session starts ==============================
platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo
plugins: cov-3.0.0
collected 3 items

greetings/test/test_greeter.py .FF                                       [100%]

=================================== FAILURES ===================================
____________________________ test_greeter[fixture1] ____________________________

fixture = {'family': 'Chapman', 'personal': 'Graham', 'polite': True}

    @pytest.mark.parametrize("fixture", read_fixture())
    def test_greeter(fixture):
        answer = fixture.pop('answer')
>       assert greet(**fixture) == answer
E       AssertionError: assert 'How do you d...aham Chapman.' == 'How do you d...aahm Chapman.'
E         - How do you do, Graahm Chapman.
E         ?                    -
E         + How do you do, Graham Chapman.
E         ?                   +

greetings/test/test_greeter.py:19: AssertionError
____________________________ test_greeter[fixture2] ____________________________

fixture = {'family': 'Palin', 'personal': 'Michael', 'title': 'CBE'}

    @pytest.mark.parametrize("fixture", read_fixture())
    def test_greeter(fixture):
        answer = fixture.pop('answer')
>       assert greet(**fixture) == answer
E       AssertionError: assert 'Hey, CBE Michael Palin.' == 'Hey, CBE Mike Palin.'
E         - Hey, CBE Mike Palin.
E         ?            ^
E         + Hey, CBE Michael Palin.
E         ?            ^^^ +

greetings/test/test_greeter.py:19: AssertionError
=========================== short test summary info ============================
FAILED greetings/test/test_greeter.py::test_greeter[fixture1] - AssertionErro...
FAILED greetings/test/test_greeter.py::test_greeter[fixture2] - AssertionErro...
========================= 2 failed, 1 passed in 0.08s ==========================
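
To see what parametrisation buys us, here is a plain-Python sketch of the same idea: check every case and collect all the mismatches, instead of stopping at the first one. It is an illustration only, not how pytest works internally. collect_mismatches is a hypothetical helper, and the cases from samples.yaml (deliberate mistakes included) are inlined so that neither pytest nor PyYAML is needed.

```python
def greet(personal, family, title="", polite=False):
    # Same logic as greetings/greeter.py, repeated so the sketch is self-contained.
    greeting = "How do you do, " if polite else "Hey, "
    if title:
        greeting += f"{title} "
    greeting += f"{personal} {family}."
    return greeting


# The three cases from samples.yaml, inlined for this illustration.
CASES = [
    {"personal": "Eric", "family": "Idle", "answer": "Hey, Eric Idle."},
    {"personal": "Graham", "family": "Chapman", "polite": True,
     "answer": "How do you do, Graahm Chapman."},
    {"personal": "Michael", "family": "Palin", "title": "CBE",
     "answer": "Hey, CBE Mike Palin."},
]


def collect_mismatches(cases):
    """Check every case and return (index, expected, actual) for each mismatch."""
    mismatches = []
    for index, case in enumerate(cases):
        inputs = dict(case)              # copy, so the shared data is not mutated
        expected = inputs.pop("answer")
        actual = greet(**inputs)
        if actual != expected:
            mismatches.append((index, expected, actual))
    return mismatches


for index, expected, actual in collect_mismatches(CASES):
    print(f"case {index}: expected {expected!r}, got {actual!r}")
```

Both faulty cases (index 1 and index 2) are reported in a single run, which mirrors the two failures in the pytest output above.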

We can also make pytest check whether the examples in the docstrings are correct by adding the --doctest-modules flag. We run pytest --doctest-modules and obtain the following output:

In [29]:
%%bash --no-raise-error

cd greetings_repo
pytest --doctest-modules
============================= test session starts ==============================
platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo
plugins: cov-3.0.0
collected 4 items

greetings/greeter.py F                                                   [ 25%]
greetings/test/test_greeter.py .FF                                       [100%]

=================================== FAILURES ===================================
______________________ [doctest] greetings.greeter.greet _______________________
012     polite: bool
013         True for a formal greeting, False for informal.
014     Returns
015     -------
016     string
017         An appropriate greeting
018     Examples
019     --------
020     >>> from greetings.greeter import greet
021     >>> greet("Terry", "Jones")
Expected:
    'Hey, Terry Jones.
Got:
    'Hey, Terry Jones.'

/home/runner/work/rsd-engineeringcourse/rsd-engineeringcourse/ch04packaging/greetings_repo/greetings/greeter.py:21: DocTestFailure
____________________________ test_greeter[fixture1] ____________________________

fixture = {'family': 'Chapman', 'personal': 'Graham', 'polite': True}

    @pytest.mark.parametrize("fixture", read_fixture())
    def test_greeter(fixture):
        answer = fixture.pop('answer')
>       assert greet(**fixture) == answer
E       AssertionError: assert 'How do you d...aham Chapman.' == 'How do you d...aahm Chapman.'
E         - How do you do, Graahm Chapman.
E         ?                    -
E         + How do you do, Graham Chapman.
E         ?                   +

greetings/test/test_greeter.py:19: AssertionError
____________________________ test_greeter[fixture2] ____________________________

fixture = {'family': 'Palin', 'personal': 'Michael', 'title': 'CBE'}

    @pytest.mark.parametrize("fixture", read_fixture())
    def test_greeter(fixture):
        answer = fixture.pop('answer')
>       assert greet(**fixture) == answer
E       AssertionError: assert 'Hey, CBE Michael Palin.' == 'Hey, CBE Mike Palin.'
E         - Hey, CBE Mike Palin.
E         ?            ^
E         + Hey, CBE Michael Palin.
E         ?            ^^^ +

greetings/test/test_greeter.py:19: AssertionError
=========================== short test summary info ============================
FAILED greetings/greeter.py::greetings.greeter.greet
FAILED greetings/test/test_greeter.py::test_greeter[fixture1] - AssertionErro...
FAILED greetings/test/test_greeter.py::test_greeter[fixture2] - AssertionErro...
========================= 3 failed, 1 passed in 0.23s ==========================
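
The same kind of check can be run without pytest, using the standard library's doctest module. The sketch below uses a cut-down greet whose docstring example has the closing quote in place, so the example passes; the variable names are illustrative.

```python
import doctest


def greet(personal, family):
    """Generate an informal greeting.

    >>> greet("Terry", "Jones")
    'Hey, Terry Jones.'
    """
    return f"Hey, {personal} {family}."


# Extract the >>> examples from the docstring and run them against greet.
parser = doctest.DocTestParser()
test = parser.get_doctest(greet.__doc__, {"greet": greet}, "greet", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print("failed:", results.failed, "attempted:", results.attempted)
```

A docstring whose expected output lacks the closing quote, as in the pytest run above, would be counted in results.failed instead.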

Finally, if we don't want to include the tests when we distribute our software to our users, we can leave them out using the exclude option of find_packages in setup.py.

In [30]:
%%writefile greetings_repo/setup.py

from setuptools import setup, find_packages

setup(
    name="Greetings",
    version="0.1.0",
    packages=find_packages(exclude=['*.test']),
    install_requires=['art', 'pyyaml'],
    entry_points={
        'console_scripts': [
            'greet = greetings.command:process'
        ]}    
    )
Overwriting greetings_repo/setup.py
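
The exclude patterns are shell-style wildcards applied to dotted package names. Assuming the matching behaves like fnmatchcase from the standard library (an assumption worth checking against the setuptools version you use), the sketch below shows which names a pattern such as '*.test' would catch. is_excluded is a hypothetical helper, not a setuptools function.

```python
from fnmatch import fnmatchcase


def is_excluded(package_name, patterns):
    """Return True if the dotted package name matches any wildcard pattern."""
    return any(fnmatchcase(package_name, pattern) for pattern in patterns)


patterns = ['*.test']
for name in ['greetings', 'greetings.test', 'test']:
    print(name, is_excluded(name, patterns))
# prints:
# greetings False
# greetings.test True
# test False
```

Note that a top-level package called test would not match '*.test', because the pattern requires something followed by a dot before the name.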

Developer Install

If you modify your source files after a regular pip install, it will appear as if the program hasn't changed.

That's because pip install copies the files.

If you want to install a package, but keep working on it, you can do:

pip install --editable .

or, its shorter version:

pip install -e .
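
One way to check which kind of install is active is to ask Python where it would load a module from. The sketch below uses importlib.util.find_spec; module_location is a hypothetical helper name, and json stands in for greetings so that the example runs without the package installed. After an editable install the reported path would normally point into your working copy, whereas after a regular install it would point into site-packages.

```python
import importlib.util


def module_location(module_name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# Replace "json" with "greetings" once the package is installed.
print(module_location("json"))
```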

Distributing compiled code

If you're working in C++ or Fortran, there is no language-specific repository. You'll need to write platform installers for as many platforms as you want to support.

Typically:

  • dpkg for apt-get on Ubuntu and Debian
  • rpm for yum/dnf on Redhat and Fedora
  • homebrew on OSX (Possibly macports as well)
  • An executable msi installer for Windows.

Homebrew

Homebrew: formulae are written in a Ruby DSL, which you can host yourself.

See an installer for the cppcourse example

If you're on OSX, do:

brew tap jamespjh/homebrew-reactor
brew install reactor