Distributing Python Libraries, Part 2
In my last post I talked about how to set up a new Python library by creating a new source repository and the setup.py script that pip uses to install it. In this post I’ll talk about setting up testing for the library, and automating code quality checks using services like Travis CI and Landscape.
You can see the end results at https://github.com/coecms/ARCCSSive
Testing
There are quite a few libraries around for testing Python code. I like to use py.test, as it doesn’t require much boilerplate code: you just write functions with a name starting with test_.
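A minimal sketch of what such a test file can look like. The add() function is a hypothetical stand-in for library code, not something from ARCCSSive:

```python
# Hypothetical function under test; it stands in for real library code.
def add(a, b):
    return a + b


# py.test collects this function because its name starts with test_
def test_add():
    assert add(2, 3) == 5
```

Saved in a file such as tests/test_add.py (the file name is an assumption here), this would be picked up and run by the py.test command.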
Testing is important, as it’s what lets you know your code is doing what it’s supposed to be doing.
A useful way to go about creating tests for your code is to do what’s called ‘test-driven’ development. In this method you write tests before the code, which encourages you to think about how the code’s actually going to be used.
For ARCCSSive, I wanted users to be able to easily select experiments from the CMIP5 database. My idea was to have a query() function that would take various attributes to search for, and would return something that users could loop over to get the output files.
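The idea can be sketched as a test like the one below. This is only an illustration: the Output class, the files() method, the attribute names and the paths are all assumptions made up for the sketch, not the real ARCCSSive interface. In a test-driven workflow the test would be written first; a small stub for query() is included here only so that the example runs on its own.

```python
# Hypothetical stand-ins, invented for this sketch only.
class Output(object):
    def __init__(self, model, experiment, paths):
        self.model = model
        self.experiment = experiment
        self.paths = paths

    def files(self):
        # Return the output files belonging to this result
        return list(self.paths)


# Made-up catalogue entries with made-up file paths
CATALOGUE = [
    Output("ACCESS1-3", "rcp45", ["/example/a_1.nc", "/example/a_2.nc"]),
    Output("ACCESS1-3", "historical", ["/example/b_1.nc"]),
]


def query(**attributes):
    # Keep the entries whose attributes all match the search terms
    return [
        output for output in CATALOGUE
        if all(getattr(output, key) == value
               for key, value in attributes.items())
    ]


def test_query_returns_outputs_with_files():
    results = query(model="ACCESS1-3", experiment="rcp45")
    # The search should find something...
    assert results
    # ...and each result should give back some files to loop over
    for output in results:
        assert output.files()
```

Note that the test only checks that something comes back, not what the exact values are.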
By default, py.test will search the tests directory for functions beginning with test_, then run them. The assert statement will report an error if its argument is false in a boolean sense, such as False, None or an empty list. I’m using it as a way to check that functions are returning something.
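As a quick illustration of which values make a bare assert fail, here is a small sketch. The passes() helper is a name invented for this example:

```python
def passes(value):
    """Return True if `assert value` succeeds, False if it raises."""
    try:
        assert value
    except AssertionError:
        return False
    return True


# Empty containers and zero fail just like False and None do
for value in (False, None, 0, "", [], [1], "x"):
    print(repr(value), passes(value))
```

This is why a bare assert on a return value works as a "did I get anything back?" check, but it will also fail on legitimate empty results. Assertions are skipped when Python runs with the -O flag, so this sketch assumes a normal interpreter run.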
At this stage I’m not saying what the return values of the functions should be, just the properties that I’d like them to have. If I run py.test now it will collect all of the tests and run them for me, reporting any failures.
The first error it reports is easy to fix: I don’t have a CMIP5 module yet. From here it’s a process of writing code to make the tests pass. Once the tests are passing we can commit both the code and tests to the repository, then add a new test for the next bit of functionality we need. As we add functionality to the library, the tests tell us that existing functionality isn’t being broken. You can also add a new test when a bug is reported, to ensure that it’s fixed in the next release and stays fixed.
Automation
Once you’ve got some tests set up it’s a good idea to automate them, so that they are run whenever you commit code. Travis CI is a service that links up with GitHub to run your tests for you. It will also add information to branches and pull requests on GitHub saying whether tests are passing, and send you an email if tests start failing.
Travis is controlled by a file, .travis.yml, in the top level of your repository. This file tells Travis how to install and run your tests as well as what language versions to use. Travis will build your code on a virtual machine that’s deleted when the run has finished, so you’re free to install your own packages using pip (or apt-get, though you’ll need to request sudo as well).
For ARCCSSive, this file says to run two sets of tests, using Python versions 2.7 and 3.4. This is a good way to make sure your library is compatible with Python 3 without needing to install extra libraries on your own computer. Nowadays supporting Python 2 and 3 at the same time is pretty simple, but there are some corner cases that can catch you if you don’t test regularly. With each Python version Travis will run the install commands to install the package and some testing libraries, then it will run the script commands and report any errors.
With the .travis.yml file in your repository you can log into Travis with your GitHub account, then activate any repositories you want to test. Travis will run whenever you add a commit or someone makes a pull request.
Coverage
You’ll notice that the Travis script isn’t running py.test directly; instead it’s using a program called coverage. This is a helper tool that measures which lines in your code are actually being run, so that you can see where you’re missing tests. The information is uploaded to a service called Codecov to produce a nice view of the test coverage. Codecov can also make suggestions on which functions you could add tests to, and gives you a badge to add to your README.md to show the percentage of code you’re testing.
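To give a feel for what "measuring which lines are run" means, here is a toy sketch built on the standard library’s sys.settrace hook. It is not how you would use the coverage program in practice, and executed_lines() and describe() are names invented for the example:

```python
import sys


def executed_lines(func, *args):
    """Run func(*args) and return the line offsets inside func that ran.

    Offsets are counted from the line holding the `def` statement.
    """
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Only follow the frame that belongs to func itself
        if frame.f_code is not code:
            return None
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Always restore whatever trace hook was active before
        sys.settrace(previous)
    return seen


def describe(x):
    if x < 0:
        return "negative"      # offset 2
    return "non-negative"      # offset 3


# A test suite that only ever calls describe(5) never reaches offset 2
only_positive = executed_lines(describe, 5)
print(2 in only_positive)
print(3 in only_positive)
```

A coverage report is essentially this bookkeeping done for every file in the library: lines that never show up are lines no test is exercising.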
Code Quality
Another helpful web service to use with Python libraries is Landscape, which is an automated code review tool. Landscape checks your code’s quality, for instance making sure it adheres to the PEP 8 standard and that functions aren’t too large or complex. It’s not necessary to follow all of its suggestions, but they can help make sure your code is understandable to others. As with Travis, you can activate your repository on Landscape using your GitHub account, and it also gives you a badge for your README.md.
Next Steps
Now that our code is working, tested and reasonably readable, the next step is to put together some documentation. In the next post I’ll talk about using Sphinx to semi-automatically document your code, and how to upload the documentation to ReadTheDocs.