Python Environment Modules
Raijin, NCI’s supercomputer, uses Environment Modules to provide access to programs and libraries, as many HPC environments do. Modules let different users on the supercomputer access different versions of compilers and libraries without those versions conflicting and without anyone editing PATH environment variables by hand. We’ve been using Environment Modules in the ACCESS area for a while now to provide climate-focused tools and libraries, and have built up a decent stash of Python libraries.
NCI themselves don’t support Python libraries in their central directory, primarily because managing dependencies between them can be a pain. However, as Python is used more and more in climate science, it’s important for us to support these libraries as part of ACCESS. I wanted a system where installing a new library from PyPI, the central Python package index, was as easy as running a single command.
To do this, the install script needed to download & install the library to a custom path as well as set up the Modulefile needed to load the library into a user’s environment.
Libraries are installed into a path that looks like apps/pythonlib/$LIBRARY/$VERSION, so that multiple versions of a library can be supported at the same time. By default the latest version is installed, but I don’t know which version that is without doing the installation. Happily, PyPI provides an API that lists the available versions of a library, so you can get the most recent with a script like:
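The original script is not reproduced here, so the following is only a minimal sketch of the idea. It assumes PyPI's JSON endpoint at https://pypi.org/pypi/NAME/json, whose info.version field reports the latest release; the function names and the default root directory are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: PyPI's JSON endpoint, which may differ from the API the
# original script queried.
PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def fetch_metadata(name):
    """Download the JSON metadata that PyPI publishes for a project."""
    with urlopen(PYPI_JSON_URL.format(name=name)) as response:
        return json.load(response)


def latest_version(metadata):
    """Return the latest release version from a PyPI metadata dict."""
    return metadata["info"]["version"]


def install_path(library, version, root="apps/pythonlib"):
    """Build the per-version install path, $ROOT/$LIBRARY/$VERSION."""
    return f"{root}/{library}/{version}"
```

Calling latest_version(fetch_metadata("numpy")) would then give the version string to pass to install_path. Keeping the network call separate from the parsing makes the parsing easy to check offline.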
Knowing the version number, it’s then a simple matter of using easy_install
to perform the installation:
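As a rough sketch of this step, the install script could build and run the easy_install command from Python. The function names are hypothetical; --install-dir is easy_install's option for choosing a target directory. easy_install has since been deprecated in setuptools, so on a current system a different installer would be needed.

```python
import os
import subprocess


def easy_install_command(library, version, install_dir):
    """Build the easy_install invocation for one pinned library version."""
    return ["easy_install", "--install-dir", install_dir,
            f"{library}=={version}"]


def install(library, version, install_dir):
    """Install the library into install_dir (hypothetical helper)."""
    os.makedirs(install_dir, exist_ok=True)
    env = dict(os.environ)
    # easy_install expects its target directory to be on PYTHONPATH,
    # so put it there for the duration of the install.
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = (
        install_dir if not existing else install_dir + os.pathsep + existing
    )
    subprocess.run(
        easy_install_command(library, version, install_dir),
        env=env,
        check=True,
    )
```

Pinning the requirement as LIBRARY==VERSION keeps the installed files in step with the version used in the directory name.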
The Modulefile used by Environment Modules is written in a domain-specific language based on Tcl. Its commands let you modify environment variables, primarily the PATH variables used when searching for commands.
A basic Modulefile might look like:
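Since the install script has to write this file anyway, the sketch below renders a minimal Modulefile from Python. The function name is hypothetical, and the file contents are an assumption about what a basic Modulefile for one library would hold: the #%Module header line followed by two prepend-path commands.

```python
def basic_modulefile(install_dir):
    """Render a minimal Modulefile for a library installed in install_dir."""
    lines = [
        "#%Module1.0",
        # Make any scripts installed with the library runnable.
        f"prepend-path PATH {install_dir}",
        # Make the library importable from Python.
        f"prepend-path PYTHONPATH {install_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(basic_modulefile("apps/pythonlib/demo/1.2.3"), end="")
```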
This adds the library path to the environment variables PATH and PYTHONPATH. Since Environment Modules are independent of the shell you’re using, the same file works for both Bash and Csh users.
For the ACCESS modules I’ve also created a generic include file, which sets up paths for the standard bin, include, lib directory layout and creates the help text shown by the module help command. This looks like:
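The actual include file is not shown here, so what follows is only a sketch of what such a file might contain, again rendered from Python. The choice of CPATH, LIBRARY_PATH and LD_LIBRARY_PATH is an assumption, as is the function name. The description is assumed to be plain text without quotes or Tcl special characters.

```python
def generic_module_body(prefix, description):
    """Render Modulefile commands for a standard bin/include/lib layout."""
    lines = [
        f"set prefix {prefix}",
        "",
        "# Executables",
        "prepend-path PATH $prefix/bin",
        "# Headers (CPATH is an assumed choice of variable)",
        "prepend-path CPATH $prefix/include",
        "# Libraries: link time, run time, and paths saved by the linker",
        "prepend-path LIBRARY_PATH $prefix/lib",
        "prepend-path LD_LIBRARY_PATH $prefix/lib",
        "prepend-path LD_RUN_PATH $prefix/lib",
        "",
        "# Text printed by the module help command",
        "proc ModulesHelp { } {",
        f'    puts stderr "{description}"',
        "}",
    ]
    return "\n".join(lines) + "\n"
```

A per-library Modulefile could then consist of the #%Module header followed by this body, so every ACCESS module sets the same variables in the same way.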
Of note are the multiple variables used for the library path. This makes sure both static and dynamic libraries are found, while setting LD_RUN_PATH means the linker will save the full path to shared libraries in the programs it builds. This means that the module doesn’t need to be loaded to run those programs.
Having this setup means that we can quickly respond to requests for new installs: provided the library is listed on PyPI, our response is pretty much instant. The full list of Python modules we have available can be found on the ACCESS wiki.