Python Software Installation: Containerized Python Installations
Users can install Python packages and other software in containers. This provides complete control over the environment, guarantees that the environment will not be accidentally changed or updated (containers are immutable, i.e., cannot be modified), and allows the container to be archived or shared with others for reproducibility and consistency.
On this page, we describe the process of setting up Micromamba in a container. The process is similar for other package managers, e.g., the installation procedure described in the self-installed mamba/conda page.
Please note: While the environment management software described in this page has permissive licenses and can be used for free, channels (sources of packages) may have other license terms and may require commercial licenses. It is your responsibility to use an appropriate channel.
Micromamba in a Container: Installation and Use
Installing the whole mamba or conda environment in a container has several benefits:
- The environment is packaged in a single file, so it's easy to share and archive the whole environment.
- The whole environment is static (fixed during the build of the container), so it won't get changed accidentally when trying to install or update a package, as updates require building a new container.
- While it is equally possible to create a container based on Miniconda or Miniforge, the Micromamba installation is smaller and uses the better-performing mamba package manager. It also provides the micromamba command as a convenient wrapper for all the commands in the environment, which we utilize in the run command for the container.
Below, we outline steps in building a Micromamba container with some bioinformatics tools and a Jupyter Notebook as an example and use this environment on CHPC's Open OnDemand Jupyter app.
Creating a Micromamba Container
We use Apptainer to create the container. First, we create a recipe file for building the container, an example of which is linked here, and name it Singularity:
Bootstrap: docker
From: mambaorg/micromamba

%post
    micromamba install --yes --name base -c bioconda -c conda-forge \
        python=3.9.1 notebook samtools bwa
    micromamba clean -aqy

%runscript
    micromamba run -p /opt/conda "$@"
In this recipe, we pull the Micromamba container from DockerHub, install the needed tools in the %post section, and set the micromamba run command to execute whenever the container is run.
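Because of the %runscript, executing the container directly is shorthand for an explicit apptainer exec. For example, the following two invocations should be equivalent (a sketch, assuming the container was built as mymamba.sif as shown later on this page):

```bash
module load apptainer
# Direct execution goes through the %runscript wrapper:
./mymamba.sif python --version
# which is equivalent to the explicit form:
apptainer exec mymamba.sif micromamba run -p /opt/conda python --version
```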
Python environment packages can also be built into a container by specifying them within an environment.yml file, e.g.:
channels:
  - defaults
  - conda-forge
dependencies:
  - matplotlib
  - python=3.9
  - pip
For that, we can modify the micromamba install command in the %post section as:
micromamba create --yes --name base --file environment.yml
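For this approach, the environment.yml file must be available inside the container at build time. A hedged sketch of the modified recipe follows; the %files section and the /environment.yml destination path are our assumptions, not part of the original instructions:

```
Bootstrap: docker
From: mambaorg/micromamba

%files
    environment.yml /environment.yml

%post
    micromamba create --yes --name base --file /environment.yml
    micromamba clean -aqy

%runscript
    micromamba run -p /opt/conda "$@"
```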
Note that we are installing these packages in the base environment. We recommend building a separate container for each environment you want to set up, as installing and using virtual environments inside a container would complicate the container's setup and use.
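Before building, it can help to sanity-check the package pins in environment.yml. The helper below is our own minimal sketch, not part of the page's workflow, and it only handles the flat list form shown above; real projects should prefer a YAML parser such as PyYAML:

```python
def parse_dependencies(yml_text):
    """Return {package: version_or_None} from a simple environment.yml.

    Only handles the flat '- name' / '- name=version' entries under
    the dependencies: key, as in the example above.
    """
    deps = {}
    in_deps = False
    for line in yml_text.splitlines():
        stripped = line.strip()
        if stripped == "dependencies:":
            in_deps = True
            continue
        if in_deps and stripped.startswith("- "):
            name, _, version = stripped[2:].strip().partition("=")
            deps[name] = version or None
        elif in_deps and stripped:
            in_deps = False  # left the dependencies block
    return deps

yml = """\
channels:
  - defaults
  - conda-forge
dependencies:
  - matplotlib
  - python=3.9
  - pip
"""
print(parse_dependencies(yml))
# {'matplotlib': None, 'python': '3.9', 'pip': None}
```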
We build the container by running the following code (for example, in a bash shell):
module load apptainer
unset APPTAINER_BINDPATH
apptainer build mymamba.sif Singularity
Unsetting APPTAINER_BINDPATH is necessary to avoid a build error that complains about missing mount points in the container. This environment variable ensures that the /scratch and /uufs file systems get mounted automatically when the container is executed.
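Note that unsetting APPTAINER_BINDPATH only affects the shell where you build; if it is also unset in the shell where you later run the container, the same file systems can be bound manually with Apptainer's --bind flag. A sketch, assuming you need the /scratch and /uufs trees:

```bash
module load apptainer
# Bind the file systems explicitly at run time:
apptainer run --bind /scratch,/uufs mymamba.sif python --version
```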
The container .sif file has executable permissions, so we can run the container directly along with the command we want to run within the container:
$ ./mymamba.sif bwa
Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.17-r1188
...
Micromamba GPU Container
For a container that interacts with GPUs, one has to use the --nv flag during the container build, which imports the GPU stack from the host into the container. This ensures that the mamba package manager picks up the GPU/CUDA dependencies and installs the GPU versions of programs like PyTorch. For an example of a container definition file with the PyTorch environment installed, see the Singularity.gpu recipe that we discuss below.
We build the container by running the following code (for example, in a bash shell):
module load apptainer
unset APPTAINER_BINDPATH
apptainer build --nv mymamba_gpu.sif Singularity.gpu
To test that the correct GPU version of PyTorch is installed, we export the environment variable APPTAINER_NV as an alternative to the --nv flag. This allows us to run the container file directly with the correct GPU environment:
module load apptainer
export APPTAINER_NV=true
./mymamba_gpu.sif /opt/conda/bin/python -c "import torch; print(torch.cuda.is_available())"
True
The True returned by the torch.cuda.is_available() function indicates that the GPU has been detected.
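If more detail than a True/False check is desired, the same approach can query the device itself; a sketch using the standard torch.cuda.get_device_name() call (the printed name will vary with your GPU):

```bash
export APPTAINER_NV=true
./mymamba_gpu.sif /opt/conda/bin/python -c "import torch; print(torch.cuda.get_device_name(0))"
```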
Using the Micromamba Container in Open OnDemand
Just as we ran the bwa command above, we can run the jupyter notebook command to start Jupyter. However, running it from the terminal is not recommended, as it requires creating an SSH tunnel to the machine where the terminal runs in order to access Jupyter in our client's web browser. The Open OnDemand Jupyter app simplifies this greatly by launching Jupyter directly in the client's browser.
To run our container, we choose the "Custom (Environment setup below)" option for the "Jupyter Python version", and in the "Environment Setup for Custom Python" text box, put:
shopt -s expand_aliases
module load apptainer
alias jupyter="$HOME/containers/mymamba.sif jupyter"
The first command is a bash option that enables aliases in the shell script. We then load the Apptainer module and create an alias for the jupyter command so that it is called from the container instead. This jupyter alias is then used by the Open OnDemand Jupyter app to launch the Jupyter server inside the Open OnDemand job. Notice that we use the full path to the container, as the Open OnDemand app starts at the base of the user's $HOME directory.
As noted above, if we need to use GPUs, we need to add the environment variable APPTAINER_NV=true
to initialize the GPUs in the container:
shopt -s expand_aliases
module load apptainer
export APPTAINER_NV=true
alias jupyter="$HOME/containers/mymamba_gpu.sif jupyter"