Python Virtual Environment
- Why do we need a Virtual Environment
- Virtual Environment in Python 3.x
- Virtual Environment in Python 2.7.x
NOTE: A Virtual Environment built on a CHPC-installed Python still works, but for installing additional Python modules we recommend a user-installed Python via Miniconda or Anaconda.
Why do we need a Virtual Environment?
CHPC actively maintains several versions of Python, namely:
- python 2.7.11
- python 3.5.2
Each of these distributions includes a wide array of packages, such as NumPy, SciPy, Matplotlib and IPython. However, some Python packages do not fit in the CHPC Python installation tree, and CHPC users sometimes want the flexibility to install and administer Python packages without having to install their own Python distribution.
The Python Virtual Environment meets exactly these needs. Users can create their own branch (a Virtual Environment) from an existing main Python distribution. In this Virtual Environment, users can
- install their own packages
- use all the packages that have been installed in the main distribution
In the following paragraphs, we will explain:
- how to create a Virtual Environment based on Python 3.x (installed on the CHPC clusters).
- how to activate/deactivate a Virtual Environment
- the different ways to install packages in the Virtual Environment
The creation of a Virtual Environment under Python 2.7.x differs slightly from that of a Python 3.x Virtual Environment. The differences are described in the last section.
Virtual Environment on Python 3.x
Creation of the Virtual Environment
The commands below generate a Python Virtual Environment based on the Python 3.5.2 distribution. In this example, the new environment (branching from Python 3.5.2) is installed in ~/VENV3.5.2:
module load python/3.5.2
which python3 # Should print /uufs/chpc.utah.edu/sys/installdir/python/3.5.2/bin/python3
pyvenv --system-site-packages ~/VENV3.5.2
The flag '--system-site-packages' will allow you to use the packages (e.g. NumPy, SciPy, etc.) that are installed within the main Python distribution, which is recommended as these packages have been optimized for CHPC clusters. The newly created directory ~/VENV3.5.2 contains the subdirectories bin, include & lib.
The bin directory contains the python3 and python binaries. Both are symbolic links to the same python3 binary in the main branch. Note that the main branch has NO python binary.
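The same kind of environment can also be created from within Python, using the standard-library venv module that the pyvenv script wraps. The following is a minimal sketch rather than part of the CHPC setup; the function name create_environment is an assumption made for this example, and pip bootstrapping is switched off only to keep the sketch small.

```python
import venv
from pathlib import Path


def create_environment(target_dir):
    """Create a virtual environment that can also see the parent's site-packages."""
    builder = venv.EnvBuilder(
        system_site_packages=True,  # same effect as the --system-site-packages flag
        with_pip=False,             # assumption: skip bootstrapping pip in this sketch
    )
    builder.create(str(target_dir))
    return Path(target_dir)


# Example call (not executed here):
# create_environment(Path("~/VENV3.5.2").expanduser())
```

The created directory contains a pyvenv.cfg file that records whether the system site-packages are visible, which is a quick way to check how an existing environment was set up.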
Activation/Deactivation of the Virtual Environment
In order to work with the Virtual Environment, we need to activate it. Source the activation script that matches your shell (bash or tcsh):
module unload python/3.5.2 #Unload the python executable from the original distribution
source ~/VENV3.5.2/bin/activate #for bash shell
source ~/VENV3.5.2/bin/activate.csh #for tcsh shell
Once the above commands have executed successfully, the command prompt starts with the string (VENV3.5.2).
Invoking the command
which python
should result in the following output
~/VENV3.5.2/bin/python
The virtual environment can be deactivated by typing:
deactivate
To remove the virtual environment, simply remove its directory:
rm -r ~/VENV3.5.2
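Whether an environment is active can also be checked from inside Python, as a complement to the which python test. The sketch below is illustrative only; the function name in_virtual_environment is an assumption made for this example.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # In an environment created by pyvenv/venv, sys.prefix points at the environment
    # while sys.base_prefix still points at the main distribution.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead of sys.base_prefix.
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("interpreter:", sys.executable)
print("prefix     :", sys.prefix)
print("in a venv  :", in_virtual_environment())
```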
Using packages from the main distribution within the Virtual Environment
Within the Virtual Environment, we have access to all the packages that were installed in the main (original) Python 3.5.2 distribution. In the following example we use NumPy and Matplotlib as installed in the main distribution.
python
>>> import sys
>>> sys.path
>>> import numpy as np
>>> np.__path__
['/uufs/chpc.utah.edu/sys/installdir/python/3.5.2/lib/python3.5/site-packages/numpy']
>>> import matplotlib.pyplot as plt
>>> np.__version__
>>> a = np.linspace(0,3,61)
>>> plt.plot(a,a**2)
>>> plt.show()
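To see at a glance whether a package comes from the main distribution or from the Virtual Environment, its location can be looked up without importing it. The following is a rough sketch; the helper names locate and lives_in_environment are assumptions made for this example, and the package names in the loop are just the ones mentioned in this document.

```python
import importlib.util
import os
import sys


def locate(name):
    """Return the file a top-level module or package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def lives_in_environment(path):
    """True when path lies under the active environment rather than the main distribution."""
    if path is None or sys.prefix == sys.base_prefix:
        return False
    return os.path.abspath(path).startswith(os.path.abspath(sys.prefix) + os.sep)


for name in ("numpy", "matplotlib", "bibtexparser"):
    path = locate(name)
    where = "environment" if lives_in_environment(path) else "main distribution or not installed"
    print(name, "->", path, "(" + where + ")")
```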
How to install packages WITHIN the Virtual Environment
There are several ways to install packages in Python. We will discuss several approaches. This section of the documentation is paralleled by a short training video.
a. Installation using PIP
The easiest way to install packages is with the Python pip module. This approach works well for simple packages (i.e. packages without dependencies). For packages that do have dependencies, pip may install its own copies of those dependencies, which can shadow or break existing packages such as NumPy. Please be aware of this and use pip with caution. If in doubt, use setuptools as described below.
The installation of the bibtexparser package is a good example. pip automatically downloads the latest version of bibtexparser (currently 0.6.1) from https://pypi.python.org/pypi/bibtexparser/0.6.1 and, in the same operation, installs the package within the Virtual Environment.
python -m pip install bibtexparser
The pip module has some very useful options, e.g.:
python -m pip list # List ALL the packages which are installed in the main distribution or the Virtual Environment
python -m pip show numpy # Show information about the NumPy module
python -m pip search math # Search for packages containing the 'math' string
python -m pip help # Show all the options
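When pip is driven from a script, invoking it as a module of the running interpreter ensures that the pip of the active Virtual Environment is used. The sketch below shows one way to do this; the helper names pip_command and run_pip are assumptions made for this example, and nothing is installed or queried until run_pip is actually called.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip command line bound to the interpreter that is running this script."""
    return [sys.executable, "-m", "pip"] + list(args)


def run_pip(*args):
    """Run pip with the given arguments and return its standard output as text."""
    result = subprocess.run(pip_command(*args), stdout=subprocess.PIPE, check=True)
    return result.stdout.decode()


# Example call (not executed here):
# print(run_pip("show", "numpy"))
```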
b. Installation using setuptools
Instead of using the pip module, we can retrieve the source code of the package (e.g. ChemPy) ourselves. After unzipping the source code, we can build & install the package using the setuptools package.
cd ~
wget https://pypi.python.org/packages/source/c/chempy/chempy-0.1.0.tar.gz
tar -zxvf chempy-0.1.0.tar.gz
cd chempy-0.1.0
python setup.py build
python setup.py install
cd .. ; rm -R chempy-0.1.0
Note that the command python setup.py install can be combined with the --prefix flag:
python setup.py install --prefix=$DIR
In this case, the module will be installed in the directory:
$DIR/lib/python3.5/site-packages
When the --prefix flag is not specified, DIR will be set to ~/VENV3.5.2.
Now assume that a new Python package has been installed in a directory DIR other than ~/VENV3.5.2, e.g. DIR=~/Trial. We then need to tell the Python executable where to find the new module. This can be done in two ways:
- Adjust sys.path at runtime
python
>>> import sys
>>> sys.path
>>> sys.path.append('/uufs/chpc.utah.edu/common/home/$UNID/Trial/lib/python3.5/site-packages')
- Adjust/set the PYTHONPATH before invoking the python executable
export PYTHONPATH=/uufs/chpc.utah.edu/common/home/$UNID/Trial/lib/python3.5/site-packages:$PYTHONPATH
where $UNID stands for the unid of the user.
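The version-specific part of the path (python3.5 in the examples above) is easy to mistype. It can instead be derived from the running interpreter, as in the following sketch. The helper names site_packages_under and add_prefix are assumptions made for this example, and the lib/pythonX.Y/site-packages layout is assumed to match the one shown for the --prefix flag.

```python
import os
import sys


def site_packages_under(prefix):
    """Return <prefix>/lib/pythonX.Y/site-packages for the running interpreter."""
    version = "python{}.{}".format(sys.version_info.major, sys.version_info.minor)
    return os.path.join(os.path.expanduser(prefix), "lib", version, "site-packages")


def add_prefix(prefix):
    """Append the prefix's site-packages directory to sys.path, at most once."""
    path = site_packages_under(prefix)
    if path not in sys.path:
        sys.path.append(path)
    return path


# ~/Trial is the example prefix used in the text
print(add_prefix("~/Trial"))
```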
c. More advanced installations using setuptools
Sometimes we need to install packages that depend on C, C++ or Fortran libraries installed in non-standard locations. For C code, for example, we may need to set the environment variables CFLAGS and LDFLAGS to specify the locations of the header files and the libraries. The installation of the netCDF4 package is a good example:
cd ~
wget https://pypi.python.org/packages/source/n/netCDF4/netCDF4-1.1.9.tar.gz
tar -zxvf netCDF4-1.1.9.tar.gz
cd netCDF4-1.1.9
export HDF5_DIR=/uufs/chpc.utah.edu/sys/installdir/hdf5/1.8.14/
export NETCDF4_DIR=/uufs/chpc.utah.edu/sys/installdir/netcdf-c/4.3.2
export CFLAGS=" -I/uufs/chpc.utah.edu/sys/installdir/netcdf-c/4.3.2/include \
-I/uufs/chpc.utah.edu/sys/installdir/hdf5/1.8.14/include "
export LDFLAGS=" -Wl,-rpath=/uufs/chpc.utah.edu/sys/installdir/netcdf-c/4.3.2/lib \
-L/uufs/chpc.utah.edu/sys/installdir/netcdf-c/4.3.2/lib -lnetcdf \
-Wl,-rpath=/uufs/chpc.utah.edu/sys/installdir/hdf5/1.8.14/lib \
-L/uufs/chpc.utah.edu/sys/installdir/hdf5/1.8.14/lib -lhdf5 "
python setup.py build
python setup.py install
cd .. ; rm -R netCDF4-1.1.9
From the commands above, it is clear that the Python netCDF4 package depends on the netCDF C library, which in turn depends on the HDF5 library. The '-Wl,-rpath=' flag followed by the library directory allows the newly built Python extension module to locate its library dependencies at runtime without the need to set the environment variable LD_LIBRARY_PATH.
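The CFLAGS and LDFLAGS values follow the same pattern for every library: an include directory, a library directory, an rpath entry and a library name. The sketch below generates them from a list of install prefixes. It is an illustration only; the function name build_flags is an assumption made for this example, and it assumes each library keeps its headers in include and its shared objects in lib under its prefix.

```python
def build_flags(libraries):
    """Build CFLAGS and LDFLAGS strings from (install_prefix, library_name) pairs."""
    cflags = []
    ldflags = []
    for prefix, name in libraries:
        prefix = prefix.rstrip("/")
        cflags.append("-I{}/include".format(prefix))
        # rpath lets the built module find the library at runtime without LD_LIBRARY_PATH
        ldflags.append("-Wl,-rpath={0}/lib -L{0}/lib -l{1}".format(prefix, name))
    return " ".join(cflags), " ".join(ldflags)


cflags, ldflags = build_flags([
    ("/uufs/chpc.utah.edu/sys/installdir/netcdf-c/4.3.2", "netcdf"),
    ("/uufs/chpc.utah.edu/sys/installdir/hdf5/1.8.14/", "hdf5"),
])
print('export CFLAGS="{}"'.format(cflags))
print('export LDFLAGS="{}"'.format(ldflags))
```

The printed lines can be pasted into the shell before running python setup.py build.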
Virtual Environment on Python 2.7.x
Setting up a Virtual Environment under Python 2.7 differs slightly from Python 3.x: it requires the virtualenv package, which must be installed in the main distribution.
Setting up a Virtual Environment
module load python/2.7.11
virtualenv --system-site-packages ~/VENV2.7.11
module unload python/2.7.11
Activation of the Virtual Environment
source ~/VENV2.7.11/bin/activate #for the bash shell
source ~/VENV2.7.11/bin/activate.csh #for the tcsh shell
The command line now starts with the string (VENV2.7.11).
The command
which python
should now result in:
~/VENV2.7.11/bin/python
Installation using PIP
The bin subdirectory contains the pip executable. The installation
python -m pip install bibtexparser
can also be performed as follows:
pip install bibtexparser