Python
======

Python is probably the most versatile programming ecosystem available on the
Tardis. It is an interpreted language that is easy to read, quick to prototype
in, and comes with a huge number of useful libraries.

Versions
--------

Users need to be aware that there are still two major versions around that are
not 100% compatible with each other. Although Python 3.0 was released back in
2008, its predecessor Python 2.7 is still around and not all libraries and
packages have migrated.

Since the system version on Debian/stable is still 2.7, the :file:`python`
program always points to Python 2 (2.7). The :file:`python3` program designates
the counterpart (3.5).

The system and core Python libraries are stable but quite old (even in the
Python 3 branch). You can use environment modules to switch to a newer version.

Example:

.. code-block:: bash

   # system libraries
   [krause@master ~] python --version
   Python 2.7.13
   [krause@master ~] python3 --version
   Python 3.5.3
   # load new version for python 3
   [krause@master ~] module load python/3.6
   [krause@master ~] python3 --version
   Python 3.6.3

Packages
--------

Some important packages are already installed system-wide. To see if they are
available, simply try to import them:

.. code-block:: python

   [krause@master ~] python3
   Python 3.5.1 (default, Mar 14 2016, 16:32:54)
   [GCC 4.7.2] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import lxml
   >>> lxml.__path__
   ['/opt/software/python/3.5/lib/python3.5/site-packages/lxml']

If you need a more recent version of a package, or one that hasn't been
installed yet, the easiest way is to use the default Python package manager
:file:`pip`. It comes separately for Python 2 and 3:

.. code-block:: bash

   [krause@master ~] pip --version
   pip 9.0.1 from /usr/lib/python2.7/dist-packages (python 2.7)
   [krause@master ~] pip3 --version
   pip 9.0.1 from /opt/software/python/3.6.3/lib/python3.6/site-packages (python 3.6)

To install a package in your home directory, you can simply run
:program:`pip3 install --user` followed by the package name.

Example (install numpy):

.. code-block:: bash

   [krause@master ~] pip3 install --user numpy
   Collecting numpy
     Downloading numpy-1.11.2-cp35-cp35m-manylinux1_x86_64.whl (15.6MB)
       100% |████████████████████████████████| 15.6MB 58kB/s
   Installing collected packages: numpy
   Successfully installed numpy-1.11.2
   [krause@master ~] python3
   Python 3.5.1 (default, Mar 14 2016, 16:32:54)
   [GCC 4.7.2] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import numpy as np
   >>> np.__version__, np.__path__
   ('1.11.2', ['/home/mpib/krause/.local/lib/python3.5/site-packages/numpy'])

Remember to always install on the Tardis master node, in case a package needs
special development files that aren't installed on the computation hosts.

.. note::

   The Python Package Index (`PyPI`_), pip's source, contains a lot of
   user-provided, custom, and sometimes old and unstable software. Make sure
   that what you're installing is actually the package that you want. Usually
   the project's installation notes tell you what the package is called on
   PyPI.

Virtual Environments
--------------------

Sometimes, especially for reproducibility reasons, it may be useful to freeze
the Python package versions to a specific release. You could create a
:file:`requirements.txt` with version numbers and then always run
``pip3 install --user -r requirements.txt`` when you switch projects, but it is
much more convenient to use a separate Python environment for each project.

Starting with Python 3.4 the program `pyvenv`_ will help you manage different
environments. When you create an environment, it copies the current system
version's python and pip to a new directory named after the environment.
Every package installed or upgraded will be confined to that specific
directory, and you can switch between environments very easily.

For convenience we also installed a module called `virtualenvwrapper`_, which
provides three important commands to handle environments: ``mkvirtualenv``,
``workon``, and ``deactivate``.

**Create a new virtual environment**

.. code-block:: bash

   [krause@master ~] mkvirtualenv --python=$(which python3) project
   (project) [krause@master ~]

**Activate and use the virtual environment**

.. code-block:: bash

   # without virtual environment
   [krause@master ~] python --version
   Python 2.7.13
   # with virtual environment
   [krause@master ~] workon project
   (project) [krause@master ~] python --version
   Python 3.5.3
   (project) [krause@master ~] which python
   /home/mpib/krause/.virtualenvs/project/bin/python
   (project) [krause@master ~] pip --version
   pip 9.0.1 from /home/mpib/krause/.virtualenvs/project/lib/python3.5/site-packages (python 3.5)
   (project) [krause@master ~] pip install numpy
   Collecting numpy
     Using cached https://files.pythonhosted.org/packages/fe/94/7049fed8373c52839c8cde619acaf2c9b83082b935e5aa8c0fa27a4a8bcc/numpy-1.15.1-cp35-cp35m-manylinux1_x86_64.whl
   Installing collected packages: numpy
   Successfully installed numpy-1.15.1

**Deactivate the virtual environment**

.. code-block:: bash

   (project) [krause@master ~] deactivate
   [krause@master ~]

Note how :file:`virtualenv` also manages your shell prompt, so you always know
which Python environment you are currently running.

All virtual environments created this way reside in your home directory under
:file:`~/.virtualenvs/`. In theory you could run the virtual Python interpreter
that is installed in :file:`~/.virtualenvs//bin/python` directly. It is much
more convenient to use the wrapper functions, though.

.. important::

   To use the virtualenvwrapper convenience functions (``workon`` etc.) in a
   Torque job file, you need to add one of the following lines to your job
   definition:

   :file:`source /etc/bash_completion` **or** :file:`module load virtualenvwrapper`

Conda
-----

Another approach to virtual environments (and in fact a whole virtual operating
system) is provided by a third-party, commercial Python distribution called
`Anaconda`_ (:file:`conda`). Though discouraged for smaller projects, you can
use an environment module to load and activate a (mini)conda distribution on
the Tardis:

.. code-block:: bash

   [krause@master ~] module avail conda
   -------- /opt/environment/modules --------
   conda/4.7.10
   [krause@master ~] module load conda
   [krause@master ~] conda -V
   conda 4.7.10

Once loaded, just like with ``pyvenv`` or ``virtualenv``, you can create and
manage multiple conda environments and keep specific Python versions and their
library dependencies in them. Note, however, that conda will also download and
manage a large number of system libraries, which *may* make bugs very hard to
track down and could lead to unexpected reproducibility issues.

Some software, however, can only be installed with conda, and I strongly
recommend limiting the use of conda to those specific projects. One example is
Theano and its optional dependency pygpu. To install Theano (or other
conda-only packages) you can create a new environment:

.. code-block:: bash

   [krause@master ~] module load conda # activate conda itself
   [krause@master ~] conda create --yes --name theano
   Collecting package metadata (current_repodata.json): done
   Solving environment: done
   [...]
   [krause@master ~] conda activate theano # activate a conda env
   (theano) [krause@master ~] # now you can install packages into the env
   (theano) [krause@master ~] conda install --yes numpy scipy mkl
   [...]
   (theano) [krause@master ~] conda install --yes theano pygpu
   (theano) [krause@master ~] which python
   /home/beegfs/krause/.conda/envs/theano/bin/python
   (theano) [krause@master ~] python
   Python 3.7.4 (default, Aug 13 2019, 20:35:49)
   [GCC 7.3.0] :: Anaconda, Inc. on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import theano as t
   WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
   >>>

To deactivate (and possibly remove) an existing conda environment, run:

.. code-block:: bash

   (theano) [krause@master ~] conda deactivate
   [krause@master ~] # deactivated, safe to remove
   [krause@master ~] conda remove --yes --name theano --all
   Remove all packages in environment /home/beegfs/krause/.conda/envs/theano:
   [...]

.. _PyPI: https://pypi.python.org/pypi
.. _pyvenv: https://packaging.python.org/installing/#creating-virtual-environments
.. _virtualenvwrapper: https://virtualenvwrapper.readthedocs.io/en/latest
.. _Anaconda: https://docs.conda.io/projects/conda
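
With system Python, ``--user`` installs, virtual environments, and conda all
available side by side, it can help to check from inside Python which
interpreter a script actually runs and where a package is imported from. The
following is a minimal sketch using only the standard library. The helper names
``environment_kind`` and ``package_location`` are hypothetical, and the
detection is a heuristic that assumes a conda prefix contains a
:file:`conda-meta` directory and that a virtual environment changes
``sys.prefix``:

.. code-block:: python

   import importlib.util
   import os
   import sys


   def environment_kind():
       """Guess whether we run in a conda prefix, a virtualenv, or neither.

       Hypothetical helper, heuristic only.
       """
       # Assumption: conda prefixes (base or env) contain a conda-meta directory.
       if os.path.isdir(os.path.join(sys.prefix, "conda-meta")):
           return "conda"
       # venv and recent virtualenv set sys.base_prefix to the parent install;
       # older virtualenv releases set sys.real_prefix instead.
       base_prefix = getattr(sys, "base_prefix", sys.prefix)
       if sys.prefix != base_prefix or hasattr(sys, "real_prefix"):
           return "virtualenv"
       return "system"


   def package_location(name):
       """Return where a top-level package would be imported from, or None."""
       spec = importlib.util.find_spec(name)
       if spec is None:
           return None
       return spec.origin


   if __name__ == "__main__":
       print("interpreter:", sys.executable)
       print("version:    ", sys.version.split()[0])
       print("environment:", environment_kind())
       for pkg in ("numpy", "lxml"):
           print(pkg, "->", package_location(pkg) or "not installed")

Running this at the top of a job script records which interpreter and package
copies the job picked up, which is useful when results differ between the
master node and the computation hosts.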