While many Python packages, programs and libraries are installed on the CIP pool machines, the ones you need may be missing (for example TensorFlow or PyTorch), or the installed versions may differ from the ones you need. In this case we advise using virtual environments. However, if you only need a single package that does not have many dependencies, it may be enough to install it directly into your home directory. This can be achieved with

[ga45can@cip2ryzen2] ~ $ pip install --user numba-scipy
Collecting numba-scipy
Downloading numba_scipy-0.3.0-py3-none-any.whl (7.4 kB)
Requirement already satisfied: scipy<=1.6.2,>=0.16 in /usr/lib/python3/dist-packages (from numba-scipy) (1.3.3)
Requirement already satisfied: numba>=0.45 in /usr/lib/python3/dist-packages (from numba-scipy) (0.48.0)
Installing collected packages: numba-scipy
Successfully installed numba-scipy-0.3.0

The package will be located in $HOME/.local/lib/pythonX.X/site-packages/, where X.X is the version of the Python interpreter, and you can import it in your Python scripts as usual. If the package comes with a standalone executable, it will be located in $HOME/.local/bin.
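If you are unsure which directories apply to your interpreter, the standard library can report them. The following is a minimal sketch; the function name user_install_locations is made up for this example, and the bin directory is assumed to follow the Linux layout described above.

```python
import os
import site
import sys


def user_install_locations() -> dict:
    """Return the per-user directories that `pip install --user` targets."""
    user_base = site.getuserbase()  # typically $HOME/.local on Linux
    return {
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        "user_base": user_base,
        "user_site": site.getusersitepackages(),
        # Assumption: Linux layout, where executables go to <user_base>/bin.
        "user_bin": os.path.join(user_base, "bin"),
    }


if __name__ == "__main__":
    for key, value in user_install_locations().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter you used for pip, since the paths depend on the Python version.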

Basic Setup

We will briefly describe how to set up a virtual environment manually. There are several programs that manage virtual environments for you, like pipenv or conda, but they may not be flexible enough, depending on your use case. We also strongly advise creating your virtual environments on the scratch partition of the host you are using. If you need to install many packages, the virtual environment will become very big and you may run out of disk space in your home directory.
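To get a feeling for how much room is available before you start, you can compare the free space of both locations. This is only a sketch: the helper free_gib is hypothetical, /scratch is assumed to be the mount point of the scratch partition, and a quota on your home directory, if there is one, is not reflected in the numbers.

```python
import os
import shutil


def free_gib(path: str) -> float:
    """Return the free space of the file system containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # "/scratch" is assumed to be the scratch mount point; it is skipped if absent.
    for location in (os.path.expanduser("~"), "/scratch"):
        if os.path.isdir(location):
            print(f"{location}: {free_gib(location):.1f} GiB free")
```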

First, create a folder for your project, for example

[ga45can@cip2ryzen3] ~ $ mkdir -p /scratch/ga45can/My_Awesome_Python_Project

Second, change into the project folder and create a hidden folder inside it for the virtual environment. This step is not necessary, but it helps to keep your project folder clean. The folder can have an arbitrary name.

[ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $ mkdir .venv

Now we can create the virtual environment

[ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $ virtualenv .venv/
created virtual environment CPython3.8.10.final.0-64 in 360ms
creator CPython3Posix(dest=/scratch/ga45can/My_Awesome_Python_Project/.venv, clear=False, global=False)
seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/stud/ga45can/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator

and enter it with

[ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $ source .venv/bin/activate
(.venv) [ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $

The change of the command prompt indicates we successfully entered the environment. Now we can install all the packages we desire, like

(.venv) [ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $ pip install tensorflow
Collecting tensorflow
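Besides the changed prompt, a script can also check by itself whether it runs inside a virtual environment. The sketch below relies on the documented behaviour that sys.prefix and sys.base_prefix differ inside an environment; the function name in_virtualenv is chosen for illustration only.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside an environment sys.prefix points at the environment, while
    # sys.base_prefix still points at the system installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return sys.prefix != base_prefix or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    print("virtual environment active:", in_virtualenv())
```

When the environment from above is active, the interpreter path should point into the .venv folder of your project.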

Workaround for large packages

If you do not have much space left in your home directory and need to install large packages like PyTorch (~800MB), the install might fail because pip caches the downloaded package in $HOME/.cache before installing it. To get around this problem, you can either symlink your cache directory to a directory you create on the scratch partition, or tell pip not to store the download in your home directory. To achieve the latter, override the environment variable TMPDIR for the duration of the install, like

(.venv) [ga45can@cip2ryzen2] /scratch/ga45can/My_Awesome_Python_Project $ TMPDIR=/scratch pip install torch
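Setting a variable in front of a command changes it for that one process only. The sketch below illustrates the mechanism with Python's tempfile module, which honours TMPDIR; that pip picks its temporary directory the same way is an assumption here, and the helper tempdir_seen_by_child as well as the throwaway directory standing in for scratch are made up for the example.

```python
import os
import subprocess
import sys
import tempfile


def tempdir_seen_by_child(tmpdir: str) -> str:
    """Start a child Python process with TMPDIR set and return its temp directory."""
    env = {**os.environ, "TMPDIR": tmpdir}  # override for this one child only
    result = subprocess.run(
        [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # A throwaway directory stands in for a directory on the scratch partition.
    with tempfile.TemporaryDirectory() as scratch_like:
        print("child uses: ", tempdir_seen_by_child(scratch_like))
    print("parent uses:", tempfile.gettempdir())
```

The parent process keeps its original temporary directory, which mirrors how the shell session is unaffected after the install command has finished.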

Jupyter in virtual environments

Jupyter notebooks are a great way to develop Python code. How to access notebooks running on the CIP machines from your own machine is explained here. However, if you want to use the Python packages you installed in a virtual environment, you need to make them visible to Jupyter. There are two possibilities.

Install Jupyter into the virtual environment

This is the easiest option. Enter your virtual environment, install Jupyter and start the notebook server, i.e.

[ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $ source .venv/bin/activate
(.venv) [ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $ pip install jupyter
Collecting jupyter
(.venv) [ga45can@cip2ryzen3] /scratch/ga45can/My_Awesome_Python_Project $ jupyter-notebook

The default kernel (Python 3) is pointing to your virtual environment automatically.

Install IPython kernel pointing to your virtual environment

If you do not want to enter your virtual environment every time you start working on your code, or if you have multiple environments you want to access from the same notebook server, this option may be more convenient. Enter the virtual environment you want to access with Jupyter and install the ipykernel package:

[ga45can@tuphcom-cip1sandy8] /scratch/ga45can/My_Awesome_Python_Project $ source .venv/bin/activate
(.venv) [ga45can@tuphcom-cip1sandy8] /scratch/ga45can/My_Awesome_Python_Project $ pip install ipykernel
Collecting ipykernel

Now we can install a kernel pointing to this environment with

(.venv) [ga45can@tuphcom-cip1sandy8] /scratch/ga45can/My_Awesome_Python_Project $ python -m ipykernel install --user --name=MyAwesomePythonProject
Installed kernelspec MyAwesomePythonProject in /home/stud/ga45can/.local/share/jupyter/kernels/myawesomepythonproject

Now you can leave the environment and start the notebook server. You should now be able to select a kernel pointing to your virtual environment. You can repeat this for as many environments as you need.
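If you lose track of which kernel belongs to which environment, you can inspect the kernel.json files that were written during the installation. The following sketch assumes the per-user kernel directory from the "Installed kernelspec" message above and that the first entry of argv in each kernel.json is the interpreter of the environment; the function name list_kernels is hypothetical.

```python
import json
from pathlib import Path


def list_kernels(kernels_dir: Path) -> dict:
    """Map each kernel name under `kernels_dir` to the interpreter it starts."""
    kernels = {}
    for spec_file in sorted(kernels_dir.glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        argv = spec.get("argv") or [None]  # tolerate specs without argv
        kernels[spec_file.parent.name] = argv[0]
    return kernels


if __name__ == "__main__":
    # Per-user location taken from the "Installed kernelspec" message above.
    user_kernels = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    for name, interpreter in list_kernels(user_kernels).items():
        print(f"{name}: {interpreter}")
```

A kernel whose interpreter path lies inside the .venv folder of a project uses the packages of that environment.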
