Using GPUs
----------
With the release of SLURM we introduced a number of dedicated nodes with two
flavors of Nvidia GPUs attached, to be used with CUDA-enabled code.
The following nodes are currently available:

======== =============== ====== ===== ========= ============
Nodename GPU Type        Memory Count Partition Architecture
======== =============== ====== ===== ========= ============
gpu-1    GTX 1080 TI     12 GB  2     test      Pascal
gpu-2    GTX 1080        8 GB   3     gpu       Pascal
gpu-3    GTX 1080        8 GB   3     gpu       Pascal
gpu-4    Quadro RTX 5000 16 GB  4     gpu       Turing
gpu-5    Quadro RTX 5000 16 GB  4     gpu       Turing
gpu-6    Quadro RTX 5000 16 GB  4     gpu       Turing
gpu-7    Quadro RTX 5000 16 GB  4     gpu       Turing
======== =============== ====== ===== ========= ============

Both the 12 GB 1080 TI and the 8 GB 1080 are grouped under the name **pascal**.
The short name for the more powerful Quadro cards is **turing**.
To request any GPU, you can use ``-p gpu --gres gpu:1``, or ``-p test --gres
gpu:1`` if you want to test things. The ``gres`` parameter is quite flexible and
also allows you to request a specific GPU group/architecture (**pascal** or
**turing**).
For example, to request two GeForce 1080 cards, use ``--gres gpu:pascal:2``.
This effectively hides all other GPUs and grants exclusive use of the requested
devices.
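The same request can also be made from a batch job. A minimal sketch of such a
job script (``my_cuda_app`` is a placeholder for your own program, and the core
and memory numbers are just examples):

```shell
#!/bin/bash
#SBATCH -p gpu                 # GPU partition from the table above
#SBATCH --gres gpu:pascal:2    # two cards from the pascal group
#SBATCH -c 4                   # CPU cores (pick what your job needs)
#SBATCH --mem 8gb

# my_cuda_app is a placeholder for your own CUDA-enabled program
srun ./my_cuda_app
```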
You can use the ``nvidia-smi`` tool in an interactive job, or the node-specific
charts, to get an idea of a device's utilization.
Any code that supports CUDA up to version 10.1 should work out of the box;
this includes Python's pygpu and MATLAB's GPU-enabled libraries.
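CUDA applications find their devices through the ``CUDA_VISIBLE_DEVICES``
environment variable, which SLURM sets to the devices granted via ``--gres``.
A small sketch for checking, from inside a job, how many GPUs were granted
(``count_gpus`` is a hypothetical helper, not part of our tooling):

```shell
#!/bin/sh
# count_gpus: hypothetical helper that counts the comma-separated device
# indices in its argument (pass "$CUDA_VISIBLE_DEVICES" to it).
count_gpus() {
    if [ -z "$1" ]; then
        echo 0
    else
        # one line per device index; tr -d ' ' strips the padding
        # some wc implementations add
        echo "$1" | tr ',' '\n' | wc -l | tr -d ' '
    fi
}

# Inside a job started with --gres gpu:pascal:2 this typically prints 2.
count_gpus "${CUDA_VISIBLE_DEVICES:-}"
```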
.. note::

   It is also possible to pass a requested GPU into a **singularity
   container**. You have to pass the ``--nv`` flag to any
   singularity calls, however.
Example: Request an interactive job (``srun --pty``) with 4 cores, 8 GB of
memory and a single card from the *turing* group. Instead of ``/bin/bash`` we use
the shell from a singularity container and tell singularity to prepare an
Nvidia environment with ``singularity shell --nv``::

   srun --pty -p gpu --gres gpu:turing:1 -c 4 --mem 8gb \
       singularity shell --nv /data/container/unofficial/fsl/fsl-6.0.4.sif
   Singularity> hostname
   gpu-4
   Singularity> nvidia-smi
   Tue Jul 14 18:38:14 2020
   +-----------------------------------------------------------------------------+
   | NVIDIA-SMI 418.74       Driver Version: 418.74       CUDA Version: 10.1     |
   |-------------------------------+----------------------+----------------------+
   | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
   | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
   |===============================+======================+======================|
   |   0  Quadro RTX 5000     Off  | 00000000:3B:00.0 Off |                  Off |
   | 33%   28C    P8    10W / 230W |      0MiB / 16095MiB |      0%      Default |
   +-------------------------------+----------------------+----------------------+

   +-----------------------------------------------------------------------------+
   | Processes:                                                       GPU Memory |
   |  GPU       PID   Type   Process name                             Usage      |
   +-----------------------------------------------------------------------------+
   |=============================================================================|
   |  No running processes found                                                 |
   +-----------------------------------------------------------------------------+
   Singularity>
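The same container can also be used non-interactively from a batch job. A
sketch, assuming the resource numbers from the interactive example above
(``singularity exec`` runs a single command instead of an interactive shell):

```shell
#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres gpu:turing:1
#SBATCH -c 4
#SBATCH --mem 8gb

# --nv again prepares the Nvidia environment inside the container
singularity exec --nv /data/container/unofficial/fsl/fsl-6.0.4.sif nvidia-smi
```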