Commands
--------

Submitting
+++++++++++

**Non-Interactively**

Add a single job to the default queue:

.. code-block:: bash

   sbatch job.slurm

Submit the same job with some resource requests and a name:

.. code-block:: bash

   sbatch --cpus-per-task 2 --mem 8G --job-name test job.slurm
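
The same requests can also be stored in the job script itself as ``#SBATCH``
directives. The following is only a sketch of what such a ``job.slurm`` could
look like; its contents are an assumption, since the actual script is not shown
here:

.. code-block:: bash

    #!/bin/bash
    # Hypothetical job.slurm: the directives mirror the sbatch flags used above.
    #SBATCH --job-name=test
    #SBATCH --cpus-per-task=2
    #SBATCH --mem=8G

    # Slurm exports SLURM_CPUS_PER_TASK inside the job when --cpus-per-task is
    # set; fall back to 1 so the script also runs outside of Slurm.
    cpus="${SLURM_CPUS_PER_TASK:-1}"
    echo "Running on $(hostname) with ${cpus} CPU(s)"

Options given on the ``sbatch`` command line take precedence over the
directives in the script, so the script can carry sensible defaults.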

Submit a job to the gpu partition, requesting 2 GPUs on a single node:

.. code-block:: bash

    sbatch -p gpu --gres gpu:2 job.slurm

Wrap bash commands into a job on the fly:

.. code-block:: bash

    sbatch --wrap "module load R ; Rscript main.R"

**Interactively/Blocking**

Quick interactive, dual-core shell in the test partition:

.. code-block:: bash

    srun -p test -c2 --pty bash

Querying
++++++++

You can use ``squeue`` to get all information about queued or running jobs. This
example limits the output to jobs belonging to the user ``krause``:

.. code-block:: bash

    [krause@master ~] squeue  -u krause
                 JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                110996     short     test   krause  R       0:12      1 ood-43
                110997       gpu job.slur   krause  R       0:08      1 gpu-4
                110995     short job.slur   krause  R       0:15      1 ood-43

As you can see, there are three jobs: two of them are in the default partition
(**short**) and one has been sent to the gpu partition. They are all in the
running (R) state (ST) and have been running for a few seconds (TIME).
``squeue`` is very powerful and its output can be arbitrarily configured using
format strings. Check out ``squeue -o "%all"`` and have a look at the man page
with ``man squeue``.
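
As one possible customization (a sketch; the column widths are arbitrary
choices), the format string below keeps the default columns but widens NAME to
20 characters, so that names such as ``job.slurm`` are no longer truncated to
``job.slur``. ``squeue`` reads its default format from the ``SQUEUE_FORMAT``
environment variable:

.. code-block:: bash

    # Default squeue columns, with NAME (%j) widened from 8 to 20 characters.
    export SQUEUE_FORMAT="%.18i %.9P %.20j %.8u %.2t %.10M %.6D %R"

With this set, for example in ``~/.bashrc``, a plain ``squeue -u $USER`` uses
the wider layout. The same string can be passed to a single call with ``-o``.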

To get live metrics from the job you have to use 

To look up historical (accounting) data there is ``sacct``. Again, all output columns can be configured. Example:

.. code-block:: bash

    [krause@master ~] sacct -o JobID,ReqMem,MaxRSS,CPUTime,ExitCode
           JobID     ReqMem     MaxRSS    CPUTime ExitCode
    ------------ ---------- ---------- ---------- --------
    110973              4Gc       936K   00:00:08      0:0
    110974              4Gc       936K   00:00:00      0:0
    110976              4Gc       944K   00:00:03      0:0

Deleting
++++++++

You can cancel a specific job by running ``scancel JOBID`` or all of
your jobs at once with ``scancel -u $USER``. This is a bit
different from Torque, as there is no special **all** placeholder.
Instead, you ask the system to cancel all jobs matching your
username. Of course, it is not possible to accidentally cancel other
users' jobs.