Commit da3c8ec6 authored by Michael Krause

pbs: add note about dependencies

parent 3f224fb6
......@@ -60,6 +60,7 @@ List of Contents
pbs/jobs
pbs/commands
pbs/resources
pbs/dependencies
.. toctree::
:maxdepth: 1
......
Job Dependencies
================
In certain scenarios it may be useful to submit jobs or chunks of jobs which depend on each other. For example, consider this three-stage pipeline:

.. image:: ../img/job_stages.svg

In a first stage there are four jobs with ids 1 through 4. These four jobs can
be run in parallel. After that, and only when all of those jobs have finished,
another job (id 5) runs on the results of that first stage to produce some
intermediate value in stage two. With that value another set of four independent
jobs with ids 6 through 9 can run in a third stage.

While it's certainly possible to just split these three stages and submit them
manually one after the other, Torque and other PBS environments expose
dependency systems to specify the structure of the three stages so you only
have to submit once.
Submit Example
--------------
Every time you submit a job with ``qsub`` you can attach a dependency list to
it so that it only runs under certain conditions. For instance, to run job 5
only after jobs 1 to 4 have finished, we could do something like this:

.. code-block:: bash

   $ qsub stage1_job.pbs
   1.tardis.mpib-berlin.mpg.de
   [...]
   $ qsub stage1_job.pbs
   4.tardis.mpib-berlin.mpg.de
   $ qsub stage2_job.pbs -W depend=afterany:1:2:3:4
   5.tardis.mpib-berlin.mpg.de
   [...]

The syntax is always ``-W depend=<type>:<colon-separated-list-of-job-ids>``. You
can extend the dependency graph as you like; in the example we could submit
stage 3 right after submitting stage 2. Note that you always have to save and
use the actual job ids: they are not predictable, because you never know
whether other users submit jobs in the meantime.
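
As a rough illustration of this syntax, the following sketch joins a dependency
type and a list of job ids into the argument expected by ``-W``. The function
name ``depend_arg`` is made up for this example and is not part of Torque.

.. code-block:: bash

   #!/bin/bash

   # depend_arg is a hypothetical helper, not a Torque command.
   # Usage: depend_arg <type> <job-id> [<job-id> ...]
   depend_arg() {
       if [ "$#" -lt 2 ]; then
           echo "usage: depend_arg <type> <job-id> [<job-id> ...]" >&2
           return 1
       fi
       local type=$1
       shift
       local IFS=:    # "$*" joins its words with the first character of IFS
       echo "depend=${type}:$*"
   }

   depend_arg afterany 1 2 3 4    # prints: depend=afterany:1:2:3:4

The result could then be passed on, for example as
``qsub -W "$(depend_arg afterany 1 2 3 4)" stage2_job.pbs``.
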
Because job ids are not predictable, a common pattern is to capture the job id
after each ``qsub`` with ``id=$(qsub ...)`` and dynamically construct the list
of dependencies to be used in the following stage:

.. code-block:: bash

   $ # stage 1
   $ DEP=""
   $ for i in {1..4} ; do DEP="$DEP:$(qsub stage1_job.pbs)"; done
   $ # stage 2
   $ DEP=$(qsub stage2_job.pbs -W depend=afterany$DEP)
   $ # stage 3
   $ for i in {1..4} ; do qsub stage3_job.pbs -W depend=afterany:$DEP ; done

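Note that stage 2 uses ``afterany$DEP`` without a ``:`` in between. This is a
side effect of how the list is built: the loop prepends a ``:`` to every job
id, so ``$DEP`` already starts with one. In stage 3, ``$DEP`` holds only the id
of the stage 2 job, so the ``:`` is written explicitly.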
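
The same pattern can also be written as a small script. The following is only a
sketch under a few assumptions: it collects the ids in a bash array instead of a
string, the function names ``join_ids`` and ``submit_pipeline`` are invented for
this example, and the ``-W`` option is placed before the script name. The job
script names are the ones used above.

.. code-block:: bash

   #!/bin/bash

   # Join all arguments with ":" (for example: a b c -> a:b:c).
   join_ids() {
       local IFS=:
       echo "$*"
   }

   # Hypothetical wrapper that submits all three stages in one go.
   submit_pipeline() {
       local stage1_ids=() stage2_id id i

       # stage 1: four independent jobs, remember every id qsub prints
       for i in {1..4}; do
           id=$(qsub stage1_job.pbs) || return 1
           stage1_ids+=("$id")
       done

       # stage 2: one job that waits for all stage 1 jobs
       stage2_id=$(qsub -W "depend=afterany:$(join_ids "${stage1_ids[@]}")" \
                        stage2_job.pbs) || return 1

       # stage 3: four jobs that each wait for the stage 2 job
       for i in {1..4}; do
           qsub -W "depend=afterany:${stage2_id}" stage3_job.pbs || return 1
       done
   }

   # Only submit when qsub is actually available on this machine.
   if command -v qsub >/dev/null 2>&1; then
       submit_pipeline
   fi

Keeping the ids in an array avoids the leading ``:`` and stops submitting as
soon as one ``qsub`` call fails.
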

.. important::

   The dependency type can also take the exit status of the dependencies into
   account: ``afterok`` requires a return value of 0 (success) and
   ``afternotok`` requires a non-zero return value. For that to work the jobs
   have to return a proper exit status! If you're unsure or if you don't care
   about return values, use ``afterany`` as shown in the example.

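
For ``afterok`` and ``afternotok`` to be meaningful, the job script itself must
exit with the status of the work it ran. A minimal sketch of one way to do that
is shown below; ``run_step`` is a hypothetical helper and the program names in
the comments are placeholders, not files from this documentation.

.. code-block:: bash

   #!/bin/bash

   # Run one step and hand its exit status back to the caller unchanged.
   # run_step is a hypothetical helper, not part of Torque.
   run_step() {
       "$@"
       local status=$?
       if [ "$status" -ne 0 ]; then
           echo "step failed with status $status: $*" >&2
       fi
       return "$status"
   }

   # In a real job script the body could look like this (placeholder names):
   #   run_step ./preprocess.sh || exit $?
   #   run_step ./analyse.sh    || exit $?
   # Here the behaviour is only demonstrated with shell built-ins.
   run_step true  && echo "status after success: $?"
   run_step false || echo "status after failure: $?"

With a body like this, the job ends with a non-zero status as soon as one step
fails, which is the information ``afterok`` and ``afternotok`` rely on.
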