Job Dependencies
================

In certain scenarios it may be useful to submit jobs or chunks of jobs which depend on each other. For example, consider this three-stage pipeline:

.. image:: ../img/job_stages.svg


In the first stage there are four jobs with ids 1 through 4, which can run in
parallel. Only after all of them have finished does another job (id 5) run on
the results of that first stage to produce an intermediate value in stage two.
With that value another set of four independent jobs with ids 6 through 9 can
run in a third stage.

While it's certainly possible to split these three stages and submit them
manually one after the other, Torque and other PBS environments provide a
dependency mechanism that lets you describe the structure of the three stages
up front, so you only have to submit once.

Submit Example
--------------

Every time you submit a job with ``qsub`` you can add a dependency list to that
job, so that it only runs under certain conditions. For instance, to run job 5
only after jobs 1 to 4 have finished, we could do something like this:

.. code-block:: bash

   $ qsub stage1_job.pbs
   1.tardis.mpib-berlin.mpg.de
   [...]
   $ qsub stage1_job.pbs
   4.tardis.mpib-berlin.mpg.de
   $ qsub stage2_job.pbs -W depend=afterany:1:2:3:4
   5.tardis.mpib-berlin.mpg.de
   [...]

The syntax is always ``-W depend=<type>:<colon-separated-list-of-job-ids>``. You
can extend the dependency graph as you like. In the example we could submit
stage 3 right after submitting stage 2. Note that you always have to save and
use the actual job ids. They are not predictable, because you never know which
jobs other users might submit in the meantime.
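
The format can be wrapped in a small shell function. This is only a minimal
sketch: ``depend_arg`` is a hypothetical helper name, not part of Torque, and
it merely builds the string without calling ``qsub``:

.. code-block:: bash

   # depend_arg <type> <job-id> [<job-id> ...]
   # Prints the value for "-W depend=", e.g. afterany:1:2:3:4
   depend_arg() (
       [ "$#" -ge 2 ] || exit 1    # need a type and at least one job id
       dep_type=$1
       shift
       IFS=:                       # "$*" joins the remaining ids with ":"
       printf '%s:%s\n' "$dep_type" "$*"
   )

   depend_arg afterany 1 2 3 4     # prints afterany:1:2:3:4

The output could then be passed on as ``-W depend=$(depend_arg afterany ...)``,
with the real job ids in place of the dots.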

Because job ids are not predictable, a common pattern is to capture the job id
after each ``qsub`` with ``id=$(qsub ...)`` and dynamically construct a list of
dependencies to be used in the following stage:

.. code-block:: bash

   $ # stage 1
   $ DEP=""
   $ for i in {1..4} ; do DEP="$DEP:$(qsub stage1_job.pbs)"; done
   $ # stage 2
   $ DEP=$(qsub stage2_job.pbs -W depend=afterany$DEP)
   $ # stage 3
   $ for i in {1..4} ; do qsub stage3_job.pbs -W depend=afterany:$DEP ; done

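Note that stage 2 uses ``afterany$DEP`` without a ``:`` in between. This is a
side effect of how the list is built: every iteration prepends a ``:`` to the
new job id, so ``$DEP`` already starts with a ``:``. In stage 3, ``$DEP`` holds
only the single id of the stage 2 job, so the ``:`` has to be written
explicitly.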
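
For use in a script, the same pipeline can be sketched as a function. This is
an illustrative variant, not the only way to do it: ``submit_pipeline`` is a
hypothetical name, the job script names are the ones from the example above,
and the ``${deps:+$deps:}`` expansion is used so the list never starts with a
``:``:

.. code-block:: bash

   # Submits all three stages and prints the ids of the stage 3 jobs.
   submit_pipeline() (
       deps=""
       # stage 1: collect the ids, joined by ":" without a leading ":"
       for i in 1 2 3 4; do
           id=$(qsub stage1_job.pbs) || exit 1
           deps="${deps:+$deps:}$id"
       done
       # stage 2: starts once all stage 1 jobs have ended
       stage2=$(qsub stage2_job.pbs -W depend="afterany:$deps") || exit 1
       # stage 3: every job waits for the single stage 2 job
       for i in 1 2 3 4; do
           qsub stage3_job.pbs -W depend="afterany:$stage2" || exit 1
       done
   )

Calling ``submit_pipeline`` on a submit host queues all nine jobs at once. The
function stops at the first failing ``qsub`` instead of submitting later stages
with an incomplete dependency list.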

.. important::

    The dependency type can take the exit status of the job dependencies into
    account: use ``afterok`` to run only after an exit status of 0 (success),
    or ``afternotok`` to run only after a non-zero exit status (indicating an
    error). However, for that mechanism to work, all jobs have to return a
    proper exit status! If you're unsure, or if you don't care about exit
    statuses, use ``afterany`` as shown in the example.
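
As an illustration of the two status-dependent types, the following sketch
submits one follow-up job for the success case and one for the failure case.
``submit_followups`` and ``cleanup_job.pbs`` are hypothetical names chosen for
this example, and the sketch assumes the job passed in returns a meaningful
exit status:

.. code-block:: bash

   # submit_followups <job-id>
   submit_followups() (
       job_id=$1
       # eligible to run only if the job exited with status 0
       qsub stage3_job.pbs -W depend="afterok:$job_id" || exit 1
       # eligible to run only if the job exited with a non-zero status
       qsub cleanup_job.pbs -W depend="afternotok:$job_id" || exit 1
   )

Only one of the two follow-up jobs can become eligible to run, depending on how
the job they wait for exits.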