We mentioned job files as parameters to qsub in the last section. They are
a convenient way of collecting job properties without clobbering the command
line. It's also useful to programmatically create a job description and capture
it in a file.
Simple Jobs
-----------
The simplest job file consists of nothing more than a list of shell commands to
be executed. In that case it is equivalent to a shell script.
Example ``simple_job.pbs``:

.. code-block:: bash

   cd project/
   ./run_simulation.py
You can then submit that job with ``qsub simple_job.pbs``.
A file this simple is rarely sufficient, though. In most cases you are going to
need some resource requests and some varying parameter, as you are very likely
to submit multiple similar jobs. It is possible to add qsub parameters (see
:doc:`resources`) inside the job file.
Example ``job_with_resources.pbs``:

.. code-block:: bash

   #PBS -N myjob
   #PBS -l walltime=10:0:0
   #PBS -l mem=32gb
   #PBS -j oe
   #PBS -o $HOME/logs/
   #PBS -m n

   ./run_simulation.py
This would create a job called **myjob** that requests **10 hours** of running
time and **32 gigabytes of RAM**, with a **joined** stdout/stderr stream stored
in a folder called **$HOME/logs/**. It will **not send any e-mails** and start
in the **current directory**.
Sometimes it may be useful to get a quick shell on one of the compute nodes.
Before submitting hundreds or thousands of jobs you might want to run some
simple checks to ensure all the paths are correct and the software is loading
as expected. Although you can usually run these tests on the master itself,
there are cases where this is dangerous, for example when your tests quickly
require lots of memory. In that case you should move those tests to one of the
compute nodes:
.. code-block:: bash

   qsub -I
This will submit a job that requests an interactive shell. The submission will
block until the job gets scheduled; when there are lots of jobs in the queue,
this might take some time. To speed things up you can submit to the testing
queue, which only allows jobs with a very short running time:
.. code-block:: bash

   krause@master:~> $ qsub -I -q testing
   qsub: waiting for job 4465022.master.tardis.mpib-berlin.mpg.de to start
   qsub: job 4465022.master.tardis.mpib-berlin.mpg.de ready
   krause@ood-9:~> $
Another common pattern is a script that loops over some input space, creates a
job file line by line, and submits it at the end of each loop iteration.
Creating a file line by line can be done with the shell redirection operators
``>`` and ``>>``.
To **create or truncate** a file ``tmp.file`` **and then write** a single line to it, you can use ``>``:
.. code-block:: bash

   echo "this is some line" > tmp.file
To **append** to an existing file you can use ``>>``:
.. code-block:: bash

   echo "this is a second line" >> tmp.file
Using this operator a **job creation wrapper** could look like this:
.. code-block:: bash

   #!/bin/bash
   for input_file in INPUT/* ; do
       echo "#PBS -m n" > tmp.pbs
       echo "#PBS -o /dev/null" >> tmp.pbs
       echo "#PBS -j oe" >> tmp.pbs
       echo "#PBS -l walltime=96:0:0" >> tmp.pbs
       echo "#PBS -d ." >> tmp.pbs
       echo "./run_analysis.py $input_file" >> tmp.pbs
       qsub tmp.pbs
   done
A different syntax achieves much the same thing with a template file and
``sed`` (this shorter template omits the ``#PBS`` directives):
.. code-block:: bash

   #!/bin/bash
   cat > tmp.pbs <<EOF
   ./run_analysis.py %VAR1%
   EOF
   for input_file in INPUT/* ; do
       # use | as the sed delimiter because $input_file contains slashes
       sed "s|%VAR1%|${input_file}|" tmp.pbs | qsub
   done
   rm -f tmp.pbs
There are a number of **environment variables** available to each job, for instance:
.. code-block:: bash

   krause@ood-32:~> $ env | grep PBS | sort
   PBS_ARRAYID=1
   PBS_ENVIRONMENT=PBS_BATCH
   PBS_JOBCOOKIE=22E3D79E015B9F5EE9E205EBF8CB64E7
   PBS_JOBID=4294864-1.master.tardis.mpib-berlin.mpg.de
   PBS_JOBNAME=STDIN-1
   PBS_MOMPORT=15003
   PBS_NODEFILE=/var/spool/torque/aux//4294864-1.master.tardis.mpib-berlin.mpg.de
   PBS_NODENUM=0
   PBS_NUM_NODES=1
   PBS_NUM_PPN=1
   PBS_O_HOME=/home/mpib/krause
   PBS_O_HOST=master.tardis.mpib-berlin.mpg.de
   PBS_O_LANG=de_DE.UTF-8
   PBS_O_LOGNAME=krause
   PBS_O_MAIL=/var/mail/krause
   PBS_O_PATH=/home/mpib/krause/bin:/opt/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
   PBS_O_QUEUE=route
   PBS_O_SHELL=/bin/bash
   PBS_O_WORKDIR=/home/mpib/krause
   PBS_QUEUE=default
   PBS_SERVER=master
   PBS_TASKNUM=1
   PBS_VERSION=TORQUE-2.4.16
   PBS_VNODENUM=0
These variables are set individually for each job and can be used in your
scripts. For example, you could create a **unique directory or file** based on
``$PBS_JOBID``.
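A minimal sketch of that idea (the ``results/`` path and log file name are
illustrative assumptions; the fallback value only exists so the script can be
tried outside a job):

.. code-block:: bash

   #!/bin/bash
   # $PBS_JOBID is set by the scheduler inside a job; the local-test
   # fallback is an assumption so this can also run outside the cluster.
   OUTDIR="results/${PBS_JOBID:-local-test}"
   mkdir -p "$OUTDIR"
   echo "job started on $(hostname)" > "$OUTDIR/job.log"

Because ``$PBS_JOBID`` is unique per job, concurrently running jobs will never
write to the same directory.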
Another extremely useful feature is ``$PBS_ARRAYID``. When you submit a job
array with the ``-t`` parameter this variable holds the current index. So you
could easily run multiple unique jobs like this:
.. code-block:: bash

   echo './run_simulation $PBS_ARRAYID' | qsub -t 1-100 -d .
This will create 100 jobs with indices between 1 and 100, each starting in the
current working directory. **Note** the use of the **single quotes** here. This
is important, as you do not want to evaluate the variable ``$PBS_ARRAYID`` at
the time of submission but at the time of execution!
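The effect of the quoting is easy to check locally; here ``PBS_ARRAYID`` is set
by hand purely for illustration (on the cluster the scheduler sets it for each
task):

.. code-block:: bash

   #!/bin/bash
   PBS_ARRAYID=7  # set manually only for this demonstration
   echo './run_simulation $PBS_ARRAYID'   # single quotes: literal text survives
   echo "./run_simulation $PBS_ARRAYID"   # double quotes: expanded right now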
Alternatively you can quote the ``$`` character like this:
.. code-block:: bash

   echo "./run_simulation \$PBS_ARRAYID" | qsub -t 1-100 -d .