HPC job submission guidelines

The following gives more information on how to submit jobs to LSF, including some advanced usage guides.

Batch jobs

Usage:

bsub -q <queue> -o </path/jobout.log> <myjob>

This submits <myjob> to the specified queue. The -o flag writes the job output to a log file; using it is recommended for all batch jobs.

Example resource strings

Job requirement(s) and the corresponding -R option syntax:

  • Reserve 1 GB of memory for the job: bsub -R "rusage[mem=1000]" <myjob>
  • Reserve 1 GB of memory for the job on a single host: bsub -R "rusage[mem=1000] span[hosts=1]" <myjob>
  • Sort nodes by CPU and memory, and reserve 1 GB of memory: bsub -R "order[cpu:mem] rusage[mem=1000]" <myjob>
  • Order nodes by CPU utilisation (lightly loaded first): bsub -R "order[ut]" <myjob>
  • Multi-core job (e.g. four CPU cores on a single host): bsub -n 4 -R "span[hosts=1]" <myjob>
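
The resource strings in the examples above can also be assembled by a small helper, which avoids quoting mistakes when the values change. The following is a minimal sketch: the function name build_rstring is hypothetical, and it assumes the convention used in the examples (mem=1000 means 1 GB), which may differ on a cluster configured with other units. It only prints the command and does not call bsub.

```shell
#!/bin/bash
# Hypothetical helper: build a resource string for a single-host job.
# Assumes memory is written so that 1000 = 1 GB, as in the examples above.
build_rstring() {
    local mem_gb=$1
    printf 'rusage[mem=%d] span[hosts=1]' "$((mem_gb * 1000))"
}

# Dry run: print the command instead of submitting it.
echo "bsub -n 4 -R \"$(build_rstring 2)\" <myjob>"
```

Once the printed command looks right, it can be copied to the command line with <myjob> replaced by the real job.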

Multicore jobs

Sometimes you need to control how the selected processors for a parallel job are distributed across the hosts in the cluster.

You can control this at the job level or at the queue level. The queue specification is ignored if your job specifies its own locality.

By default, LSF allocates the required processors for the job from the available set of processors.

A parallel job may span multiple hosts, with a specifiable number of processes allocated to each host. A job may be scheduled on to a single multiprocessor host to take advantage of its efficient shared memory, or spread out on to multiple hosts to take advantage of their aggregate memory and swap space. The span string supports the following syntax:

span[hosts=1] indicates that all the processors allocated to this job must be on the same host. Please note that available nodes have a maximum of four slots each, so when using this option, do not request more than four with the -n flag.

e.g.

bsub -q medium -n 4 -R "span[hosts=1]" <myjob>

This will allocate four cores on a single machine.

bsub -q medium -n 12 -R "span[ptile=4] rusage[mem=xx]" -o /path/to/jobout <myjob>

This will allocate 12 cores spread across three nodes with xx amount of memory reserved for the job.
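
The host count in this example follows from dividing the number of slots requested with -n by the ptile value and rounding up. As a minimal sketch of that arithmetic (the function name hosts_needed is hypothetical, and the actual placement is always decided by LSF):

```shell
#!/bin/bash
# Number of hosts implied by -n <slots> with span[ptile=<ptile>]:
# ceil(slots / ptile), computed with integer arithmetic.
hosts_needed() {
    local slots=$1
    local ptile=$2
    echo $(( (slots + ptile - 1) / ptile ))
}

hosts_needed 12 4   # prints 3: the example above
hosts_needed 10 4   # prints 3: rounding up, the last host is only partly used
```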

Job dependencies

Sometimes, whether a job should start depends on the result of another job. To submit a job that depends on another job:

  1. Use the -w option of bsub (lowercase w).
  2. Select a dependency expression.
  3. Select a dependency condition.

Example 1:

bsub -q medium -R "rusage[mem=1000] span[hosts=1]" -J "dependent-1" -o ~/job.output myjob
Job <9773> is submitted to queue <medium>

bsub -q medium -R "rusage[mem=1000] span[hosts=1]" -J "dependent-2" -o ~/job.output -w 'done("dependent-1")' myjob
Job <9774> is submitted to queue <medium>

bsub -q medium -R "rusage[mem=1000] span[hosts=1]" -o ~/job.output -w 'ended(9773)' myjob
Job <9775> is submitted to queue <medium>

Example 2:

bsub -J "dependency_1" -q medium -R "rusage[mem=1000] span[hosts=1]" -o ~/lsf.output myjob
Job <9776> is submitted to queue <medium>

bsub -J "dependency_2" -q medium -R "rusage[mem=1000] span[hosts=1]" -o ~/output -w 'exit("dependency_1")&&post_done("dependency_2")' myjob
Job <9777> is submitted to queue <medium>

bsub -J "dependent_3" -q medium -R "rusage[mem=1000] span[hosts=1]" -o ~/lsf.output myjob
Job <9778> is submitted to queue <medium>

bsub -J "dependency_4" -q medium -R "rusage[mem=1000] span[hosts=1]" -o ~/lsf.output -w 'exit("dependent_3")||post_done("dependent_3")' myjob
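
When dependencies are chained from a script, the job ID can be taken from the confirmation line that bsub prints, rather than copied by hand. The following is a minimal sketch: the function name job_id_from is hypothetical, and a fixed sample line stands in for real bsub output so that the snippet runs on any machine. It prints the dependent submission instead of running it.

```shell
#!/bin/bash
# Extract the numeric job ID from a confirmation line such as
# "Job <9773> is submitted to queue <medium>".
job_id_from() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Stand-in for real output; on the cluster this would be something like:
#   first=$(bsub -q medium -o ~/job.output myjob | job_id_from)
first=$(echo 'Job <9773> is submitted to queue <medium>' | job_id_from)

# Dry run: print the dependent submission instead of running it.
echo "bsub -q medium -w 'done($first)' -o ~/job.output myjob"
```

Depending on a job ID rather than a job name avoids ambiguity when several jobs share the same name.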

Application profile

An LSF application profile defines common parameters for jobs and workflows of the same type or with similar resource requirements, including the execution requirements of the applications, the resources they require, and how they should be run and managed.

It gives you a simpler way to submit jobs without having to spell out all of the job's resource requirements. For instance, to submit a job (for the sake of argument, let's call the workflow nsv4) that requires 4 CPU cores and 4 GB of memory, you would normally submit it like this:

bsub -q medium -n 4 -R "span[ptile=4] rusage[mem=4000]" -o /path/to/jobout <myjob>

With an application profile, the job requirements are predefined in the LSF configuration, so the submission becomes:

bsub -q medium -app <application name> -o /path/to/jobout <myjob>

Run bapp to view a particular application profile or all profiles defined in the cluster. To see the complete configuration of each application profile, run bapp -l. If you require an application profile to be created for your job, please raise a ticket with the

  • preferred name of the profile
  • complete list of your job's resource requirements (viz. number of cores, approximate memory, expected runtime, local disk requirement, etc.)

Throttling jobs

If you are submitting large quantities of jobs and/or long-running jobs (typically jobs that run for hours or days), please be mindful of other users of the cluster. We strongly advise you to throttle these jobs.

Throttling means you can submit all jobs at once but limit the number of jobs RUNNING concurrently. LSF offers a number of ways to do this:

  • Control number of running jobs via job groups

To enable job throttling via job groups, the job groups need to be created first. Please raise an INFRA ticket with the relevant information to create the job group, viz. job group name, who can access the job group, and the limit on the number of running jobs.

e.g. Here is an example of submitting jobs using a job group named myjobgroup with a running job limit of 50. You may submit as many jobs as you wish, but LSF will allow at most 50 of them to run at one time, with the remaining jobs starting as running ones finish.

bsub -q medium -g /bio/myjobgroup <rest of the submission>

  • Control number of running jobs via job array

e.g. To submit an array of 100 jobs and throttle it to 10 running at once

bsub -q medium -J "myArray[1-100]%10" <rest of the submission>

  • Control number of running jobs via policies/advance reservation/application profile

To enable this, you'll need to raise an INFRA ticket with the relevant information to create the policies, viz. policy/application profile name, who can access the policy, number of running jobs, etc. The submission will then be

via application profile

bsub -q medium -app myapp <rest of the submission>

via reservation ID

bsub -q medium -U <Reservation ID> <rest of the submission>
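
For the job array approach above, each element of the array usually needs to work on a different input. LSF sets the environment variable LSB_JOBINDEX to the element's index inside the job, and the sketch below uses it to choose an input file. The function name input_for_index and the sample_<index>.txt naming scheme are hypothetical, and the default index of 1 exists only so that the snippet also runs outside LSF.

```shell
#!/bin/bash
# Hypothetical naming scheme: one input file per array element.
input_for_index() {
    echo "sample_$1.txt"
}

# Inside an array job LSF sets LSB_JOBINDEX (1-100 for myArray[1-100]).
# Fall back to 1 so the sketch can be tried outside LSF.
index=${LSB_JOBINDEX:-1}

echo "array element $index would process $(input_for_index "$index")"
```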

Using scripts to submit your job

In order to avoid long commands, the arguments explained above can be combined in a bash script, which can then be submitted with just:

bsub < <myscript.sh>

Note the < sign above, indicating that your script is fed into the submission command.

The idea of <myscript.sh> is that the arguments are now given in the header, on lines starting with #BSUB, for example:

#!/bin/bash
#BSUB -q <your_queue>
#BSUB -P <yourProject>
#BSUB -o <path_to/job.%J.out>
#BSUB -e <path_to/job.%J.err>
#BSUB -J <jobName>
#BSUB -R "rusage[mem=1000] span[hosts=1]"
#BSUB -n 2
#BSUB -cwd "<your_dir>"

module load <moduleName>

script

To know what each of these options means, read the documentation above and the introduction here. Note that you can add and remove arguments as needed.
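
If many similar scripts are needed, the header can be generated rather than edited by hand. The following is a minimal sketch, not a site-provided tool: the function name make_job_script, its argument order and the output file names are hypothetical. It prints a script to standard output so that it can be inspected before being saved and submitted with bsub < <myscript.sh>.

```shell
#!/bin/bash
# Hypothetical generator: print a submission script for the given settings.
# Arguments: queue, job name, number of cores, memory reservation.
make_job_script() {
    local queue=$1 name=$2 cores=$3 mem=$4
    cat <<EOF
#!/bin/bash
#BSUB -q $queue
#BSUB -J $name
#BSUB -n $cores
#BSUB -R "rusage[mem=$mem] span[hosts=1]"
#BSUB -o job.%J.out
#BSUB -e job.%J.err

echo "running on \$(hostname)"
EOF
}

# Print the generated script for inspection.
make_job_script medium demo 2 1000
```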