## What is an HPC?
A High Performance Cluster (HPC) provides centralised compute for large-scale analysis: instead of running everything on your own machine, you submit jobs to a shared pool of powerful nodes.
```mermaid
flowchart TD
    A(Researcher submits job) --> B[Master host and candidates]
    B --> C[Queues]
    C --> |jobs wait in queues until the required resources are ready| D[Resources]
    D <-.-> |Master host and resources are in frequent communication| B
    D --> E(Job runs and finishes)
    classDef researcher fill:#DF007D,stroke:#DF007D,color:#FFFFFF;
    class A,E researcher;
    classDef RE fill:#FFC6E6,stroke:#FFC6E6,color:#2B2F3B;
    class B,C,D RE;
```
## Overview of usage
To use the HPC, you start by logging onto the cluster. This brings you to the login node. From here you can `cd` into your working folder. It is possible to load and run software from the login node, but to make use of the full compute resources, you should launch jobs.
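A typical first session follows the steps above; this is only an illustrative sketch, where the host name, folder path, queue name and script name are placeholders rather than the real cluster values:

```shell
# Placeholder host name: replace with the real cluster address
ssh username@cluster.example.com

# Move from the login node's home area to your working folder (placeholder path)
cd /re_gecip/my_project

# Launch a batch job rather than running the analysis on the login node
# (queue name and script are placeholders)
bsub -q medium -o job.%J.out -e job.%J.err ./my_analysis.sh
```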
```mermaid
%%{init: {"flowchart": {"htmlLabels": false, 'curve': 'linear'}} }%%
flowchart TB
    subgraph "`RE`"
    direction TB
    B["Research Environment"] --> C["Terminal"]
    end
    subgraph "`HPC`"
    direction TB
    D["`**Login node** low resourced`"] -- "`create job`" --> E["`**Worker node** high resourced`"]
    end
    subgraph "`Weka`"
    direction LR
    F[Weka storage] --> G["`**discovery_forum:** read/write folder for Industry Research Network members`"]
    F --> H["`**re_gecip:** read/write folder for Academic Research Network members`"]
    F --> I["`**genomes:** read only, contains all consented genomes`"]
    F --> J["`**public_data_resources:** read only, contains public resources, eg gnomAD`"]
    F --> K["`**gel_data_resources:** read only, contains GEL-generated datasets, eg AggV2`"]
    end
    A("`Researcher`") --> RE
    C -- "`ssh`" --> HPC
    HPC --> Weka
    RE --> Weka
    classDef node fill:#FFC6E6,stroke:#FFC6E6,color:#2B2F3B;
    class A,B,C,D,E,F,G,H,I,J,K node;
```
## Terminology
| Term | Meaning |
| --- | --- |
| LSF | Load Sharing Facility, the tool we use to schedule jobs on the HPC |
| CPU | Central Processing Unit, the main processors |
| Nodes | Ephemeral storage, networking, memory and processing resources that can be consumed by virtual machine instances. Sometimes referred to as hosts. |
| Job | A task that you run on the HPC. Jobs can spawn other jobs. |
| Queue | When you submit a job, it joins a queue. You can choose which queue to join, depending on the length of the job. |
| Batch jobs | A job that you set off, which then runs independently in the background |
| Interactive jobs | A job that opens access to the HPC, allowing you to run commands and tools on the cluster |
| Running | A job that is in progress |
| Pending | A job that is waiting in the queue |
| Working directory | The folder where you put all your files |
| Standard output | Information about the job as it runs. If you were running a job normally, this would appear in the terminal; on an HPC, you should set a file to write it to. |
| Standard error | Information about errors from the job. If you were running a job normally, this would appear in the terminal; on an HPC, you should set a file to write it to. |
| Scratch | A location to write any temporary files created during the job |
| Project code | Researchers are grouped based on Research Network membership, with compute resources shared between the groups |
| Modules | Software that has been loaded onto the HPC, which you can use in your analysis |
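The batch and interactive job types above correspond to two forms of the LSF `bsub` command. A minimal sketch, in which the queue names, script name and job ID are placeholders:

```shell
# Batch job: runs independently in the background;
# standard output and error are written to files (%J expands to the job ID)
bsub -q medium -o myjob.%J.out -e myjob.%J.err ./run_analysis.sh

# Interactive job: opens a shell on a worker node so you can run
# commands and tools on the cluster directly
bsub -q inter -Is /bin/bash

# Check the state of your jobs (Running or Pending)
bjobs

# Kill a job when you have finished with it (placeholder job ID)
bkill 12345
```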
## Usage guidelines
**DO**

- DO launch interactive jobs to run software on the HPC.
- DO kill interactive jobs when you've finished with them.
- DO choose the appropriate length queue for your job.
- DO estimate the memory required for your job.
- DO use scripts to launch your batch jobs.
- DO set LSF parameters (`#BSUB`) within your scripts for improved traceability in your batch jobs.
- DO specify the location for your standard output and error to help with troubleshooting.
- DO use scratch directories for your temporary files.
- DO use containers to import software.
- DO work with interactive coding tools such as RStudio and Jupyter on the HPC.
- DO set up `.netrc` to use the LabKey API.
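Several of these recommendations can be combined in a single submission script using `#BSUB` directives. This is only a sketch: the queue name, project code, module name, memory value and paths are illustrative placeholders, and memory units depend on how the site has configured LSF.

```shell
#!/bin/bash
#BSUB -q medium                 # queue matched to the expected length of the job
#BSUB -P my_project_code        # project code for your Research Network group (placeholder)
#BSUB -J my_analysis            # job name, for traceability
#BSUB -o logs/job.%J.out        # standard output file (%J expands to the job ID)
#BSUB -e logs/job.%J.err        # standard error file
#BSUB -M 8000                   # estimated memory limit (placeholder value)

# Load software installed as a module (placeholder module name)
module load samtools

# Write temporary files to scratch, not to backed-up folders (placeholder path)
export TMPDIR=/scratch/my_username

./run_analysis.sh
```

With LSF, submit the script via standard input so the `#BSUB` directives are read: `bsub < my_script.sh`.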
**DON'T**

- DON'T run software on the login node.
- DON'T request more memory than you need.
- DON'T keep your temporary files in folders that will be backed up.