HPC @ Uni.lu

High Performance Computing in Luxembourg

Introduction

When granted access to the UL HPC platform you will have at your disposal parallel computing resources.

Thus you will be able to run:

  • ideally, parallel jobs (OpenMP, MPI, CUDA, OpenCL)
  • serial tasks/jobs, if your workflow requires them, provided that you run them efficiently

In particular, avoid submitting a purely serial job when you request a full node with OAR: doing so wastes computational power (11 out of 12 cores on Gaia!). When requesting a full node, make sure you run as many serial tasks at once as there are cores in the node.

You can find in our GitHub repository examples of launcher scripts that can be tailored to your own use case.
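As a rough illustration of the point about matching tasks to cores, the sketch below works out how many serial tasks could run at once. It assumes that, inside an OAR job, $OAR_NODEFILE holds one line per reserved core (so the count is exact for a single-node reservation), and it falls back to the local core count when run outside a job. The NB_CORES variable name is only illustrative.

```shell
#!/bin/bash
# Sketch: work out how many serial tasks can run at once in the current job.
# Assumption: inside an OAR job, $OAR_NODEFILE lists one line per reserved core.
# Outside a job we fall back to the cores visible on the local machine.
if [ -n "${OAR_NODEFILE:-}" ] && [ -r "$OAR_NODEFILE" ]; then
    NB_CORES=$(wc -l < "$OAR_NODEFILE" | tr -d ' ')
else
    NB_CORES=$(getconf _NPROCESSORS_ONLN)
fi
echo "Up to $NB_CORES serial tasks can run concurrently"
```

A launcher could then use this value as the upper bound on the number of tasks it starts simultaneously.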

Use cases

Serial jobs

Described below is an OAR job script containing two examples of what a user should not do:

#OAR -l nodes=1
#OAR -n BADSerial
#OAR -O BADSerial-%jobid%.log
#OAR -E BADSerial-%jobid%.log
if [ -f /etc/profile ]; then
. /etc/profile
fi

# Example 1: run in sequence $TASK 1...$TASK $NB_TASKS
for i in $(seq 1 $NB_TASKS); do
    $TASK $i
done

# Example 2: For each line of $ARG_TASK_FILE, run in sequence
# $TASK <line1> ... $TASK <lastline>
while read line; do
    $TASK $line
done < $ARG_TASK_FILE

Launch serial jobs in parallel

A better approach, launching the serial tasks all at once, would be:

# Example 1: run in parallel $TASK 1...$TASK $NB_TASKS
for i in $(seq 1 $NB_TASKS); do
   $TASK $i &
done
wait

# Example 2: For each line of $ARG_TASK_FILE, run in parallel
# $TASK <line1> ... $TASK <lastline>
while read line; do
    $TASK $line &
done < $ARG_TASK_FILE
wait

Note that each task is sent to the background with &, and that the wait command ensures all tasks have finished before the job ends.
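The loops above start every task at once, which oversubscribes the node as soon as there are more tasks than cores. One possible workaround, sketched below, is to launch the tasks in batches and wait between batches. The task function, the MAX_JOBS limit and the temporary output directory are hypothetical stand-ins for your own $TASK and reservation.

```shell
#!/bin/bash
# Sketch: run serial tasks in the background, at most MAX_JOBS at a time.
NB_TASKS=10
MAX_JOBS=4                   # assumption: number of cores reserved on the node
OUTDIR=$(mktemp -d)

task() {
    # Placeholder workload standing in for $TASK: record which task ran.
    echo "task $1 done" > "$OUTDIR/task_$1.out"
}

for i in $(seq 1 "$NB_TASKS"); do
    task "$i" &
    # After every MAX_JOBS launches, wait for the whole batch to finish.
    if [ $((i % MAX_JOBS)) -eq 0 ]; then
        wait
    fi
done
wait                         # catch the last, possibly incomplete, batch

NB_DONE=$(ls "$OUTDIR" | wc -l | tr -d ' ')
echo "$NB_DONE tasks completed"
rm -rf "$OUTDIR"
```

Batching is simple but leaves cores idle while the slowest task of each batch finishes; GNU Parallel, described in the next section, keeps all slots busy instead.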

Using GNU Parallel with serial jobs

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. It can be used on the UL HPC clusters to easily parallelize your tasks. Its full documentation and examples are available on the clusters with man parallel.

Example 1: run in parallel $TASK 1 … $TASK $NB_TASKS

# On a single node (run up to 12 tasks at once on the node)
seq $NB_TASKS | parallel -u -j 12 $TASK {}

# On many nodes (run up to 4 tasks at once on each node)
seq $NB_TASKS | parallel --tag -u -j 4 --sshloginfile ${GP_SSHLOGINFILE}.task $TASK {}

Example 2: for each line of $ARG_TASK_FILE, run in parallel $TASK <line1> … $TASK <lastline>

# On a single node (run up to 12 tasks at once on the node)
cat $ARG_TASK_FILE | parallel -u -j 12 --colsep ' ' $TASK {}

# On many nodes (run up to 4 tasks at once on each node)
cat $ARG_TASK_FILE | parallel --tag -u -j 4 --sshloginfile ${GP_SSHLOGINFILE}.task --colsep ' ' $TASK {}
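The multi-node commands above read their host list from ${GP_SSHLOGINFILE}.task, which is not created in these snippets. GNU Parallel expects one login per line in such a file, so one plausible way to build it is to deduplicate the node file, as sketched below. The node names, the temporary paths and the NODEFILE and NB_HOSTS variables are made up so that the sketch runs outside a job. How GNU Parallel actually reaches the nodes (for example through oarsh) is site-specific and not covered here.

```shell
#!/bin/bash
# Sketch: derive a GNU Parallel login file (one host per line) from the
# list of reserved cores.
WORKDIR=$(mktemp -d)
NODEFILE="$WORKDIR/nodefile"     # inside a job: NODEFILE=$OAR_NODEFILE
printf '%s\n' node-1 node-1 node-2 node-2 > "$NODEFILE"   # made-up hosts

# Hypothetical location for the login file; adapt it to your launcher.
GP_SSHLOGINFILE="$WORKDIR/gnuparallel_hostfile"

# Keep each host only once: the -j option then sets the tasks per host.
sort -u "$NODEFILE" > "${GP_SSHLOGINFILE}.task"

NB_HOSTS=$(wc -l < "${GP_SSHLOGINFILE}.task" | tr -d ' ')
echo "$NB_HOSTS hosts written to ${GP_SSHLOGINFILE}.task"
```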

Parallel jobs (MPI)

You can compile, link and run your applications with different MPI implementations:

Using OpenMPI and GCC:

(node)$> module purge
(node)$> module load OpenMPI
(node)$> make clean && make
(node)$> mpirun -x LD_LIBRARY_PATH -hostfile $OAR_NODEFILE /path/to/mpi_prog

Using MVAPICH2 and GCC:

(node)$> module purge
(node)$> module load MVAPICH2
(node)$> make clean && make
(node)$> mpirun -hostfile $OAR_NODEFILE /path/to/mpi_prog

Using the Intel Cluster Toolkit Compiler Edition (ictce for short):

(node)$> module purge
(node)$> module load ictce
(node)$> make clean && make
(node)$> mpirun -launcher ssh -launcher-exec /usr/bin/oarsh -hostfile $OAR_NODEFILE /path/to/mpi_prog

Using MPICH 3/Clang:

(node)$> module purge
(node)$> module load MPICH
(node)$> make clean && make
(node)$> mpirun -hostfile $OAR_NODEFILE /path/to/mpi_prog

Application specific examples

MATLAB

Described below are examples of using MATLAB on the clusters, in interactive or passive (batch) mode. Interactive mode can be used for quick testing of MATLAB scripts or to run the full graphical environment for development, while long-running jobs should be run in passive mode.

To run the graphical MATLAB interface (e.g. on the Gaia cluster) in an interactive session:

# Connect to Gaia with X11 forwarding enabled:
(yourmachine)$> ssh access-gaia.uni.lu -X

# Request an interactive job (the default parameters get you 1 core for 2 hours):
(gaia-frontend)$> oarsub -I

# Load the (latest) MATLAB module:
(node)$> module load MATLAB

# Launch MATLAB, whose full interface will be displayed on your machine (may be slow due to the network):
(node)$> matlab

To run the text-mode MATLAB interface in an interactive session, for testing/quick executions:

# Connect to Gaia
(yourmachine)$> ssh access-gaia.uni.lu
(gaia-frontend)$> oarsub -I
(node)$> module load MATLAB

# Launch MATLAB with the graphical display mode disabled:
(node)$> matlab -nodisplay -nosplash
Opening log file:  /home/users/vplugaru/java.log.3258
                     < M A T L A B (R) >
           Copyright 1984-2013 The MathWorks, Inc.
             R2013a (8.1.0.604) 64-bit (glnxa64)
                      February 15, 2013
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
>> version()
ans =
8.1.0.604 (R2013a)
>> quit()

For non-interactive or long executions, MATLAB can be run in passive mode, reading all commands from an input file you provide (e.g. named INPUTFILE.m) and saving the results in an output file (e.g. named OUTPUTFILE.out):

matlab -nodisplay -nosplash < INPUTFILE.m > OUTPUTFILE.out
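To make the passive-mode command concrete, the sketch below writes a tiny input script and feeds it to MATLAB on standard input. The MATLAB commands in the script and the temporary working directory are hypothetical examples. The guard around the matlab call only exists so that the sketch fails gracefully where the MATLAB module is not loaded.

```shell
#!/bin/bash
# Sketch: create a small MATLAB input script and run it in passive mode.
WORKDIR=$(mktemp -d)
INPUTFILE="$WORKDIR/INPUTFILE.m"
OUTPUTFILE="$WORKDIR/OUTPUTFILE.out"

# Hypothetical MATLAB commands, read from standard input one after the other.
cat > "$INPUTFILE" <<'EOF'
A = magic(4);
disp(sum(A(:)));
quit
EOF

if command -v matlab >/dev/null 2>&1; then
    matlab -nodisplay -nosplash < "$INPUTFILE" > "$OUTPUTFILE"
else
    # MATLAB is not in the PATH (e.g. when trying this off-cluster).
    echo "matlab not found: run 'module load MATLAB' on a node first" >&2
fi
```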

The following minimal example shows how to run a serial (1 core) MATLAB script for 24 hours in passive mode:

(gaia-frontend)$> oarsub -l walltime=24:00:00 "source /etc/profile; module load MATLAB; matlab -nodisplay -nosplash < INPUTFILE.m > OUTPUTFILE.out"

Ideally you would not run MATLAB jobs like this but instead create/adapt a launcher script to contain those instructions:

# REMEMBER to change the following to the correct paths of the input/output files:
INPUTFILE=your_input_file.m
OUTPUTFILE=your_output_file.out
# Load a specific version of MATLAB and run the input script:
module load MATLAB/2013b
matlab -nodisplay -nosplash < $INPUTFILE > $OUTPUTFILE

then launch it in a job (e.g. requesting 6 cores on 1 node for 10 hours - assuming your input file takes advantage of the parallel cores):

(gaia-frontend)$> oarsub -l nodes=1/core=6,walltime=10:00:00 your_matlab_launcher.sh

ABAQUS

Described below are examples of using Abaqus on the clusters, in interactive or passive (batch) mode. In interactive mode the Abaqus/CAE (Complete Abaqus Environment) can be used, while long-running analysis jobs should be run in passive mode.

Note 1: as of November 2014 there is a limited number of licenses available, thus your Abaqus jobs may end with an error mentioning that there were not enough licenses available to complete certain operations. License checking is not yet integrated with the OAR scheduler.

Note 2: if you receive an error message about OpenGL being unavailable when running abaqus cae or abaqus viewer, use abaqus cae -mesa and abaqus viewer -mesa instead.

Interactive mode

To run Abaqus/CAE (e.g. on the Gaia cluster) in an interactive session:

  1. On Linux:

     # Connect to Gaia with X11 forwarding enabled:
     (yourmachine)$> ssh access-gaia.uni.lu -X
    
  2. On OS X: Depending on your OS X version you may not have the X Window System installed, and thus will need to install and run XQuartz if the command below returns an ‘X11 forwarding request failed on channel 0’ error:

     # Connect to Gaia with X11 forwarding enabled:
     (yourmachine)$> ssh access-gaia.uni.lu -X
    
  3. On Windows: You will need to install and run VcXsrv, then configure PuTTY (Connection -> SSH -> X11 -> Enable X11 forwarding) before logging in to the clusters.

After having connected to the cluster with X11 forwarding:

# Request an interactive job (the default parameters get you 1 core for 2 hours):
(gaia-frontend)$> oarsub -I

# Load the (latest) Abaqus module:
(node)$> module load Abaqus

# Launch Abaqus/CAE, whose full interface will be displayed on your machine (may be slow due to the network):
(node)$> abaqus cae
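If the graphical interface fails to open, a quick first check is whether X11 forwarding is active in the shell where you intend to start it. An ssh -X session sets the DISPLAY variable, so an empty value usually points to a connection made without forwarding. The check_display helper below is a hypothetical convenience function, not a command provided by the clusters.

```shell
#!/bin/bash
# Sketch: verify that X11 forwarding is active before starting a GUI.
check_display() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 display available: $DISPLAY"
        return 0
    fi
    echo "DISPLAY is not set: reconnect with 'ssh -X' and a local X server running"
    return 1
}

if check_display; then
    echo "OK to launch a graphical application such as 'abaqus cae'"
fi
```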

Batch (passive) mode

For non-interactive or long executions, Abaqus can be run in passive mode, reading all commands from an input file you provide (e.g. named inputfile.inp).

To enable Abaqus to run in parallel mode, the use of the following launcher launch_abaqus.sh is highly recommended:

#!/bin/bash -l

## Input file, to be set by the user:
INPUT=inputfile.inp

## Automatically set job name and generate environment file for parallel execution
ENVFILE=abaqus_v6.env-OAR-$OAR_JOBID
JOBNAME=job-$(basename $INPUT)
np=$(cat $OAR_NODEFILE | wc -l)

export MPI_REMSH=oarsh
echo "mp_rsh_command='oarsh -x -n -l %U %H %C'" > $ENVFILE

mp_host_list="["
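for n in $(sort -u $OAR_NODEFILE); do
	# -x and -F match whole lines literally, so a node is not also counted
	# for another node whose name merely contains it
	mp_host_list="${mp_host_list}['$n',$(grep -cxF "$n" $OAR_NODEFILE)],"
done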
echo "mp_host_list=$(echo ${mp_host_list} | sed -e "s/,$/]/")" >> $ENVFILE

ln -fs $ENVFILE abaqus_v6.env

## Abaqus execution
module load Abaqus
abaqus job=$JOBNAME input=$INPUT cpus=$np interactive > $JOBNAME.out 2> $JOBNAME.err

Remember that the launcher needs to be set as executable (chmod +x launch_abaqus.sh) before it can be run.
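To see what the launcher writes into the environment file, the sketch below repeats its host-list loop on a made-up node file describing two cores on each of two hypothetical nodes. The NODEFILE variable and the node names are assumptions for illustration, and the whole-line matching with grep -cxF is a choice of this sketch.

```shell
#!/bin/bash
# Sketch: reproduce the host-list construction of the launcher on a
# made-up node file (one line per reserved core).
NODEFILE=$(mktemp)           # inside a job: NODEFILE=$OAR_NODEFILE
printf '%s\n' node-1 node-1 node-2 node-2 > "$NODEFILE"

mp_host_list="["
for n in $(sort -u "$NODEFILE"); do
    # Count the lines that are exactly "$n", i.e. the cores on that node.
    mp_host_list="${mp_host_list}['$n',$(grep -cxF "$n" "$NODEFILE")],"
done
# Replace the trailing comma with the closing bracket.
mp_host_list=$(echo "$mp_host_list" | sed -e 's/,$/]/')

echo "mp_host_list=$mp_host_list"
rm -f "$NODEFILE"
```

With this node file the script prints mp_host_list=[['node-1',2],['node-2',2]], i.e. one [host, cpu count] pair per node.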

The following example shows how to run a parallel (24 cores) Abaqus job for 24 hours in passive mode:

(gaia-frontend)$> cd $directory_with_input_file_and_launcher
(gaia-frontend)$> oarsub -l nodes=2/core=12,walltime=24:00:00 ./launch_abaqus.sh

UL HPC Tutorials

We have written a set of tutorials that you might find useful for learning how to use the UL HPC platform, or for deepening your knowledge of it.

Take a look at GitHub for the available tutorials. The Read the Docs version is probably more attractive from a purely visual point of view.