HPC @ Uni.lu

High Performance Computing in Luxembourg

Your usage of the UL HPC platform will involve running applications on the compute nodes. Please refer to Getting started for a more detailed guide.

This guide assumes that you:

  • have an account with an SSH public key (you must connect with public-key authentication). Go to the Get an account page to request an account.
  • have configured an SSH client on your machine. See the Access to HPC cluster page to learn how to do this.
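
If you do not have an SSH key pair yet, you can generate one from a terminal on Linux/BSD/macOS. A minimal sketch (check the Get an account page for the key types accepted by the platform):

ssh-keygen -t rsa -b 4096     # generate a new key pair
cat ~/.ssh/id_rsa.pub         # the public key to include in your account request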

Access to HPC platform

We will access the chaos cluster. You can replace chaos with gaia to connect to the gaia cluster instead.

Linux/BSD/macOS

  • Open your terminal application
  • At the prompt, type:
ssh -p 8022 yourlogin@access-chaos.uni.lu
  • You should now be connected to the chaos cluster.
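
To avoid retyping the port and login every time, you can add an entry to your ~/.ssh/config file. A minimal sketch, assuming your client reads the standard OpenSSH configuration file:

Host chaos
    Hostname access-chaos.uni.lu
    Port 8022
    User yourlogin

You can then connect with a simple ssh chaos.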

Windows

We will show you an example using PuTTY. Please adapt it to your needs.

  • Open your SSH client application.
  • Enter these settings:
    • In Category:Session :
      • Host Name: access-chaos.uni.lu
      • Port: 8022
      • Connection Type: SSH (leave as default)
    • In Category:Connection:Data :
      • Auto-login username: yourlogin
  • Click on Open button
  • You should now be connected to the chaos cluster.

Results

At the end of this step, the following welcome banner should appear:

Last login: Tue Feb 21 14:25:29 2017 from 10.91.100.142
======================================================================================
 Welcome to access-chaos.uni.lu                        HPC @ Uni.lu: http://hpc.uni.lu
======================================================================================
               _   _                    _                         
              | | | |___  ___ _ __     / \   ___ ___ ___  ___ ___
              | | | / __|/ _ \ '__|   / _ \ / __/ __/ _ \/ __/ __|
              | |_| \__ \  __/ |     / ___ \ (_| (_|  __/\__ \__ \
               \___/|___/\___|_|    /_/   \_\___\___\___||___/___/
         ______ _                            _           _          __  
        / / ___| |__   __ _  ___  ___    ___| |_   _ ___| |_ ___ _ _\ \
       | | |   | '_ \ / _` |/ _ \/ __|  / __| | | | / __| __/ _ \ '__| |
       | | |___| | | | (_| | (_) \__ \ | (__| | |_| \__ \ ||  __/ |  | |
       | |\____|_| |_|\__,_|\___/|___/  \___|_|\__,_|___/\__\___|_|  | |
        \_\                                                         /_/

=== Computing Nodes ================================================= #RAM/n == #C  ==
 h-cluster1-[1-32]  32 HP BL2x220c G6(2 Xeon L5640@2.26 GHz [6c/60W])    24G    384
 d-cluster1-[1-16]  16 Dell PE M610  (2 Xeon L5640@2.26 GHz [6c/60W])    48G    192
 r-cluster1-1       1  Dell R910     (4 Xeon X7560@2.26GHz  [8c/130W]) 1024G    32
 e-cluster1-[1-16]  16 DELL PE M620  (2 Xeon E5-2660@2.2GHz [8c/95W])    32G    256
 s-cluster1-[1-16]  16 HP SL230S     (2 Xeon E5-2660@2.2GHz [8c/95W])    32G    256
======================================================================================
				       *** TOTAL: 81 nodes, 1120 computing cores ***
 DEPRECATED node
   - k-cluster1-[1-16]: 16 Dell PE 850  (1 Pentium D  @ 3.2 GHz, 4GB RAM)....32  cores
   - b-cluster1-[1-2] : 2  Dell PE 6850 (4 Xeon @ 3.4 GHz, 32 GB RAM)........16  cores

 Fast interconnect using InfiniBand QDR 40 Gb/s technology
 Shared Storage (raw capacity): 180 TB (NFS)

 Support (in this order!)			Platform notifications
   - User DOC ........ http://hpc.uni.lu/docs	 - hpc-platform@uni.lu
   - Mailing-list .... hpc-users@uni.lu          - Twitter: @ULHPC
   - Bug reports ..... hpc-tracker.uni.lu
   - Admins .......... hpc-sysadmins@uni.lu
===============================================================================
 /!\ NEVER COMPILE OR RUN YOUR PROGRAMS FROM THIS FRONTEND !
     First reserve your nodes (using oarsub(1))
Linux access.chaos-cluster.uni.lux 3.2.0-4-amd64 unknown
 17:20:04 up 126 days,  5:33,  9 users,  load average: 0.12, 0.08, 0.07
0 [17:20:04] yourlogin@access(chaos-cluster) ~>

If you have trouble accessing the cluster, please have a look at the detailed guide here.

Reserve a core

Kindly note that you should never run compute-, memory- or disk-intensive applications on a cluster's frontend.

On access-chaos, reserve one core with OAR:

0 [17:20:04] yourlogin@access(chaos-cluster) ~> oarsub -I
[ADMISSION RULE] Set default walltime to 7200.
[ADMISSION RULE] Modify resource description with type constraints
OAR_JOB_ID=1553796
Interactive mode : waiting...
Starting...

Connect to OAR job 1553796 via the node d-cluster1-10
[OAR] OAR_JOB_ID=1553796
[OAR] Your nodes are:
      d-cluster1-10*1

Linux d-cluster1-10 3.2.0-4-amd64 unknown
 17:23:21 up 123 days,  3:06,  1 user,  load average: 5.72, 6.02, 6.34
0 [17:23:21] yourlogin@d-cluster1-10(chaos-cluster) ~>

We are now connected to a compute node, and our tasks are restricted to the reserved core.
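
Once inside the job, OAR exposes a few environment variables that let you inspect the reservation. A quick sketch (see the scheduler guide for the full list):

echo $OAR_JOB_ID       # the job identifier (1553796 above)
cat $OAR_NODEFILE      # one line per reserved core
oarstat -u $USER       # list your jobs and their current states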

If you have trouble reserving a node, please have a look at the detailed scheduler guide here.
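
The plain oarsub -I above takes the defaults (a single core, 7200 s walltime). A few common variations, sketched for illustration; check the scheduler guide for the exact resource syntax on your cluster, and note that my_job.sh is a hypothetical script:

oarsub -I -l nodes=1/core=4,walltime=2:00:00     # 4 cores on a single node, 2 hours, interactive
oarsub -I -l nodes=1,walltime=1:00:00            # a full node for 1 hour, interactive
oarsub -l core=1,walltime=1:00:00 ./my_job.sh    # batch (non-interactive) submission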

Use module to load application and its dependencies

Have a look at the applications we provide. Load the appropriate module (for example, MATLAB):

0 [17:27:09] yourlogin@d-cluster1-10(chaos-cluster) ~> module load base/MATLAB
0 [17:27:26] yourlogin@d-cluster1-10(chaos-cluster) ~> module list

Currently Loaded Modules:
  1) base/MATLAB/2014a

If you have trouble using module, please have a look at the detailed guide here.
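
To discover what is available before loading anything, module provides a few standard subcommands. A quick sketch:

module avail              # list every module provided on the cluster
module avail MATLAB       # restrict the listing to matching names
module purge              # unload all currently loaded modules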

Run your application

Run MATLAB:

0 [17:27:30] yourlogin@d-cluster1-10(chaos-cluster) ~> matlab -nodisplay
Opening log file:  /home/users/cparisot/java.log.13504

                                               < M A T L A B (R) >
                                     Copyright 1984-2014 The MathWorks, Inc.
                                       R2014a (8.3.0.532) 64-bit (glnxa64)
                                                February 11, 2014


To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

>> 1+1

ans =

     2

>>

If you have trouble running your application, or you want to run your own application, please have a look at the detailed compiling guide or the software installation guide.
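
For batch jobs you will usually want MATLAB to run without an interactive prompt. A minimal sketch, assuming a hypothetical script myscript.m in the current directory (-nodisplay and -r are standard MATLAB options):

matlab -nodisplay -r "run('myscript.m'); exit"

The trailing exit ensures the MATLAB session terminates once the script finishes, so the job does not sit idle until the walltime expires.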

Exit your application and job

Exit MATLAB (with the exit command) and exit your shell with Ctrl-D.

Congratulations, you have run your first job on the chaos cluster!

What’s next?

Tutorials

We are continuously enhancing a set of tutorials (sources). Give them a try!

Getting started

Have a look at the Getting started page to know how to use the UL HPC platform.

TOP10 Best Practices

Using scalable HPC systems in harmony (for Uni.lu and beyond):

Fundamentals

  • Be a nice HPC citizen: respect the defined Acceptable Use Policy & report identified and reproducible issues via the ticketing system at your earliest convenience.
  • Read the documentation thoroughly and first try to verify the known path; reuse existing (and tested) launch-script mechanisms for job submission in the queueing system (see the sketch after this list).
  • Read about and apply standard HPC techniques & practices, as visible in the training material of HPC sites, e.g. NCSA CyberIntegrator (at least check the content index - it will certainly become useful in the future).
  • Ensure a proper disk sizing/backup/redundancy level for your application's situation; declare a project if your needs are special and require some kind of attention or a special allocation. Allocation is always conditional on resource availability and may imply some cost handling on your side if your needs are too special.
  • Consider sysadmin time planning: realize that all incoming issues have to be prioritized according to their impact on the user community. Use the ticketing system.
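
As an illustration of such a launch script, here is a minimal OAR batch launcher. A sketch only: the #OAR directive syntax is standard, but the job name, resource request, module and command are placeholders to adapt:

#!/bin/bash
#OAR -n my_first_job
#OAR -l core=1,walltime=1:00:00
# load the software environment, then run the actual work
module load base/MATLAB
matlab -nodisplay -r "run('myscript.m'); exit"

Submit it with oarsub -S ./launcher.sh (the -S option makes oarsub read the #OAR directives; the script must be executable).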

Nice to have

  • Reuse existing optimized libraries and applications wherever possible (for instance the provided modules: MPI, compilers, libraries).
  • Make your scripts generic (respect any defined Directory Structure and apply staging techniques where needed); use variable aliasing - no hardcoding of full path names; remember that any HPC system may be modified, upgraded or simply replaced before your project finishes.
  • Take advantage of modules to manage multiple versions of software, even for your own usage.
  • Take advantage of EasyBuild to manage and organize software from multiple sources, whether your own or 3rd-party (see the sketch after this list). This is especially important for code expected to run across multiple architectures and be rebuilt in multiple contexts.
  • Identify the policy class your tasks belong to and try to make the most efficient work out of your allocation; avoid underutilizing an allocation, as this harms other users by increasing queueing; monitor your jobs via the ganglia plots for both chaos & gaia.
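
As a pointer for the EasyBuild item above, its command line looks like this. A sketch: the easyconfig file name below is illustrative, use eb --search to find real ones:

eb --search HPL                    # search the available easyconfigs
eb HPL-2.1-foss-2016b.eb --robot   # build it, resolving missing dependencies automatically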

Hints & Tips

Make your life easier

  • Use version control for the sources or scripts you develop (ref: github/gforge); e.g. do you have a history of all of last month's revisions? What happens if you inadvertently overwrite a 20KB source file right before a paper submission deadline?
  • Do some form of checkpointing if your individual jobs run for more than 1 day; the advantages you get out of it are plenty and it is a major aspect of code quality; see checkpointing info online and remember that OAR can send a signal to checkpoint your job before it reaches its walltime limit (see the sketch after this list).
  • Keep a standard e.g. "Hello World" example ready, in case you need to do differential debugging of a suspected system problem. Use it as a reference in your ticket if you spot problems with it; it helps communication remain relevant and effective. More generally, when you report a bug in a complex software tree, reduce it to the essential.
  • Avoid looking for hacks to overcome existing policies; rather, document your need and the rationale behind it and propose it as a "project"; it makes more sense for everybody, really.
  • Take advantage of GPU technology or other architectures if applicable in your case; be careful with the GPU vs cores speedup ratios (such user reports are always welcome, and you are encouraged to share the results on the hpc-users list even if they are not favourable).
  • If you have a massive workflow of jobs to manage, do not reinvent the wheel: contact the sysadmins and other fellow users (hpc-users list) to poll for advice on your approach & collect ideas.
  • Report any plans to scale within HPC systems in any non-trivial way as early as possible; it helps both sides to prepare properly and avoids frustration.
  • Unless you have your own reasons, opt for a scripting language for your code integration but a faster, optimized language for the "application kernel" (to obtain both maintainability & performance!). Many computational kernels are readily usable from within scripting languages (examples: NumPy, SciPy).
  • If you have deadlines to adhere to, kindly notify us about them early on; you may not be alone; the sysadmin team serves on a best-effort basis and will try to keep user needs satisfied as far as possible, with the proviso that not all requests can be fulfilled.
  • If you find techniques that you consider elegant and relevant to other users' work, you are automatically welcome to report them to the HPC users' mailing list!
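
As a sketch of the OAR checkpointing hook mentioned above (--checkpoint and --signal are standard oarsub options; my_job.sh and its save_state function are hypothetical):

# ask OAR to send SIGUSR2 (signal 12) 600 seconds before the walltime expires
oarsub --checkpoint 600 --signal 12 -l core=1,walltime=12:00:00 ./my_job.sh

# inside my_job.sh, trap the signal and save state before exiting:
trap 'save_state; exit 0' SIGUSR2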

Report a problem