
HPC @ Uni.lu

High Performance Computing in Luxembourg

Overview

The cluster is organized as follows:

[Figure: Overview of the Chaos cluster]

It is composed of the following computing elements:

Thus, the computing nodes of this cluster are quite heterogeneous, yet they share the same processor architecture (Intel 64-bit). This means that code compiled on one node should run on all the others, unless it uses special features such as the AVX instruction set.

Important: Sandy Bridge processors (node classes ‘e’ and ‘s’) can sustain 8 floating-point operations per cycle and support the AVX instruction set.

Previous processor generations (e.g. Westmere) only sustain 4 floating-point operations per cycle.
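Whether a given node actually exposes AVX can be checked from the CPU flags that Linux reports. The snippet below is a minimal sketch for Linux nodes; the `has_avx` helper name is ours, not part of any cluster tooling.

```python
# Minimal sketch: detect AVX support on a Linux node by parsing /proc/cpuinfo.
# Useful before running an AVX-compiled binary on a heterogeneous cluster.

def has_avx(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the first CPU's flags line advertises AVX."""
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                # The flags line lists one token per supported CPU feature.
                return "avx" in line.split(":", 1)[1].split()
    return False

if __name__ == "__main__":
    print("AVX supported" if has_avx() else "AVX not supported")
```

On the Sandy Bridge ‘e’ and ‘s’ nodes this should report AVX support, while the Westmere-based nodes should not.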

Below is a picture of one of the racks hosting the Chaos cluster components.

Interconnect

The interconnect is an InfiniBand QDR (40 Gb/s) network.

The choice of topology is imposed by the heterogeneous nature of Chaos, and by the fact that the hardware is split across 2 server rooms.

The following diagram describes the topology of the Chaos InfiniBand network.

Additionally, the cluster is connected to the infrastructure of the University using 10Gb Ethernet.

A third 1Gb Ethernet network is also used on the cluster, mainly for services and administration purposes.

Storage / Cluster File System

Total Raw Capacity:
  • $HOME & $WORK under NFS (xfs over LVM): TB
  • Backup Capacity: TB

In terms of storage, a dedicated NFS server is responsible for sharing specific folders (most importantly, the users' home directories) across the nodes of the cluster.

The hardware consists of a NetApp E5400 disk enclosure containing 60 disks (3 TB SAS, 7.2k rpm). The raw capacity is 180 TB, split into 5 RAID 6 groups of 10 disks (8 data + 2 parity); the 10 remaining disks are used as spares.

An additional storage device (of the same capacity) is used as the backup target. The filesystem is XFS over LVM (Logical Volume Manager).

The current effective shared storage capacity of the NFS on the Chaos cluster is estimated at 110 TB.
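The relation between the raw and effective figures can be sanity-checked with a quick calculation (all values are taken from the description above; the remaining gap between the post-parity capacity and the ~110 TB effective figure is presumably filesystem and TB-vs-TiB overhead, which we only assume here):

```python
# Sanity check of the Chaos NFS storage figures quoted above.
disk_tb = 3           # 3 TB SAS 7.2k rpm drives
total_disks = 60
raid_groups = 5       # 5 x RAID 6 groups
data_disks = 8        # 8 data + 2 parity disks per group
spare_disks = 10

raw_tb = total_disks * disk_tb                   # 60 * 3 = 180 TB raw
usable_tb = raid_groups * data_disks * disk_tb   # 5 * 8 * 3 = 120 TB after parity
print(raw_tb, usable_tb)
```

The disk accounting is consistent as well: 5 groups of 10 disks plus 10 spares accounts for all 60 drives.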

History

The Chaos cluster has served the computing needs of the University of Luxembourg since 2007.

The platform has evolved since 2007 as follows:

  • 2007: Initialization of the cluster, composed of 1 frontend, 1 NFS server (net capacity: 3 TB) and 18 computing nodes, divided into two classes:

    • k-cluster1-[1-16]: Dell PE850 (1U) (1 Pentium D @ 3.2 GHz, 4GB RAM). Total: 32 computing cores, 410 GFlops
    • b-cluster1-[1-2]: Dell PE6850 (4U) (4 Dual Core Xeon @ 3.4 GHz, 32 GB RAM). Total: 16 computing cores, 218 GFlops
  • 2010: Extension with 1 HP blade enclosure (10U);

    • h-cluster1-[1-32]: HP Proliant BL2x220c G6 (2 Xeon Westmere L5640 @ 2.26 GHz, 24GB RAM) for a total of 384 cores (RPeak = 3.472 TFlops)
  • 2011: Storage and computing capacity extension

    • Increased storage capacity with an upgrade of the disks in the storage bay. Total Capacity of 21.83 TB.
    • d-cluster1-[1-16]: Dell M610 (2 Xeon Westmere L5640 @ 2.26 GHz, 24GB RAM) for a total of 176 cores (RPeak = 1.736 TFlops)
  • 2012: Storage, computing capacity and interconnect extension

    • Increased storage capacity with a new disk enclosure and a new NFS server. Total capacity raised to 110 TB.
    • e-cluster1-[1-16]: Dell M620 (2 Xeon Sandy-Bridge E5-2660 @ 2.20GHz, 32GB RAM) for a total of 256 cores
    • s-cluster1-[1-16]: HP SL230S (2 Xeon Sandy-Bridge E5-2660 @ 2.20GHz, 32GB RAM) for a total of 256 cores
    • Fast InfiniBand QDR interconnect (Mellanox-based)
    • Old ‘k’ and ‘b’ class nodes decommissioned
  • 2014: Memory upgrade from 24GB to 48GB on d-cluster1-[1-16]