
HPC @ Uni.lu

High Performance Computing in Luxembourg

Timeline:

This is the timeline for migrating data from the Chaos/Gaia clusters to Iris.

Due time         Action
February         List and contact PIs, schedule meetings
February/March   In-person meetings with research teams
August           Step 3: compute offline, select systems given back to research groups, HPC user data cleaning
October          Step 4: storage in read-only mode (only available on Gaia login nodes) to GPFS Gaia, last transfer done by HPC staff
December         Step 7: shutdown

Steps for migration: the procedure and guidelines below describe the steps to follow to migrate to Iris.

STEP1 Documentation and migration plan

This step concerns the ULHPC Team. It consists of defining policies, drawing up the migration plan, and writing documentation for users.

STEP2 Warn users, validate migration plan

To involve the users, the ULHPC Team will contact all users and project owners to warn them and encourage them to migrate their personal data, to identify which projects need to be archived and which need to be migrated, to find the people responsible for each project, and to sign an agreement. At this step, we will also list the special requests and identify users or projects that need more time for migration to Iris.
  • Warn users
    • Regular users 6 months in advance by email
    • Warn the PI and advisor by personal emails
    • Phone call or regular meeting with teams to discuss decommissioning and target special requests (additional time for compute)
      • list groups + PI contact + initiate individual team meetings over February/March
      • Read-only data access will always be ensured
  • Define with the users what should be done with the existing data, when, and get their agreement
  • Define what needs to be archived
  • Define what needs to be migrated to Iris / Isilon
  • Define who is responsible for each project, and ask which actions are to be taken
    • especially once hpc-sysadmins have finished the transfer: feedback to avoid inconsistency
  • Archive homedirs ($HOME) after STEP4: no time limit, accessible only by ULHPC
  • Archive workdirs ($WORK) after STEP4: to be removed in X years, accessible only by ULHPC
  • Do NOT archive scratch, $SCRATCH (content will be removed)
  • Define what should be done with remaining data if no instructions are given by an owner/user
  • Define the deadlines for the following steps
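The archiving rules listed above for $HOME, $WORK and $SCRATCH can be written down as a small lookup table, which may help when checking which rule applies to a given path. This is only an illustrative sketch: the `ArchivePolicy` class, the `policy_for` helper and the root directories passed to it are hypothetical names, not ULHPC tooling, and the "X years" retention is kept as an unspecified placeholder exactly as in the plan.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ArchivePolicy:
    archive: bool              # is the area archived after STEP4?
    retention: Optional[str]   # None = no time limit (only meaningful if archive is True)
    access: Optional[str]      # who may read the archive


# Mirrors the three archiving bullets of STEP2; not an official configuration.
POLICIES: Dict[str, ArchivePolicy] = {
    "HOME": ArchivePolicy(archive=True, retention=None, access="ULHPC"),
    "WORK": ArchivePolicy(archive=True, retention="X years", access="ULHPC"),
    "SCRATCH": ArchivePolicy(archive=False, retention=None, access=None),
}


def policy_for(path: str, roots: Dict[str, str]) -> Optional[ArchivePolicy]:
    """Return the policy of the storage area containing `path`, or None.

    `roots` maps an area name ("HOME", "WORK", "SCRATCH") to its root
    directory; the actual mount points are site-specific assumptions.
    """
    for area, root in roots.items():
        root = root.rstrip("/")
        # Match the root itself or anything strictly below it.
        if path == root or path.startswith(root + "/"):
            return POLICIES[area]
    return None
```

For example, with hypothetical roots `{"HOME": "/home/users", "WORK": "/work/users", "SCRATCH": "/scratch/users"}`, a path under `/scratch/users` resolves to a policy with `archive=False`, which is a reminder that scratch content will be removed.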

STEP3 No more computing except special requests (August)

At this step, no more computation will be allowed on Gaia/Chaos, except for users who have made a special request to continue computing.
  • Cut the access to the nodes for regular users (no more submission in the queues)
    • Disable regular submission to the cluster by adding admission_rules to allow only some users to connect
    • Set unused resources to DEAD state to only allow submission on a limited amount of nodes
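As a rough illustration of the second bullet, the sketch below builds (but does not run) one `oarnodesetting` command per node that should be set to the Dead state, leaving a limited set of nodes available for special requests. The node names are hypothetical, and the `-h`/`-s` options are given as we understand them from the OAR documentation; check the `oarnodesetting` manual of the installed OAR version before executing anything.

```python
import shlex
from typing import Iterable, List


def dead_commands(all_nodes: Iterable[str], keep_alive: Iterable[str]) -> List[List[str]]:
    """Build one `oarnodesetting` argv per node that should leave service.

    Nothing is executed here: the caller reviews the commands first.
    Assumption: `-h <hostname>` selects the node and `-s Dead` sets its state.
    """
    keep = set(keep_alive)
    return [
        ["oarnodesetting", "-h", node, "-s", "Dead"]
        for node in all_nodes
        if node not in keep
    ]


def as_shell_lines(commands: List[List[str]]) -> List[str]:
    """Render the argv lists as safely quoted shell lines for a dry-run listing."""
    return [shlex.join(argv) for argv in commands]
```

Calling `as_shell_lines(dead_commands(["gaia-1", "gaia-2", "gaia-3"], keep_alive=["gaia-2"]))` yields a dry-run listing for `gaia-1` and `gaia-3` only. Keeping the generation separate from execution lets an administrator review the list against the special requests before any node is disabled.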

STEP4 Read-only filesystem except special requests (October)

All filesystems will be put in read-only mode, which means that data on the Gaia filesystem can no longer be written, deleted or moved. Reading the data and transferring files to Iris will still be possible. This does not apply to special requests.
  • Put all the projects and homedirs in read-only except for users who are still submitting jobs
    • No more writing in homedir $HOME
    • No more writing on projects
    • No more writing on scratch $SCRATCH

STEP5 No more computing for all the users

At this step, no more computation will be allowed on Gaia/Chaos.
  • Cut access to computing for everybody (no more computing nodes)
    • No more users allowed to submit jobs
    • Inventory of hardware state of the resources (what is broken or not)
    • Every resource set to DEAD in the OAR database
    • Physically remove resources
    • Migrate hardware that has to be migrated to CDC

STEP6 No more writing for anybody

All filesystems will be put in read-only mode, which means that data on the Gaia filesystem can no longer be written, deleted or moved. Reading the data and transferring files to Iris will still be possible.
  • Remove write rights for everybody on all storage
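One possible way to picture "removing write rights" at the level of POSIX permissions is sketched below: it clears the user, group and other write bits on a directory tree. This is an assumption made for illustration only. The plan does not say how read-only mode is enforced, and on a production filesystem it would more likely be done at the filesystem or mount level. The function names are hypothetical.

```python
import os
import stat

# Write permission for user, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def _clear_write(path: str) -> int:
    """Clear the write bits on one entry; return 1 if its mode changed."""
    if os.path.islink(path):
        return 0  # symlinks are skipped, never followed
    mode = stat.S_IMODE(os.lstat(path).st_mode)
    new_mode = mode & ~WRITE_BITS
    if new_mode != mode:
        os.chmod(path, new_mode)
        return 1
    return 0


def strip_write_bits(root: str) -> int:
    """Clear write bits on `root` and everything below it.

    Returns the number of entries whose mode was changed. Read and
    execute bits are left untouched, so the tree stays readable.
    """
    changed = _clear_write(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            changed += _clear_write(os.path.join(dirpath, name))
    return changed
```

Running the function a second time on the same tree returns 0, so it can be re-run safely after late transfers. Note that permission bits do not restrict the superuser, which is one reason a filesystem-level read-only setting is the stronger option.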

STEP7 No more access to file storage (end: mid-Dec 2019)

Point of no return: access to the data on Gaia/Chaos will be cut and no more file transfers will be possible.
  • Cut access to the access nodes (no more file transfer possible)
    • Change security access to allow only sysadmins to connect

STEP8 Recycle the hardware

This part involves the ULHPC team and the owners of hardware on the Gaia or Iris clusters, who will physically remove hardware equipment or migrate it to CDC.
  • Define which nodes need to be kept alive

See decommissioning process guide for reference