HPC @ Uni.lu

High Performance Computing in Luxembourg

FAQ: How Can I Use Parallel Launchers on Reserved Resources?

Taktuk is a tool for deploying parallel remote executions of commands to a potentially large set of remote nodes. It spreads itself using an adaptive algorithm and sets up an interconnection network to transport commands and perform I/O multiplexing and demultiplexing. Taktuk dynamically adapts to its environment (machine performance, current load, network contention) by using a reactive work-stealing algorithm that mixes local parallelization and work distribution. Covering the full use of Taktuk is outside the scope of this FAQ post (please check the man page or the User guide). Below is an example of launching the command hostname on multiple reserved nodes:

[16:49:44] login@k-cluster1-13 ~>taktuk -c "oarsh" -f $OAR_NODEFILE broadcast exec [ hostname ]
k-cluster1-14-3: hostname (13115): output > k-cluster1-14
k-cluster1-13-2: hostname (13170): output > k-cluster1-13
k-cluster1-14-3: hostname (13115): status > Exited with status 0
k-cluster1-13-2: hostname (13170): status > Exited with status 0
k-cluster1-14-4: hostname (13116): output > k-cluster1-14
k-cluster1-16-8: hostname (14772): output > k-cluster1-16
k-cluster1-13-1: hostname (13171): output > k-cluster1-13
k-cluster1-15-5: hostname (13460): output > k-cluster1-15
k-cluster1-15-6: hostname (13459): output > k-cluster1-15
k-cluster1-16-7: hostname (14773): output > k-cluster1-16
k-cluster1-16-7: hostname (14773): status > Exited with status 0
k-cluster1-13-1: hostname (13171): status > Exited with status 0
k-cluster1-14-4: hostname (13116): status > Exited with status 0
k-cluster1-16-8: hostname (14772): status > Exited with status 0
k-cluster1-15-5: hostname (13460): status > Exited with status 0
k-cluster1-15-6: hostname (13459): status > Exited with status 0
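
With many nodes, the interleaved output and status lines are hard to scan by eye. The sketch below filters Taktuk output for commands that did not exit with status 0. It assumes the line format shown in the transcript above; the function name taktuk_failed_hosts is hypothetical and not part of Taktuk.

```shell
# Hypothetical helper, not part of Taktuk.
# Reads Taktuk output on stdin and prints the leading identifier
# (the text before the first colon) of every status line whose
# exit status is not 0. Assumes the line format shown above.
taktuk_failed_hosts() {
    awk '/: status > Exited with status / && $NF != 0 {
        sub(/:$/, "", $1)   # drop the trailing colon from the identifier
        print $1
    }' | sort -u
}

# Possible usage on reserved nodes:
#   taktuk -c "oarsh" -f "$OAR_NODEFILE" broadcast exec [ hostname ] | taktuk_failed_hosts
```

An empty result means every reported status was 0.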

You can avoid specifying the connector (option -c) on every call by setting the environment variable TAKTUK_CONNECTOR. For instance, add the following line to your .bashrc file:

export TAKTUK_CONNECTOR=oarsh
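
If you script this setup, it helps to append the line only once. The following is a minimal sketch; the function name add_taktuk_connector and its optional file argument are illustrative choices, not something Taktuk or OAR provides.

```shell
# Hypothetical helper: append the export line to a startup file
# only if that exact line is not already present.
# $1 is the file to edit (defaults to ~/.bashrc).
add_taktuk_connector() {
    rcfile="${1:-$HOME/.bashrc}"
    line='export TAKTUK_CONNECTOR=oarsh'
    touch "$rcfile"   # make sure the file exists before grepping it
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
}

# Possible usage:
#   add_taktuk_connector          # edits ~/.bashrc
#   . ~/.bashrc                   # reload in the current shell
```

Running the function several times leaves a single export line in the file.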

You may find Taktuk difficult to use (and you’re probably right). In that case, Kanif is your friend.

For simple parallel tasks that have to be executed on regular machines such as clusters, the Taktuk syntax is too complicated. The goal of Kanif is to provide an easier and more familiar syntax to cluster administrators and users while still taking advantage of Taktuk's characteristics and features (adaptivity, scalability, portability, autopropagation and information redirection). The Kanif suite comes with three commands:

  • kash to run the same command on multiple nodes
  • kaput to broadcast the copy of files or directories to several nodes
  • kaget to gather several remote files or directories

Read the man page for more information. As an illustration, here is how to run the hostname command on reserved nodes, assuming the TAKTUK_CONNECTOR variable is exported as explained above:

    [16:49:44] login@k-cluster1-13 ~>kash -M $OAR_NODEFILE -- hostname
    --------------------------------------------------------------------------------
    STDOUT
    --------------------------------------------------------------------------------
    --------------------------------------------------------------------------------
    k-cluster1-16 (2 HOSTS)
    --------------------------------------------------------------------------------
    k-cluster1-16
    --------------------------------------------------------------------------------
    k-cluster1-13 (2 HOSTS)
    --------------------------------------------------------------------------------
    k-cluster1-13
    --------------------------------------------------------------------------------
    k-cluster1-14 (2 HOSTS)
    --------------------------------------------------------------------------------
    k-cluster1-14
    --------------------------------------------------------------------------------
    k-cluster1-15 (2 HOSTS)
    --------------------------------------------------------------------------------
    k-cluster1-15
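
Each group above reports "2 HOSTS", which suggests that the node file lists every node once per reserved core. If a command should run only once per node, one option is to deduplicate the file first. This is a sketch under that assumption; the function name unique_nodes and the file nodes.uniq are hypothetical.

```shell
# Hypothetical helper: print each node name once.
# $1 is the node file (defaults to $OAR_NODEFILE).
# Assumes one node name per line, possibly repeated per core.
unique_nodes() {
    sort -u "${1:-$OAR_NODEFILE}"
}

# Possible usage:
#   unique_nodes > "$HOME/nodes.uniq"
#   kash -M "$HOME/nodes.uniq" -- hostname
```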