HPC @ Uni.lu

High Performance Computing in Luxembourg

Sensitive Data Encryption Using Gocryptfs

Several tools exist for user-level data encryption, which can further strengthen the security of, and trust in, data stored on the HPC platform. This post focuses on tools that encrypt large-scale data files and folders of any type; for code and versionable data, see the sister post on git-crypt.

As of 2018, a comparison of several such tools has been written by the author of Gocryptfs:

  • gocryptfs, aspiring successor of EncFS written in Go
  • EncFS, mature with known security issues
  • eCryptFS, integrated into the Linux kernel
  • Cryptomator, strong cross-platform support through Java and WebDAV
  • securefs, a cross-platform project implemented in C++
  • CryFS, result of a master’s thesis at KIT (Karlsruhe Institute of Technology) that uses chunked storage to obfuscate file sizes

As a modern implementation of an encryption overlay filesystem, Gocryptfs is the subject of this document.
For more details on its inner workings, see the gocryptfs project documentation.

Note that any use of user-level encryption remains the responsibility of the user, who accepts the inherent risks, such as:

  • loss of access to data due to loss of decryption password/keys
  • data corruption due to encryption store corruption, or improper use of the encryption tools

Ensure you have an off-site backup of critical data stored on the platform under encryption.
(Disaster) recovery of encrypted data is not guaranteed: whether it is viable depends on the internal consistency of the crypt store at the time the recovery snapshot is taken.
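
Since recovery depends on the store being internally consistent when it is copied, one option is to archive the crypt folder only while its view is unmounted. The following is a minimal sketch under that assumption, not a site-provided tool: the helper name backup_crypt_store and the paths are hypothetical, and it relies on the standard mountpoint (util-linux) and tar utilities.

```shell
#!/usr/bin/env bash
# Sketch: archive an encrypted store only when its view is not mounted,
# so that the archive is taken from a quiescent store.
# backup_crypt_store and all paths are hypothetical names for illustration.

backup_crypt_store() {
    local crypt_dir="$1" view_dir="$2" archive="$3"

    # Refuse to archive while the unencrypted view is still mounted.
    if mountpoint -q "$view_dir"; then
        echo "error: $view_dir is still mounted, unmount it first" >&2
        return 1
    fi

    # Everything inside the crypt store is archived as stored on disk
    # (encrypted contents and names, gocryptfs.conf, gocryptfs.diriv).
    tar -czf "$archive" -C "$(dirname "$crypt_dir")" "$(basename "$crypt_dir")"
}
```

For example, `backup_crypt_store dir.crypt dir dir.crypt.tar.gz` produces an archive of the encrypted files that can then be copied off-site.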

Workflow on the HPC platform

Using gocryptfs on the HPC platform involves the following steps:

  1. load its profile from the modules system
  2. create two folders:
    • one which acts as the storage for the encrypted files (let’s call it crypt)
    • another which presents (on demand) the unencrypted view (let’s call it view)
  3. initialize the crypt folder with a password
  4. mount the crypt folder into the view folder
  5. do all your processing (new file/folder creation, modification and transfers) in the view folder
  6. unmount the view folder, so that the unencrypted view of your data is closed and data is flushed to the regular filesystem

For example, to create a new encrypted store in a folder named dir.crypt, open the unencrypted view in a folder named dir, create a test file inside and close the view:

    module load tools/gocryptfs
    mkdir dir.crypt dir
    gocryptfs -init dir.crypt
    gocryptfs dir.crypt dir
    echo 'Happy secure computing!' > dir/message.txt
    fusermount -u dir

Let’s see the transcript of these operations as performed on the Iris cluster:

    $ srun -N 1 -n 1 -p interactive --time=0:10:0 --pty bash -i
    $ module load tools/gocryptfs
    $ mkdir dir.crypt dir
    $ gocryptfs -init dir.crypt
    Choose a password for protecting your files.
    Your master key is:
    If the gocryptfs.conf file becomes corrupted or you ever forget your password,
    there is only one hope for recovery: The master key. Print it to a piece of
    paper and store it in a drawer. This message is only printed once.
    The gocryptfs filesystem has been created successfully.
    You can now mount it using: gocryptfs dir.crypt MOUNTPOINT
    $ ls dir.crypt/
    gocryptfs.conf  gocryptfs.diriv
    $ ls dir
    $ gocryptfs dir.crypt dir
    Decrypting master key
    Filesystem mounted and ready.
    $ ls dir
    $ echo "Happy secure computing" > dir/message.txt
    $ ls dir
    message.txt
    $ ls dir.crypt/
    5o_WSYN-Tn59W3vrPiHXEA  gocryptfs.conf  gocryptfs.diriv
    $ fusermount -u dir
    $ ls dir
    $ ls dir.crypt/
    5o_WSYN-Tn59W3vrPiHXEA  gocryptfs.conf  gocryptfs.diriv
    $ gocryptfs -masterkey e1ecdcf1-6bcebaa0-cbe6cfb8-8e27d4ad-acefb9d4-bd98de59-311d1898-31d7e4e4 dir.crypt/ dir
    Using explicit master key.
    Filesystem mounted and ready.
    $ ls dir
    message.txt
    $ cat dir/message.txt 
    Happy secure computing
    $ ls -l dir/message.txt 
    -rw-r--r-- 1 vplugaru clusterusers 23 Dec 14 15:14 dir/message.txt
    $ ls -l dir.crypt/5o_WSYN-Tn59W3vrPiHXEA 
    -rw-r--r-- 1 vplugaru clusterusers 73 Dec 14 15:14 dir.crypt/5o_WSYN-Tn59W3vrPiHXEA
    $ chmod o-r dir/message.txt 
    $ ls -l dir/message.txt 
    -rw-r----- 1 vplugaru clusterusers 23 Dec 14 15:14 dir/message.txt
    $ ls -l dir.crypt/5o_WSYN-Tn59W3vrPiHXEA 
    -rw-r----- 1 vplugaru clusterusers 73 Dec 14 15:14 dir.crypt/5o_WSYN-Tn59W3vrPiHXEA
    $ fusermount -u dir

Several important elements can be seen in the above transcript:

  1. On crypt store initialization, gocryptfs provides us with the master key that can be used to restore access to the data files, especially useful in case the password is lost
    • keep the master key safe, and never store it unencrypted on the platform itself
  2. After initialization, the crypt store contains two internal configuration files: gocryptfs.conf is the global configuration for the crypt store, while gocryptfs.diriv is created per-directory for encryption of file names
    • note that you should never modify (any) files within the crypt store
  3. To be able to access/store data, the crypt store needs to be mounted in the view folder
    • this can be done by supplying the initially set password, either typed in on the command line when prompted or read from a file with the -passfile option
    • … or by supplying the generated master key with the -masterkey option
    • using the -passfile option means that your password is stored unencrypted on the filesystem, which is a security risk!
    • when using the master key mode, you should be in a full-node or exclusive job reservation, so that no other user on the system can see the master key
  4. Once the crypt store is mounted in the view directory we can create files in the latter:
    • any folder/file created in the unencrypted view will have a 1:1 correspondent in the crypt store
    • the plain text message.txt file is stored in encrypted format as 5o_WSYN-Tn59W3vrPiHXEA in the underlying crypt store (file name metadata is encrypted as well)
    • the same permissions applied on message.txt are also set for its encrypted correspondent file
  5. At the end of our processing, we explicitly unmount the encrypted overlay with fusermount
    • note that you should always ensure that this happens before your job reservation expires
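
The mount, process, unmount sequence described above can be wrapped so that the unmount also happens when the processing command fails. This is only a sketch: run_in_view is a hypothetical helper, and GOCRYPTFS and FUSERMOUNT are hypothetical override variables that default to the real commands. It does not protect against the job being killed at the end of its reservation.

```shell
#!/usr/bin/env bash
# Sketch: mount the view, run one command, and always unmount afterwards.
# run_in_view, GOCRYPTFS and FUSERMOUNT are hypothetical names.

run_in_view() {
    local crypt_dir="$1" view_dir="$2"
    shift 2

    # Mount (gocryptfs prompts for the password); stop if mounting fails.
    "${GOCRYPTFS:-gocryptfs}" "$crypt_dir" "$view_dir" || return 1

    # Run the given command, remembering its exit status.
    local status=0
    "$@" || status=$?

    # Unmount even if the command failed, then report the command's status.
    "${FUSERMOUNT:-fusermount}" -u "$view_dir" || return 1
    return "$status"
}
```

For example, `run_in_view dir.crypt dir cp results.dat dir/` copies a (hypothetical) results.dat file into the view and closes the view afterwards.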

Other important details:

  • On the Iris cluster the mounted encryption overlay is tied to the job context: when the job ends, the overlay is destroyed;
  • On the Gaia cluster the mounted overlay is not closed on job exit; it must be closed explicitly with fusermount;
  • Data stored in a crypt store should not be used concurrently (e.g. by multiple users at the same time)
    • the special option -sharedstorage exists for this use-case, but is not guaranteed to work for all applications;
  • (Parallel) applications run through srun on the Iris cluster cannot ‘see’ the unencrypted view folder, as they run in a different context;
    • this is also the case if you use sjoin or srun --jobid to attach your terminal to a running job.
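
Because a process may not see the mounted view, it can help to check for the mount before touching the view folder: anything written to an unmounted view folder lands unencrypted on the regular filesystem. A minimal sketch, assuming the mountpoint utility (util-linux) is available; require_view is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: fail early when the view folder is not an active mount point
# in the current context. require_view is a hypothetical helper.

require_view() {
    local view_dir="$1"
    if ! mountpoint -q "$view_dir"; then
        echo "error: $view_dir is not mounted in this context" >&2
        return 1
    fi
}
```

For example, `require_view dir && cp results.dat dir/` only copies the (hypothetical) results.dat file when the encrypted overlay is actually visible.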

Gocryptfs store password management

  • You can change the password of an existing crypt store with the -passwd option:

      gocryptfs -passwd dir.crypt/
      Password: [your current password here]
      Decrypting master key
      Please enter your new password.
      Password: [your new password here]
      Repeat: [your new password here]
      Password changed.

Note that the master key does not change.

  • For batch processing on a gocryptfs-based store, you can provide the decryption password through an external program with the -extpass option:

      gocryptfs -extpass "echo foobar" dir.crypt dir
      Reading password from extpass program
      Decrypting master key
      Filesystem mounted and ready.

Note that this means another application stores or has access to the password, which is a security risk!
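
If a password does have to live in a file (for -passfile, or for a program called through -extpass), the exposure can at least be limited by checking that only the owner can access that file. The sketch below is an illustration under assumptions, not part of gocryptfs: check_passfile is a hypothetical helper and it relies on GNU stat (the -c %a format).

```shell
#!/usr/bin/env bash
# Sketch: refuse to use a password file that group or others can access.
# check_passfile is a hypothetical helper; it relies on GNU stat (-c %a).

check_passfile() {
    local passfile="$1" mode

    [ -f "$passfile" ] || { echo "error: $passfile not found" >&2; return 1; }

    # Octal permission bits, e.g. 600 or 644.
    mode="$(stat -c %a "$passfile")"

    # Accept only three-digit modes whose group and other digits are both 0.
    case "$mode" in
        ?00) return 0 ;;
        *)   echo "error: $passfile has mode $mode, expected no group/other access (e.g. 600)" >&2
             return 1 ;;
    esac
}
```

For example, `check_passfile "$HOME/.dir.pass" && gocryptfs -passfile "$HOME/.dir.pass" dir.crypt dir`, where the password file path is hypothetical. The password is still stored unencrypted, so the risk noted above remains.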

Gocryptfs performance

This section shows indicative performance results when using Gocryptfs on the HPC platform.

Gocryptfs’s internal performance benchmark selects Go’s AES-GCM-256 implementation as the fastest:

  1. Single-test CPU results on Broadwell (iris-001) and Skylake (iris-150) generation nodes:

     $ srun -w iris-001 -N 1 -n 1 gocryptfs -speed
     AES-GCM-256-OpenSSL      641.79 MB/s
     AES-GCM-256-Go          1246.23 MB/s    (selected in auto mode)
     AES-SIV-512-Go            90.42 MB/s
     $ srun -w iris-150 -N 1 -n 1 gocryptfs -speed
     AES-GCM-256-OpenSSL      692.93 MB/s
     AES-GCM-256-Go          1549.04 MB/s    (selected in auto mode)
     AES-SIV-512-Go           174.65 MB/s
  2. Median of 99 CPU benchmarks:
    • on Broadwell nodes: AES-GCM-256-Go 1374.14 MB/s (selected in auto mode)
    • on Skylake nodes: AES-GCM-256-Go 1846.93 MB/s (selected in auto mode)
  3. Canonical Gocryptfs benchmarks as performed on a Skylake node, comprising:
    • streaming reads/writes (on a 26 GB file, up from the default 256 MB file)
    • extracting a Linux kernel tarball, hashing its files, recursively listing and deleting it

    Encryption   Backend filesystem     WRITE     READ      UNTAR (s)  MD5 (s)  LS (s)  RM (s)
    [None]       Lustre ($SCRATCH)      1.2 GB/s  2.4 GB/s  22.224     8.017    2.015   7.114
    [None]       GPFS ($HOME/$PROJECT)  3.7 GB/s  6.3 GB/s  20.327     71.526   4.900   15.404
    [gocryptfs]  Lustre ($SCRATCH)      476 MB/s  589 MB/s  58.143     11.919   4.637   18.339
    [gocryptfs]  GPFS ($HOME/$PROJECT)  717 MB/s  888 MB/s  55.151     69.134   8.469   26.057
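
To read the streaming results above as overhead, the unencrypted throughput can be divided by the gocryptfs throughput for the same filesystem. The sketch below does this arithmetic with awk; it assumes 1 GB/s = 1000 MB/s, and slowdown is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: slowdown factors derived from the WRITE/READ figures above.
# Assumes 1 GB/s = 1000 MB/s; slowdown is a hypothetical helper name.

slowdown() {
    # $1 = unencrypted throughput (MB/s), $2 = gocryptfs throughput (MB/s)
    awk -v plain="$1" -v enc="$2" 'BEGIN { printf "%.1f\n", plain / enc }'
}

echo "Lustre write: $(slowdown 1200 476)x slower"   # 2.5x
echo "Lustre read:  $(slowdown 2400 589)x slower"   # 4.1x
echo "GPFS write:   $(slowdown 3700 717)x slower"   # 5.2x
echo "GPFS read:    $(slowdown 6300 888)x slower"   # 7.1x
```

These ratios are only indicative, as they come from the single set of measurements reported above.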