Several options exist for user-level data encryption, which further strengthens the security of, and trust in, data stored on the HPC platform. This post focuses on tools for encrypting large-scale data files and folders (of any type); for code and versionable data, see the sister post on git-crypt.
As of 2018, a comparison of several such tools has been written by the author of Gocryptfs:
- gocryptfs, aspiring successor of EncFS written in Go
- EncFS, mature with known security issues
- eCryptFS, integrated into the Linux kernel
- Cryptomator, strong cross-platform support through Java and WebDAV
- securefs, a cross-platform project implemented in C++
- CryFS, the result of a master's thesis at KIT (Karlsruhe Institute of Technology), which uses chunked storage to obfuscate file sizes
As a modern implementation of an encryption overlay filesystem, Gocryptfs is the subject of this document.
For more details on its inner workings, see the following:
- Security design documentation
- Threat model
- 2017 security audit (audit report available as PDF)
- Gocryptfs source code
Note that any use of user-level encryption remains the responsibility of the user, who accepts the inherent risks, such as:
- loss of access to data due to loss of decryption password/keys
- data corruption due to encryption store corruption, or improper use of the encryption tools
Ensure you have an off-site backup of critical data stored on the platform under encryption.
(Disaster) recovery of encrypted data is not guaranteed: whether it is viable depends on the internal consistency of the crypt store at the moment the recovery snapshot is taken.
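As a minimal sketch of preparing such a backup, the function below archives an unmounted crypt store and records a checksum next to the archive. Only encrypted files end up in the archive, so it can be copied off-site without exposing plaintext. The name `backup_crypt` and its arguments are hypothetical, and the transfer to the off-site location is left out.

```shell
#!/usr/bin/env bash
# Sketch: archive an (unmounted) crypt store and record a checksum.
# backup_crypt and its two arguments are hypothetical names, not part of gocryptfs.
backup_crypt() {
    local crypt_dir="$1" archive="$2"

    # gocryptfs.conf must be part of the backup: without it, only the
    # master key can unlock the data.
    if [ ! -f "$crypt_dir/gocryptfs.conf" ]; then
        echo "not a crypt store: $crypt_dir" >&2
        return 1
    fi

    # Archive the folder by its base name so the tarball has no absolute paths.
    tar -cf "$archive" -C "$(dirname "$crypt_dir")" "$(basename "$crypt_dir")" || return 1

    # Record a checksum so the copy can be verified after transfer.
    sha256sum "$archive" > "$archive.sha256"
}

# Example (hypothetical paths):
#   backup_crypt "$PWD/dir.crypt" "$PWD/dir.crypt.tar"
#   sha256sum -c "$PWD/dir.crypt.tar.sha256"
```

Taking the archive while the view is unmounted avoids capturing the store in the middle of a write.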
Workflow on the HPC platform
To use gocryptfs on the HPC platform you need:
- to load its profile from the modules system
- to create two folders:
  - one which will act as the storage for the encrypted files (let’s call it crypt)
  - the other will present (on demand) the unencrypted view (let’s call it view)
- to initialize the crypt folder with a password
- to mount the crypt folder into the view folder
  - all your processing (new file/folder creation, modification and transfers) will happen in the view folder
- to unmount the view folder such that the unencrypted view of your data is closed and data is flushed to the regular filesystem
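The steps above can be sketched as a small shell function. This is an illustration rather than a supported script: `run`, `encrypted_workflow` and the `DRY_RUN` variable are hypothetical names, the folder names default to `dir.crypt` and `dir`, and the module name `tools/gocryptfs` is the one used on the Iris cluster. By default the commands are only printed; set `DRY_RUN=0` inside a job to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch of the workflow steps. DRY_RUN=1 (the default) only prints the commands.

run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

encrypted_workflow() {
    local crypt="${1:-dir.crypt}" view="${2:-dir}"

    run module load tools/gocryptfs &&   # load gocryptfs from the modules system
    run mkdir -p "$crypt" "$view" &&     # encrypted storage and unencrypted view
    run gocryptfs -init "$crypt" &&      # first use only; prompts for a new password
    run gocryptfs "$crypt" "$view" &&    # mount; prompts for the password
    run touch "$view/test.txt" &&        # all processing happens in the view
    run fusermount -u "$view"            # close the view and flush the data
}

# Example: print the commands for the default folder names.
#   encrypted_workflow
```

Chaining the steps with `&&` stops the sequence at the first failure, so nothing is written into the view folder if the mount did not succeed.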
For example, the transcript below, recorded on the Iris cluster, creates a new encrypted store in a folder named dir.crypt, opens the unencrypted view in a folder named dir, creates a test file inside and closes the view:
$ srun -N 1 -n 1 -p interactive --time=0:10:0 --pty bash -i
$ module load tools/gocryptfs
$ mkdir dir.crypt dir
$ gocryptfs -init dir.crypt
Choose a password for protecting your files.
Password:
Repeat:
Your master key is:
e1ecdcf1-6bcebaa0-cbe6cfb8-8e27d4ad-
acefb9d4-bd98de59-311d1898-31d7e4e4
If the gocryptfs.conf file becomes corrupted or you ever forget your password,
there is only one hope for recovery: The master key. Print it to a piece of
paper and store it in a drawer. This message is only printed once.
The gocryptfs filesystem has been created successfully.
You can now mount it using: gocryptfs dir.crypt MOUNTPOINT
$ ls dir.crypt/
gocryptfs.conf gocryptfs.diriv
$ ls dir
$ gocryptfs dir.crypt dir
Password:
Decrypting master key
Filesystem mounted and ready.
$ ls dir
$ echo "Happy secure computing" > dir/message.txt
$ ls dir
message.txt
$ ls dir.crypt/
5o_WSYN-Tn59W3vrPiHXEA gocryptfs.conf gocryptfs.diriv
$ fusermount -u dir
$ ls dir
$ ls dir.crypt/
5o_WSYN-Tn59W3vrPiHXEA gocryptfs.conf gocryptfs.diriv
$ gocryptfs -masterkey e1ecdcf1-6bcebaa0-cbe6cfb8-8e27d4ad-acefb9d4-bd98de59-311d1898-31d7e4e4 dir.crypt/ dir
Using explicit master key.
THE MASTER KEY IS VISIBLE VIA "ps ax" AND MAY BE STORED IN YOUR SHELL HISTORY!
ONLY USE THIS MODE FOR EMERGENCIES
Filesystem mounted and ready.
$ ls dir
message.txt
$ cat dir/message.txt
Happy secure computing
$ ls -l dir/message.txt
-rw-r--r-- 1 vplugaru clusterusers 23 Dec 14 15:14 dir/message.txt
$ ls -l dir.crypt/5o_WSYN-Tn59W3vrPiHXEA
-rw-r--r-- 1 vplugaru clusterusers 73 Dec 14 15:14 dir.crypt/5o_WSYN-Tn59W3vrPiHXEA
$ chmod o-r dir/message.txt
$ ls -l dir/message.txt
-rw-r----- 1 vplugaru clusterusers 23 Dec 14 15:14 dir/message.txt
$ ls -l dir.crypt/5o_WSYN-Tn59W3vrPiHXEA
-rw-r----- 1 vplugaru clusterusers 73 Dec 14 15:14 dir.crypt/5o_WSYN-Tn59W3vrPiHXEA
$ fusermount -u dir
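The two `ls -l` listings above show 23 bytes for the file in the view and 73 bytes for its encrypted counterpart. This matches the layout described in the gocryptfs file format documentation for the default AES-GCM mode, as far as I understand it: an 18-byte file header plus 32 bytes of overhead (nonce and authentication tag) for every 4096-byte plaintext block, with empty files staying empty. The helper name `cipher_size` is hypothetical; the sketch only reproduces the arithmetic.

```shell
#!/usr/bin/env bash
# Expected on-disk size of a file in the crypt store, ASSUMING the default
# gocryptfs format: 18-byte header, 4096-byte plaintext blocks and
# 32 bytes of overhead per block. cipher_size is a hypothetical helper.
cipher_size() {
    local plain="$1" blocks

    # Empty files are assumed to be stored as empty files (no header).
    if [ "$plain" -eq 0 ]; then
        echo 0
        return
    fi

    blocks=$(( (plain + 4095) / 4096 ))    # number of blocks, rounded up
    echo $(( plain + 18 + 32 * blocks ))
}

cipher_size 23    # prints 73, the size of the encrypted file in the transcript
```

The overhead is therefore small for large files (under 1%), but noticeable for stores made of many tiny files.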
Several important elements can be seen in the above transcript:
- On crypt store initialization, gocryptfs prints the master key, which can be used to restore access to the data files; this is especially useful if the password is lost.
  - Keep the master key safe and never store it unencrypted on the platform itself.
- After initialization, the crypt store contains two internal configuration files: `gocryptfs.conf` is the global configuration for the crypt store, while `gocryptfs.diriv` is created per directory for the encryption of file names.
  - Note that you should never modify (any) files within the crypt store.
- To access or store data, the crypt store needs to be mounted in the view folder.
  - This can be done by supplying the initially set password, either on the command line or from a file with the `-passfile` option,
  - ... or with the generated master key, with the `-masterkey` option.
  - Using the `-passfile` option means that you have stored your password unencrypted on the filesystem, which is a security risk!
  - When using the master key mode, you should be in a full-node or exclusive job reservation, so that no other users on the system can see the master key.
- Once the crypt store is mounted in the view directory, we can create files in the latter:
  - any folder/file created in the unencrypted view has a 1:1 correspondent in the crypt store;
  - the plain text `message.txt` file is stored in encrypted format as `5o_WSYN-Tn59W3vrPiHXEA` in the underlying crypt store (file name metadata is encrypted as well);
  - the same permissions applied to `message.txt` are also set for its encrypted correspondent file.
- At the end of our processing, we use `fusermount` explicitly to unmount the encrypted overlay.
  - Note that you should always ensure that this happens before your job reservation expires.
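One way to make the final unmount harder to forget is to register it as an exit handler in the job script. The sketch below assumes `mountpoint` (from util-linux) is available on the node; `CRYPT`, `VIEW`, `close_view` and `process_in_view` are hypothetical names. Whether the handler gets a chance to run before a reservation is revoked depends on scheduler settings that are not covered here.

```shell
#!/usr/bin/env bash
# Sketch: unmount the view on every exit path of the script.
# CRYPT, VIEW, close_view and process_in_view are hypothetical names.
CRYPT="${CRYPT:-dir.crypt}"
VIEW="${VIEW:-dir}"

close_view() {
    # Only call fusermount if the view is really mounted.
    if mountpoint -q "$VIEW"; then
        fusermount -u "$VIEW"
    fi
}

process_in_view() {
    trap close_view EXIT        # runs when the shell exits
    trap 'exit 1' INT TERM      # turn signals into a normal exit

    gocryptfs "$CRYPT" "$VIEW" || return 1   # prompts for the password
    "$@"                                     # the actual processing command
}

# Example: copy a result file into the encrypted store.
#   process_in_view cp results.dat "$VIEW"/
```

Converting `INT` and `TERM` into a regular exit means the single `EXIT` handler covers both the normal and the interrupted case.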
Other important details:
- On the Iris cluster, the mounted encryption overlay is tied to the job context: when the job ends, the overlay is destroyed.
- On the Gaia cluster, the mounted overlay is not closed on job exit unless it is explicitly unmounted with `fusermount`.
- Data stored in a crypt store should not be used concurrently (e.g. by multiple users at the same time).
  - The special option `-sharedstorage` exists for this use case, but is not guaranteed to work for all applications.
- (Parallel) applications run through `srun` on the Iris cluster cannot "see" the unencrypted view folder, as they run in a different context.
  - This is also the case if you use `sjoin` or `srun --jobid` to attach your terminal to a running job.
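Because a process running in a different context sees only an empty, unmounted view folder, anything it writes there would land unencrypted on the regular filesystem. A small guard can catch this before data is written. This is a sketch assuming `mountpoint` (from util-linux) is available; `require_view` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: refuse to use the view folder unless the overlay is mounted
# in the current context. require_view is a hypothetical helper.
require_view() {
    local view="$1"
    if ! mountpoint -q "$view"; then
        echo "error: $view is not a mounted view in this context" >&2
        return 1
    fi
}

# Example: only copy results when the encrypted overlay is really there.
#   require_view dir && cp results.dat dir/
```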
Gocryptfs store password management
- You can change the password of an existing crypt store with the `-passwd` option:

      gocryptfs -passwd dir.crypt/
      Password: [your current password here]
      Decrypting master key
      Please enter your new password.
      Password: [your new password here]
      Repeat: [your new password here]
      Password changed.

  Note that the master key does not change.

- For batch processing on a gocryptfs-based store, you can provide the decryption password through an external application with the `-extpass` option:

      gocryptfs -extpass "echo foobar" dir.crypt dir
      Reading password from extpass program
      Decrypting master key
      Filesystem mounted and ready.

  Note that this means another application stores or has access to the password, which is a security risk!
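If a password does have to live in a file for batch use (for the `-passfile` option mentioned earlier, or for a program given to `-extpass`), the exposure can at least be limited by checking that only the owner can read it. The sketch below assumes GNU `stat` (as found on Linux nodes); `check_passfile` and the example path are hypothetical. It reduces the risk described above but does not remove it.

```shell
#!/usr/bin/env bash
# Sketch: accept a password file only if it is readable by its owner alone.
# check_passfile is a hypothetical helper; requires GNU stat.
check_passfile() {
    local file="$1" mode

    mode=$(stat -c '%a' "$file") || return 1   # octal permissions, e.g. 600
    case "$mode" in
        600|400)
            return 0
            ;;
        *)
            echo "refusing $file: mode $mode, expected 600 or 400" >&2
            return 1
            ;;
    esac
}

# Example (hypothetical path):
#   check_passfile "$HOME/.dir.pass" && gocryptfs -passfile "$HOME/.dir.pass" dir.crypt dir
```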
Gocryptfs performance
This section shows indicative performance results when using Gocryptfs on the HPC platform.
Gocryptfs’s internal performance benchmark selects the Go implementation of AES-GCM-256 as the fastest:
- Single-test CPU results on Broadwell (`iris-001`) and Skylake (`iris-150`) generation nodes:

      $ srun -w iris-001 -N 1 -n 1 gocryptfs -speed
      AES-GCM-256-OpenSSL 641.79 MB/s
      AES-GCM-256-Go 1246.23 MB/s (selected in auto mode)
      AES-SIV-512-Go 90.42 MB/s
      $ srun -w iris-150 -N 1 -n 1 gocryptfs -speed
      AES-GCM-256-OpenSSL 692.93 MB/s
      AES-GCM-256-Go 1549.04 MB/s (selected in auto mode)
      AES-SIV-512-Go 174.65 MB/s

- Median of 99 CPU benchmarks:
  - on Broadwell nodes: `AES-GCM-256-Go 1374.14 MB/s (selected in auto mode)`
  - on Skylake nodes: `AES-GCM-256-Go 1846.93 MB/s (selected in auto mode)`
- Canonical Gocryptfs benchmarks as performed on a Skylake node, comprising:
  - streaming reads/writes (on a 26 GB file, up from the default 256 MB file)
  - extracting a Linux kernel tarball, hashing its files, recursively listing and deleting it
| Encryption | Backend filesystem | WRITE | READ | UNTAR (s) | MD5 (s) | LS (s) | RM (s) |
|---|---|---|---|---|---|---|---|
| [None] | Lustre ($SCRATCH) | 1.2 GB/s | 2.4 GB/s | 22.224 | 8.017 | 2.015 | 7.114 |
| [None] | GPFS ($HOME/$PROJECT) | 3.7 GB/s | 6.3 GB/s | 20.327 | 71.526 | 4.900 | 15.404 |
| [gocryptfs] | Lustre ($SCRATCH) | 476 MB/s | 589 MB/s | 58.143 | 11.919 | 4.637 | 18.339 |
| [gocryptfs] | GPFS ($HOME/$PROJECT) | 717 MB/s | 888 MB/s | 55.151 | 69.134 | 8.469 | 26.057 |
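To read the table as relative overhead, the encrypted timing can be divided by the unencrypted one for the same backend. The helper below is a sketch of that arithmetic using `awk`; the name `slowdown` is hypothetical, and the four timings are the UNTAR values from the table.

```shell
#!/usr/bin/env bash
# Sketch: ratio between the timing with gocryptfs and the timing without it.
# slowdown is a hypothetical helper; a result of 2.00 means "twice as slow".
slowdown() {
    # $1: time without encryption, $2: time with gocryptfs (same unit)
    awk -v plain="$1" -v enc="$2" 'BEGIN { printf "%.2f\n", enc / plain }'
}

slowdown 22.224 58.143    # UNTAR on Lustre: prints 2.62
slowdown 20.327 55.151    # UNTAR on GPFS:   prints 2.71
```

For this metadata-heavy step, the encrypted store is thus roughly 2.6 to 2.7 times slower than the bare filesystem in these measurements.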
