HPC @ Uni.lu

High Performance Computing in Luxembourg

Using Git-crypt to Protect Sensitive Data

The advent of the EU General Data Protection Regulation (GDPR) has highlighted the need to protect sensitive information from leakage.

This matters even more for Git repositories, whether public or private, since anyone holding a working copy of the repository has access to the full commit history, including commits made by mistake (for instance with git commit -a) that included sensitive files. That’s where git-crypt comes to help. It is an open-source command-line utility that lets developers protect specific files within a Git repository.

git-crypt enables transparent encryption and decryption of files in a git repository. Files which you choose to protect are encrypted when committed, and decrypted when checked out. git-crypt lets you freely share a repository containing a mix of public and private content. git-crypt gracefully degrades, so developers without the secret key can still clone and commit to a repository with encrypted files. This lets you store your secret material (such as keys or passwords) in the same repository as your code, without requiring you to lock down your entire repository.

The biggest advantage of git-crypt is that private data and public data can live in the same location.

Note: there are alternative tools and approaches you can use to protect or encrypt data within a Git repository; they are listed at the end of this post.


Pre-requisites

To use git-crypt, you need a working Git and GPG environment.

To check and configure it:

    # List GPG keys for which you have both a public and private key.
    $> gpg --list-secret-keys --keyid-format LONG
    sec   rsa4096/5D08BCDD4F156AD7 2017-03-01 [C] [expires: 2019-08-27]
    [...]
    uid                 [ultimate] Sebastien Varrette <Sebastien.Varrette@uni.lu>
    uid                 [ultimate] Sebastien Varrette (Falkor) <Sebastien.Varrette@gmail.com>
    uid                 [ultimate] [jpeg image of size 3075]
    [...]

    #  Set your GPG signing key in Git
    $> git config --global user.signingkey 5D08BCDD4F156AD7
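
Before going further, it can help to confirm that the required tools are actually on your PATH. The helper below is only a minimal sketch: the function name missing_tools is hypothetical and not part of Git, GPG or git-crypt.

```shell
# Print the names of the given commands that are not found in PATH.
missing_tools() {
    local missing=""
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Strip the leading space added by the first missing tool
    echo "${missing# }"
}

not_found=$(missing_tools git gpg git-crypt)
if [ -n "$not_found" ]; then
    echo "Missing tools: $not_found"
else
    echo "git, gpg and git-crypt are all available"
fi
```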

GPG Key Management

General recommendations / Best practices

  • create a 4096bit RSA key, with the sha512 hashing algorithm
  • Use the concept of GPG key subpairs
    • your primary key is only meant for certification / authentication purposes (in particular not for signing or encrypting).
  • Set an expiration date less than two years away.
    • You can always extend the key expiration as long as you still have access to the key, even after it has expired

This applies to your personal GPG keyring on your laptop. You may be reluctant to transfer or share your primary key pair on a remote [computing] system, such as an HPC facility. To handle your GPG keys on such a platform (for instance the UL HPC clusters), you have two alternatives:

  1. create a new key pair specific to each cluster, that you will sign with your primary key.
  2. create a subkey you will export on the remote facility.
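
As an illustration of the recommendations above, here is a minimal sketch that writes a parameter file for GnuPG's unattended key generation: a 4096-bit RSA primary key restricted to certification, with a separate encryption subkey and a one-year expiration. It assumes GnuPG 2.x; the name and email are placeholders, and the script only prints the gpg command so that you can review the file first.

```shell
# Write an unattended key-generation parameter file.
# Name-Real and Name-Email are placeholders: ADAPT accordingly.
params=$(mktemp)
cat > "$params" <<'EOF'
Key-Type: RSA
Key-Length: 4096
Key-Usage: cert
Subkey-Type: RSA
Subkey-Length: 4096
Subkey-Usage: encrypt
Name-Real: Jane Doe
Name-Email: jane.doe@example.org
Expire-Date: 1y
%commit
EOF

# Nothing is generated here; the command is only displayed.
echo "Review $params, then run: gpg --batch --gen-key $params"
```

A signing subkey can be added afterwards with gpg --edit-key and the addkey command.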

Installation

On your local machine:

If you’re running Mac OS X – and assuming Homebrew is installed:

  $> brew install git-crypt

If you’re running Linux:

  ## Pre-requisites for compilation
  # Debian/Ubuntu
  $> apt-get install build-essential
  $> apt-get install libssl-dev
  $> apt-get install xsltproc
  #
  # CentOS 7
  $> yum groupinstall "Development tools"
  $> yum install openssl-devel
  $> yum install libxslt       # required to build man pages
  #
  ## Collect sources
  $> cd /usr/local/src
  $> wget https://www.agwa.name/projects/git-crypt/downloads/git-crypt-0.6.0.tar.gz
  $> wget https://www.agwa.name/projects/git-crypt/downloads/git-crypt-0.6.0.tar.gz.asc
  #
  ## Check file signature
  # Import PGP key -- see https://www.agwa.name/about/pgp.page
  $> gpg --recv-key 0xEF5D84C1838F2EB6D8968C0410378EFC2080080C
  # If the above fails because of firewall filtering, target a key server answering on port 80
  $> gpg --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-key 0xEF5D84C1838F2EB6D8968C0410378EFC2080080C
  # Verify the signature file
  $> gpg --verify git-crypt-0.6.0.tar.gz.asc git-crypt-0.6.0.tar.gz
  #
  ## uncompress, compile and install
  $> tar xf git-crypt-0.6.0.tar.gz
  $> cd git-crypt-0.6.0/
  $> make ENABLE_MAN=yes          # requires xsltproc
  $> make ENABLE_MAN=yes install

Note: git-crypt is installed on the UL HPC platform


Initial Repository Setup and Configuration

Configure/setup a repository to use git-crypt as follows:

 $> git-crypt init

This will generate a symmetric key for encrypting your files (stored in .git/git-crypt/keys/default).
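
To check whether a repository has already been initialized, you can test for the key file mentioned above. This is a minimal sketch: the function name vault_initialized is hypothetical, and it assumes a regular clone where .git is a directory (not a worktree or a submodule).

```shell
# Succeeds if the repository at $1 (default: current directory)
# holds the git-crypt default symmetric key.
vault_initialized() {
    [ -f "${1:-.}/.git/git-crypt/keys/default" ]
}

if vault_initialized; then
    echo "git-crypt vault already initialized"
else
    echo "no default key found: run 'git-crypt init' (or 'git-crypt unlock' on a fresh clone)"
fi
```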

Then there are a couple of actions to perform, detailed below:

  1. claim ownership of the git-crypt vault
  2. bootstrap a .gitattributes file at the root of the repository defining the encryption policy for the files of the repository
  3. enable a custom Git pre-commit hook (see doc) to avoid accidentally adding unencrypted files (see issue #45).
  4. share this key with allowed collaborators through a committed copy of the key, encrypted with their respective GPG keys (see git-crypt add-gpg-user)

Note that you must of course first import the corresponding GPG key into your keyring:

 $> gpg --search-keys <GPG_ID>    # Search and Import
 # If the above fails because of firewall filtering, target a key server answering on port 80
 $> gpg --keyserver hkp://p80.pool.sks-keyservers.net:80 --search-keys <GPG_ID>

.gitattributes setup

Create and commit at the root of your repository a new file named .gitattributes with the following content:

 # -*- mode: conf -*-
 #
 # specify which files to encrypt using [git-crypt](https://www.agwa.name/projects/git-crypt/)
 #
 #
 # Certificate keys etc.
 *.cert filter=git-crypt diff=git-crypt
 *.key filter=git-crypt diff=git-crypt
 *.crt filter=git-crypt diff=git-crypt
 # Private document/folders
 # *PRIVATE*           filter=git-crypt diff=git-crypt
 # subfolder/**/*      filter=git-crypt diff=git-crypt

Note: you can find this template file on Github.

To automate the process from online sources:

 $> cd /path/to/repo
 $> wget https://raw.githubusercontent.com/Falkor/falkorlib/devel/templates/git-crypt/.gitattributes
 $> git add .gitattributes
 $> git commit -s -m 'Initialize .gitattributes for git-crypt' .gitattributes
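
Once the .gitattributes file is in place, you can ask Git which filter it resolves for a given path before adding any sensitive file. The wrapper below is a minimal sketch around git check-attr; the name show_filter is hypothetical.

```shell
# Print the 'filter' attribute Git resolves for each given path.
# Must be run from inside the repository.
show_filter() {
    git check-attr filter -- "$@"
}

# Usage, from /path/to/repo:
#   show_filter secret.key README.md
# A path covered by the policy is reported with 'filter: git-crypt',
# any other path with 'filter: unspecified'.
```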

Git pre-commit hook

You also need to set up a pre-commit hook to avoid accidentally adding unencrypted files with git-crypt – see issue #45. You can find it as a gist:

 #!/bin/bash
 ################################################################################
 # See <https://gist.github.com/Falkor/848c82daa63710b6c132bb42029b30ef>
 # Pre-commit hook to avoid accidentally adding unencrypted files with [git-crypt](https://www.agwa.name/projects/git-crypt/)
 # Fix to [Issue #45](https://github.com/AGWA/git-crypt/issues/45)
 #
 # Usage:
 #    $> cd /path/to/repository
 #    $> git-crypt init
 #    $> curl <url/to/this/raw/gist> -o .git/hooks/pre-commit
 #    $> chmod +x .git/hooks/pre-commit
 #
 # Otherwise, you might want to add it as a git submodule, using:
 #    $> git submodule add https://gist.github.com/848c82daa63710b6c132bb42029b30ef.git config/hooks/pre-commit.git-crypt
 #    $> cd .git/hooks
 #    $> ln -s ../../config/hooks/pre-commit.git-crypt/pre-commit.git-crypt.sh pre-commit
 #
 if [ -d .git-crypt ]; then
     STAGED_FILES=$(git diff --cached --name-status | awk '$1 != "D" { print $2 }' | xargs echo)
     if [ -n "${STAGED_FILES}" ]; then
         git-crypt status ${STAGED_FILES} &>/dev/null
         if [[ $? -ne 0  ]]; then
             git-crypt status -e ${STAGED_FILES}
             echo '/!\ You should have first unlocked your repository BEFORE staging the above file(s)'
             echo '/!\ Proceed now as follows:'
             echo -e "\t git reset HEAD -- ${STAGED_FILES}"
             echo -e "\t git crypt unlock"
             echo -e "\t git add ${STAGED_FILES}"
             exit 1
         fi
     fi
 fi

Recommended way to automate the installation (leaving the pre-commit hook script in a dedicated directory config/hooks/):

 $> cd /path/to/repo
 $> mkdir -p config/hooks
 $> curl https://gist.githubusercontent.com/Falkor/848c82daa63710b6c132bb42029b30ef/raw/610bac85ca512171d04b19d668098bd2678559a7/pre-commit.git-crypt.sh -o config/hooks/pre-commit.git-crypt.sh
 $> chmod +x config/hooks/pre-commit.git-crypt.sh
 $> git add  config/hooks/pre-commit.git-crypt.sh
 $> git commit -s -m "pre-commit hook for git-crypt" config/hooks/pre-commit.git-crypt.sh
 # bootstrapping special Git pre-commit hook for git-crypt
 $> ln -s ../../config/hooks/pre-commit.git-crypt.sh .git/hooks/pre-commit

(optional) Multiple key support

In addition to the implicit default key, git-crypt supports alternative keys which can be used to encrypt specific files and can be shared with specific GPG users. This is useful if you want to grant different collaborators access to different sets of files.

To generate an alternative key named <KEYNAME> and/or share it with a GPG user, pass the -k <KEYNAME> option to git-crypt { init | add-gpg-user} as follows:

 $> git-crypt init -k <KEYNAME>
 # Share it with a GPG user
 $> git-crypt add-gpg-user -k <KEYNAME> <GPG_ID>

To encrypt a file with an alternative key, use the git-crypt-<KEYNAME> filter in .gitattributes as follows:

 secretfile filter=git-crypt-KEYNAME diff=git-crypt-KEYNAME

git-crypt Usage

Unlock/lock the git-crypt vault

You can unlock the vault, i.e. decrypt the encryption key using your personal GPG key, by running

 $> git-crypt unlock

You can lock the vault again by running

 $> git-crypt lock

/!\ IMPORTANT thanks to the Git pre-commit hook configured above, you avoid having sensitive files (as filtered within the .gitattributes file) committed in cleartext while the git-crypt vault is locked.

Adding sensitive files to the repository

  1. First you need to unlock the vault (if not yet done) with git-crypt unlock.
  2. Then specify the files or wildcard patterns to encrypt by completing the .gitattributes file at the root of the repository with filter=git-crypt diff=git-crypt entries.
  3. commit the changes to the .gitattributes file.
  4. add and commit your file

Example of specifications within the .gitattributes file:

 # global wildcard pattern
 *.key               filter=git-crypt diff=git-crypt
 # Private document/folders
 # *PRIVATE*         filter=git-crypt diff=git-crypt
 #
 # ALL files under a certain sub directory
 subfolder/**/*      filter=git-crypt diff=git-crypt
 #
 # A specific file
 subdir/secretfile   filter=git-crypt diff=git-crypt

For instance at step 4, assuming you plan to add a *.key file (thus expected to be encrypted as per the above .gitattributes policy), proceed as follows:

 # Unlock the repository if needed
 $> git-crypt unlock
 $> echo 'secret' > secret.key
 $> git add secret.key
 $> git commit -s -m "add secret.key (encrypted) file" secret.key

Note that thanks to the pre-commit hook, if you forgot to unlock the repository, the above commit command fails as follows:

 $> git commit -s -m "add secret.key (encrypted) file" secret.key
     encrypted: secret.key *** WARNING: staged/committed version is NOT ENCRYPTED! ***
 Warning: one or more files is marked for encryption via .gitattributes but
 was staged and/or committed before the .gitattributes file was in effect.
 Run 'git-crypt status' with the '-f' option to stage an encrypted version.

Assuming everything went well, you can commit the file and check that the content is indeed encrypted:

 $> git commit -s -m "add secret.key (encrypted) file" secret.key
 [master 9ae570f] add secret.key (encrypted) file
  1 file changed, 0 insertions(+), 0 deletions(-)
  create mode 100644 secret.key
 $> cat secret.key
 secret
 #
 # Lock the repository again
 $> git-crypt lock
 $> cat secret.key
 GITCRYPT<XXXXXXXX>

Adding a new collaborator to the vault

To grant a collaborator access to the encrypted files stored in the repository, you first need to collect their GPG ID. You have several options:

  1. query and import the GPG key from the official GPG key servers and check it carefully (assuming you have not yet imported it into your keyring)
  2. as distributing GPG keys can be cumbersome, rely on the keybase.io service to collect the certified GPG ID from their username – see tutorial

     # Option 1 - traditional query and import the GPG ID
     $> gpg --search-keys <email>  # Search & Import
     # If the above fails because of firewall filtering, target a key server answering on port 80
     $> gpg --keyserver hkp://p80.pool.sks-keyservers.net:80 --search-keys <email>
     #
     ## Option 2: using Keybase.io
     # curl + gpg pro tip: import svarrette's keys
     $> curl https://keybase.io/svarrette/pgp_keys.asc | gpg --import
     # the Keybase app can push to gpg keychain, too
     $> keybase pgp pull svarrette
     #
     # Now get the GPG ID
     $> gpg --list-key <email> | grep pub
    

Now you can share the repository with this GPG ID:

   # Check associated (imported) GPG identity
   $> gpg --list-key <email> | grep pub
   #
   # Add a new git-crypt collaborator, i.e. encrypt the
   #     symmetric key with this GPG key and store it into '.git-crypt/'
   $> git crypt add-gpg-user <GPG_ID>
   [master (root-commit) a967527] Add 1 git-crypt collaborator
    2 files changed, 4 insertions(+)
    create mode 100644 .git-crypt/.gitattributes
    create mode 100644 .git-crypt/keys/default/0/<FINGERPRINT>.gpg

By default, git-crypt add-gpg-user fails if there is no assurance that the key belongs to the named user. If you trust the key you imported but have not recorded that trust in your keyring by signing the key, you can pass the --trusted option to force the operation to succeed:

 $> git crypt add-gpg-user --trusted <GPG_ID>

You can add as many collaborators as you wish.

Example:

 # Define the list of GPG IDs of the collaborators
 # Ex: the Uni.lu HPC Team
 $> GPGKEYS="0x5D08BCDD4F156AD7
 0x07FEA8BA69203C2D
 0x37183CEF550DF40B
 0x3F3242C5B34D98C2
 0x6429C2514EBC4737"
 #
 # Check the associated primary identity
 $> parallel -j 1  echo '--- {} ---'\; gpg --list-key {} '|' grep uid '|' head -n 1 ::: $GPGKEYS
 --- 0x5D08BCDD4F156AD7 ---
 uid                   [ultimate] Sebastien Varrette <Sebastien.Varrette@uni.lu>
 --- 0x07FEA8BA69203C2D ---
 uid                   [  full  ] Clement Parisot <Clement.Parisot@uni.lu>
 --- 0x37183CEF550DF40B ---
 uid                   [  full  ] Hyacinthe Cartiaux <hyacinthe.cartiaux@uni.lu>
 --- 0x3F3242C5B34D98C2 ---
 uid                   [  full  ] Valentin Plugaru <Valentin.Plugaru@uni.lu>
 --- 0x6429C2514EBC4737 ---
 uid                   [  full  ] Sarah DIEHL <sarah.diehl@uni.lu>
 #
 # Allow them to access the vault...
 $> parallel -j 1  echo '--- {} ---'\; git-crypt add-gpg-user {} ::: $GPGKEYS
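
If GNU parallel is not available, the same can be done with a plain shell loop. This is a minimal sketch: the function add_collaborators and the ADD_CMD variable are hypothetical names, and setting ADD_CMD=echo gives a dry run that only prints the IDs.

```shell
# Command used to add one collaborator; override with ADD_CMD=echo for a dry run.
ADD_CMD=${ADD_CMD:-git-crypt add-gpg-user}

# Add each GPG ID given as argument, stopping at the first failure.
add_collaborators() {
    local key
    for key in "$@"; do
        echo "--- $key ---"
        $ADD_CMD "$key" || return 1
    done
}

# Usage (GPGKEYS left unquoted so that each ID becomes a separate argument):
#   add_collaborators $GPGKEYS
```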

git-crypt alternatives

Password Management with pass

Another nice Git-based approach that teams up well with GPG relies on pass, the standard Unix password manager. Passwords are stored in GPG-encrypted files inside a simple directory tree, meant to become a password repository.

pass is then a utility to insert, display or copy to the clipboard the passwords stored in this Git repository. Using it is not mandatory, but it eases password management.

Assuming you have set the environment variables PASSWORD_STORE_{DIR,SIGNING_KEY}, the pass CLI usage can be summarized as follows:

 $> pass help      # Pass usage
 $> pass git pull  # Fetch latest passwords
 $> pass           # List passwords
 $> pass twitter/<accountname>         # Display a password
 $> pass -c twitter/<accountname>      # Copy a password to clipboard
 $> pass insert google/<accountname>   # Insert a new password
 $> pass git push                      # Push your changes

Notes: Git commit is done automatically by the pass utility. If you need to add comments in addition to the password, use the -m option to insert extra lines. A dedicated page will be made available for this tool.
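
For reference, the two environment variables mentioned above could be set as sketched below, typically in your shell profile. Both values are placeholders chosen for illustration: the directory is a hypothetical location for a shared password repository, and the signing key is expected to be the full fingerprint of the GPG key used to sign the store.

```shell
# Hypothetical location of the cloned password repository: ADAPT accordingly
export PASSWORD_STORE_DIR="$HOME/git/passwords"
# Placeholder: replace with the full fingerprint of the signing GPG key
export PASSWORD_STORE_SIGNING_KEY="<FULL_GPG_FINGERPRINT>"

echo "pass will use the store in $PASSWORD_STORE_DIR"
```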

Now if you are allergic to GnuPG and/or by extension git-crypt, here are a few other alternatives you can use to protect your sensitive data in a repository.

EncFS / GocryptFS / eCryptFS / Cryptomator / securefs / CryFS

All these open-source file encryption solutions are available for Linux (and thus Mac OS). In contrast to disk-encryption software that operates on whole disks (TrueCrypt, dm-crypt etc.), file encryption operates on individual files that can be backed up or synchronised easily, especially within a Git repository.

  • Comparison matrix
    • gocryptfs, aspiring successor of EncFS written in Go
    • EncFS, mature with known security issues
    • eCryptFS, integrated into the Linux kernel
    • Cryptomator, strong cross-platform support through Java and WebDAV
    • securefs, a cross-platform project implemented in C++.
    • CryFS, result of a master thesis at the KIT University that uses chunked storage to obfuscate file sizes.

Assuming your working copy is stored in /path/to/repo, your workflow (described below for EncFS, but it can be adapted to all the other tools) operates on encrypted vaults and would be as follows:

  • you ignore the mount directory (ex: vault/*) in the root .gitignore of the repository
    • this ensures neither you nor a collaborator will commit any unencrypted version of a file by mistake
  • you commit only the EncFS / GocryptFS / eCryptFS / Cryptomator / securefs / CryFS raw directory (ex: .crypt/) in your repository.
    • thus only the encrypted form of your files is committed
  • You create the EncFS / GocryptFS / eCryptFS / Cryptomator / securefs / CryFS encrypted vault
  • You prepare macros/scripts/Makefile/Rakefile tasks to lock/unlock the vault on demand

Here are a few examples of these operations in practice (for EncFS, adapt accordingly):

 $> cd /path/to/repo
 $> rawdir=.crypt      # /!\ ADAPT accordingly
 $> mountdir=vault     # /!\ ADAPT accordingly
 #
 # Ignore the mount dir
 $> echo $mountdir >> .gitignore
 #
 # Creation of an EncFS vault (only once)
 $> encfs --standard $rawdir $mountdir
 #
 # OR
 # Creation of a GoCryptFS vault (only once)
 $> gocryptfs -init $rawdir

 Tool       | OS     | Opening/Unlocking the vault                     | Closing/locking the vault
 -----------+--------+-------------------------------------------------+--------------------------
 EncFS      | Linux  | encfs -o nonempty --idle=60 $rawdir $mountdir   | fusermount -u $mountdir
 EncFS      | Mac OS | encfs --idle=60 $rawdir $mountdir               | umount $mountdir
 GocryptFS  |        | gocryptfs $rawdir $mountdir                     | as above

A dedicated page is available for this approach – see this page.
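
The lock/unlock tasks mentioned in the workflow can be as simple as two shell functions. The sketch below wraps the EncFS commands for Linux; the function names and the DRY_RUN switch are hypothetical, and the paths are made absolute as a precaution.

```shell
# Hypothetical helpers wrapping the EncFS commands (Linux variant).
# Set DRY_RUN=1 to print the commands instead of running them.
rawdir="$PWD/.crypt"     # /!\ ADAPT accordingly
mountdir="$PWD/vault"    # /!\ ADAPT accordingly

# Run the given command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

vault_unlock() { run encfs -o nonempty --idle=60 "$rawdir" "$mountdir"; }
vault_lock()   { run fusermount -u "$mountdir"; }

# Usage:
#   DRY_RUN=1 vault_unlock    # show what would be executed
#   vault_unlock && ls vault/ && vault_lock
```

On Mac OS, drop the -o nonempty option and use umount in vault_lock.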

Note: In a Puppet control repository relying on hiera, you can use the hiera-eyaml format.

File Encryption using SSH [RSA] Key Pairs

See dedicated post.