Tutorial: Creating an Encapsulated Encrypted Data Set
This tutorial describes how to create a self-contained data set within a Java Maven project, encrypted using public-key encryption. A user who is authorized to access the data can then do so simply by adding the project as a Maven dependency in their own project, without any explicit key management.
A full example of an encapsulated encrypted data set project is available on GitHub.
Initial assumptions
- The Maven project ciesvium has been cloned locally, with the same parent directory as data-test, and built using mvn compile.
- The current working directory is the root directory of ciesvium.
- The data to be encrypted is initially stored in the plain-text, non-encrypted CSV file ~/plain_text.csv.
- The encrypted data is to be stored in an existing Maven project data-test within the Java package uk.ac.standrews.cs.data.
Generate public and private keys
If you don’t already have a PEM key pair, create one. For example, using OpenSSL on Unix:
pushd ~/.ssh
openssl genrsa -out private_key.pem 2048
chmod 600 private_key.pem
openssl rsa -in private_key.pem -pubout > public_key.pem
popd
The key pair will be used to encrypt and decrypt a symmetric (AES) key, which will itself be used to encrypt and decrypt the data.
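The two-level scheme described above can be sketched with the standard JDK crypto classes. This is a minimal illustration, not the code used by ciesvium: the class name HybridEncryptionSketch, the method roundTrip, and the choice of RSA-OAEP and AES-GCM are all assumptions made for the example.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;

// Hypothetical sketch: the cipher modes here are assumptions, not ciesvium's actual choices.
class HybridEncryptionSketch {

    // Encrypts the data with a fresh AES key, encrypts that key with an RSA
    // public key, then reverses both steps using the RSA private key.
    static byte[] roundTrip(byte[] plainText) throws Exception {

        KeyPairGenerator rsaGenerator = KeyPairGenerator.getInstance("RSA");
        rsaGenerator.initialize(2048);
        KeyPair keyPair = rsaGenerator.generateKeyPair();

        KeyGenerator aesGenerator = KeyGenerator.getInstance("AES");
        aesGenerator.init(128);
        SecretKey aesKey = aesGenerator.generateKey();

        // The public key encrypts only the small AES key.
        Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
        rsa.init(Cipher.ENCRYPT_MODE, keyPair.getPublic());
        byte[] encryptedKey = rsa.doFinal(aesKey.getEncoded());

        // The AES key encrypts the (possibly large) data.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, aesKey, new GCMParameterSpec(128, iv));
        byte[] cipherText = aes.doFinal(plainText);

        // An authorized user recovers the AES key with the private key, then the data.
        rsa.init(Cipher.DECRYPT_MODE, keyPair.getPrivate());
        SecretKey recoveredKey = new SecretKeySpec(rsa.doFinal(encryptedKey), "AES");
        aes.init(Cipher.DECRYPT_MODE, recoveredKey, new GCMParameterSpec(128, iv));
        return aes.doFinal(cipherText);
    }

    public static void main(String[] args) throws Exception {
        byte[] recovered = roundTrip("id,name\n1,example\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

Only the short AES key passes through RSA, so the size of the data is not limited by the RSA key length.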
Add public keys to the project
Public keys for the users authorized to access the encrypted data can be stored in a resource file within the project. It’s not essential to keep this file here, but doing so makes the keys easier to keep track of. Solely for documentation purposes, first add a user identifier (e.g. email address), which is ignored by the code:
mkdir -p ../data-test/src/main/resources/uk/ac/standrews/cs/data
echo graham.kirby@st-andrews.ac.uk >> ../data-test/src/main/resources/uk/ac/standrews/cs/data/authorized_keys.txt
Copy your public key file:
cat ~/.ssh/public_key.pem >> ../data-test/src/main/resources/uk/ac/standrews/cs/data/authorized_keys.txt
Example:
graham.kirby@st-andrews.ac.uk
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzTDV8GGUcZByuw2zRu8+
SEbJTg+lT9Vx8H+5N/BNUViHVZb+zToQdzwnRE2vqQAdRfLwoNBHoiD+buUivy+l
2QOizY9Qs9X4952yWeGeSU8zo/hImtyM5vAi9nG+llKuFRHv3S7GKJW1shIuauG3
9dRWvSzDDhJaGTuH/gG0WPw0k+7sR3t473R5DD5bfx2SVprGPWP9r4ETo2u5Qqw+
7/pkLOdKw46qlMGVV/NlrEq89gpRenbQ8fSKHhakIhIcAMMmImqpTzbhidA7cMe/
HIE9ckCBYundUJOZD7L7AZCbxkKmscxtlljaWyqIGg79pOF++dD9NOSuSL35IIgr
twIDAQAB
-----END PUBLIC KEY-----
Repeat with the public keys for any other users who should be able to access the data.
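For illustration, a file in this layout could be read in Java roughly as follows. This is a sketch under assumptions: AuthorizedKeysSketch, parse and toPem are hypothetical names, and the actual parsing done by ciesvium may differ. Lines outside a BEGIN/END pair, such as the user identifiers, are skipped.

```java
import java.security.KeyFactory;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical sketch of reading PEM public keys; not ciesvium's own parser.
class AuthorizedKeysSketch {

    private static final String BEGIN = "-----BEGIN PUBLIC KEY-----";
    private static final String END = "-----END PUBLIC KEY-----";

    // Collects the Base64 lines between each BEGIN/END pair and ignores
    // everything else, such as user identifier lines.
    static List<PublicKey> parse(String text) throws Exception {
        KeyFactory factory = KeyFactory.getInstance("RSA");
        List<PublicKey> keys = new ArrayList<>();
        StringBuilder body = null;

        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.equals(BEGIN)) {
                body = new StringBuilder();
            } else if (trimmed.equals(END) && body != null) {
                byte[] der = Base64.getDecoder().decode(body.toString());
                keys.add(factory.generatePublic(new X509EncodedKeySpec(der)));
                body = null;
            } else if (body != null) {
                body.append(trimmed);
            }
        }
        return keys;
    }

    // Formats a public key as PEM text, used here only to build sample input.
    static String toPem(PublicKey key) {
        String body = Base64.getMimeEncoder(64, new byte[] {'\n'}).encodeToString(key.getEncoded());
        return BEGIN + "\n" + body + "\n" + END + "\n";
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        String text = "user@example.org\n" + toPem(generator.generateKeyPair().getPublic());
        System.out.println("Keys parsed: " + parse(text).size());
    }
}
```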
Generate new symmetric key
Generate a new AES key and encrypt it separately using each of the authorized public keys, storing the resulting encrypted versions in a resource file:
src/main/scripts/generate-and-encrypt-aes-key.sh ../data-test/src/main/resources/uk/ac/standrews/cs/data/authorized_keys.txt ../data-test/src/main/resources/uk/ac/standrews/cs/data/encrypted_key.txt
Example:
UX3+4gkpe51+9tOhDBnaQ/7JIjPylqdhruQL3kzAHYBPJkrQwVqwcDQYDAHqcaE5+00XHXkb1HiT
/vO7W2HmAT8mkJMBVje054KXJ7SM1RRAwcKaUI6oXVjs/qJx0ZZszn19SMPTaBxjrS9suwnUZD9+
NXkEAHiBlsO3Jg5+ef/OQcAaVco6qgyfmMUuWP0PmnhkE7u2dIlp4nK7CV6fzTDs9cHL81qAba4H
igOn3LBekVK9O1ka8OJPxJVM1NvQahoV2Cf1zgO79htVIlrDJULU2e1DNhYhaIe+YR6Zs1udVipN
WKU0p+JREtn0y8WHHhg8NVg5FtvwwHuv7sMx4A==
If you’re curious, you can print out the AES key:
src/main/scripts/decrypt-aes-key.sh ../data-test/src/main/resources/uk/ac/standrews/cs/data/encrypted_key.txt
Don’t add the decrypted key to the project!
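The idea of encrypting one AES key separately for each authorized user can be sketched in Java as follows. The names MultiRecipientKeySketch, encryptForAll and decryptAny are hypothetical, and the padding scheme and file format used by the ciesvium scripts are not assumed to match. Each user tries every entry with their own private key until one decrypts.

```java
import javax.crypto.BadPaddingException;
import javax.crypto.Cipher;
import javax.crypto.IllegalBlockSizeException;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical sketch; the real scripts may use a different padding and layout.
class MultiRecipientKeySketch {

    private static final String RSA_TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Produces one Base64 entry per public key, each holding the same AES key.
    static List<String> encryptForAll(SecretKey aesKey, List<PublicKey> publicKeys)
            throws GeneralSecurityException {
        List<String> entries = new ArrayList<>();
        for (PublicKey publicKey : publicKeys) {
            Cipher rsa = Cipher.getInstance(RSA_TRANSFORMATION);
            rsa.init(Cipher.ENCRYPT_MODE, publicKey);
            entries.add(Base64.getEncoder().encodeToString(rsa.doFinal(aesKey.getEncoded())));
        }
        return entries;
    }

    // Tries each entry in turn; entries encrypted for other users fail to decrypt.
    static SecretKey decryptAny(List<String> entries, PrivateKey privateKey)
            throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance(RSA_TRANSFORMATION);
        for (String entry : entries) {
            rsa.init(Cipher.DECRYPT_MODE, privateKey);
            try {
                return new SecretKeySpec(rsa.doFinal(Base64.getDecoder().decode(entry)), "AES");
            } catch (BadPaddingException | IllegalBlockSizeException e) {
                // This entry was encrypted for a different key pair; try the next one.
            }
        }
        throw new GeneralSecurityException("no entry could be decrypted with this private key");
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair first = generator.generateKeyPair();
        KeyPair second = generator.generateKeyPair();

        KeyGenerator aesGenerator = KeyGenerator.getInstance("AES");
        aesGenerator.init(128);
        SecretKey aesKey = aesGenerator.generateKey();

        List<String> entries = encryptForAll(aesKey, List.of(first.getPublic(), second.getPublic()));
        SecretKey recovered = decryptAny(entries, second.getPrivate());
        System.out.println("Recovered key matches: " + aesKey.equals(recovered));
    }
}
```

Adding a new user under this scheme means adding one more encrypted entry for the same AES key; the encrypted data itself does not need to change.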
Encrypt the data
Encrypt the data file using the encrypted AES key, storing the resulting encrypted version in a resource file:
src/main/scripts/encrypt-file-with-encrypted-aes-key.sh ../data-test/src/main/resources/uk/ac/standrews/cs/data/encrypted_key.txt ~/plain_text.csv ../data-test/src/main/resources/uk/ac/standrews/cs/data/plain_text.csv.enc
Define a data access class
Create a class in data-test to access the encrypted data, containing references to the encrypted data file and the encrypted versions of the key for the data. The package containing the class needs to correspond exactly to the directory structure containing the resource files:
package uk.ac.standrews.cs.data;

import uk.ac.standrews.cs.utilities.dataset.encrypted.EncryptedDataSet;

public class ExampleDataSet extends EncryptedDataSet {

    public ExampleDataSet() throws Exception {

        super(
            ExampleDataSet.class.getResourceAsStream("plain_text.csv.enc"),
            ExampleDataSet.class.getResourceAsStream("encrypted_key.txt"));
    }
}
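The reason the package has to match the resource directory is the way Class.getResourceAsStream resolves names: a name without a leading slash is looked up relative to the package of the class. The sketch below mirrors that documented rule for ordinary classes; ResourcePathSketch and resolvedResourcePath are hypothetical names used only for illustration.

```java
// Hypothetical helper that mirrors the documented name resolution of
// Class.getResourceAsStream for ordinary (non-array) classes.
class ResourcePathSketch {

    // Returns the classpath-relative path that would be searched for the given name.
    static String resolvedResourcePath(Class<?> owner, String name) {
        if (name.startsWith("/")) {
            // A leading slash makes the name absolute within the classpath.
            return name.substring(1);
        }
        String packageName = owner.getPackageName();
        return packageName.isEmpty() ? name : packageName.replace('.', '/') + "/" + name;
    }

    public static void main(String[] args) {
        // java.util.List stands in for a class declared in some package.
        System.out.println(resolvedResourcePath(java.util.List.class, "encrypted_key.txt"));
        System.out.println(resolvedResourcePath(java.util.List.class, "/encrypted_key.txt"));
    }
}
```

For a class in the package uk.ac.standrews.cs.data, the relative name encrypted_key.txt therefore resolves to uk/ac/standrews/cs/data/encrypted_key.txt, which is where the earlier steps placed the resource files under src/main/resources.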
Install and/or deploy the project as appropriate. Authorized users will now be able to access the data by instantiating this class.
Use the encrypted data
In some arbitrary class:
package uk.ac.standrews.cs.test;

import uk.ac.standrews.cs.data.ExampleDataSet;
import uk.ac.standrews.cs.utilities.dataset.DataSet;

public class ExampleDataSetUse {

    public static void main(String[] args) throws Exception {

        DataSet my_data = new ExampleDataSet();
        my_data.print(System.out);
    }
}
The full project containing this example is available on GitHub.