Setup - Single-User Cluster Encryption

This guide provides instructions for setting up Privacera Encryption on Databricks Unity Catalog single-user clusters using Java UDFs.

Prerequisites

Before starting, confirm that all Common Prerequisites are met.

Step 1: Create Init Script

Create an init script based on your deployment type.

  1. Open Databricks Web UI
  2. Navigate to Workspace files
  3. Create an init script file (e.g., /Workspace/Shared/encryption/init_script.sh) with the following content:
Bash
#!/bin/bash
# Use the value set during the installation of Privacera
export DEPLOYMENT_ENV_NAME="<DEPLOYMENT_ENV_NAME>"

# Trace each command so failures are visible in the init script log
set -x

export ENABLE_SSL=true

# Copy the PEG init script from the deployment path, then run it
cp "/dbfs/<deployment_path>/${DEPLOYMENT_ENV_NAME}/privacera-dbx-udf-pegv2-conf/peg_init_script.sh" peg_init_script.sh
chmod +x peg_init_script.sh
./peg_init_script.sh

Note

  • Replace <DEPLOYMENT_ENV_NAME> with the value configured in vars.privacera.yml on the Privacera Manager host.
  • Replace <deployment_path> with your actual deployment path.
  • Ensure that vars.databricks.plugin.yml in custom-vars sets DATABRICKS_MANAGE_INIT_SCRIPT: "true".
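
The copy step in the init script above fails if either placeholder is replaced incorrectly. A minimal sketch of a pre-flight check, assuming the same directory layout as that script; the function names peg_init_script_path and check_peg_init_script are hypothetical and not part of Privacera's tooling:

```shell
#!/bin/bash
# Hypothetical helper: build the expected location of peg_init_script.sh.
# $1 = deployment root (for example /dbfs/<deployment_path>)
# $2 = value of DEPLOYMENT_ENV_NAME
peg_init_script_path() {
  local root="$1" env_name="$2"
  printf '%s/%s/privacera-dbx-udf-pegv2-conf/peg_init_script.sh\n' "$root" "$env_name"
}

# Hypothetical helper: return 0 if the script exists and is non-empty,
# otherwise print a message to stderr and return 1.
check_peg_init_script() {
  local path
  path="$(peg_init_script_path "$1" "$2")"
  if [ ! -s "$path" ]; then
    echo "peg_init_script.sh not found or empty at: $path" >&2
    return 1
  fi
  return 0
}
```

Calling check_peg_init_script with the deployment root and "${DEPLOYMENT_ENV_NAME}" before the cp line, and exiting on a non-zero status, would surface a wrong path as a clear message in the init script log.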

For deployments that download the init script from Privacera Portal, use these steps instead:

  1. Navigate to Privacera Portal
  2. Go to Application > Databricks
  3. Download the script privacera_databricks.sh
  4. Modify the script by commenting out the following lines:
Bash
# Comment out these lines
#wget "${API_SERVER_URL}/${API_KEY}/databricks/init_saas.sh" -O init_saas.sh
#chmod a+x init_saas.sh
#./init_saas.sh "${API_KEY}" "${API_SERVER_URL}"
#cp /databricks/jars/ranger-spark-plugin-impl/ranger-spark-plugin-*.jar /databricks/jars/
#echo "`date`: Script Execution End"
  5. Add the following content at the end of the script:
Bash
# Base URL for the PEG setup scripts
export PEG_SETUP_SCRIPT_DOWNLOAD_URL="${API_SERVER_URL}/${API_KEY}/peg_scripts"
export PEG_INTEGRATION_TYPE="databricks"
echo "Downloading peg_setup.sh to databricks" >> "${OUTPUT_FILE}"
# Download and run the PEG setup script
wget "${PEG_SETUP_SCRIPT_DOWNLOAD_URL}/peg_setup.sh" -O peg_setup.sh
chmod a+x peg_setup.sh
./peg_setup.sh

echo "$(date): Script Execution End"
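
The appended lines rely on API_SERVER_URL, API_KEY, and OUTPUT_FILE, which are assumed here to be defined earlier in privacera_databricks.sh. A minimal sketch of a guard that stops early when one of them is missing; the function name require_vars is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: fail if any named variable is unset or empty.
# Arguments are variable names, not values.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} expands the variable whose name is stored in $name
    if [ -z "${!name:-}" ]; then
      echo "Required variable is not set: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible placement, just before the peg_setup.sh download:
# require_vars API_SERVER_URL API_KEY OUTPUT_FILE || exit 1
```

Without such a guard, an empty API_KEY would only show up as a failed download of peg_setup.sh, which is harder to diagnose from the cluster logs.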

Note

Ensure encryption is enabled under account settings in Privacera Portal.

Step 2: Create Single-User Cluster

  1. In Databricks, create a new single-user cluster
  2. Add the init script:
    • Select Workspace
    • Enter the path to your init script (e.g., /Workspace/Shared/encryption/init_script.sh)
  3. Save and start the cluster
  4. Wait for the cluster to start successfully

Step 3: Create UDFs

Once the cluster is running with the Privacera init script, execute the following SQL commands to create the UDFs.

Note

This is a one-time setup. It can be run from any cluster that has the init script attached.

Create Database

SQL
CREATE DATABASE IF NOT EXISTS hive_metastore.privacera;

Create UDFs

Check if functions exist, drop them if they do, then create new ones:

SQL
-- Check if functions exist
SHOW FUNCTIONS IN hive_metastore.privacera LIKE 'protect';
SHOW FUNCTIONS IN hive_metastore.privacera LIKE 'unprotect';
SHOW FUNCTIONS IN hive_metastore.privacera LIKE 'mask';

-- Drop existing functions if they exist
DROP FUNCTION IF EXISTS hive_metastore.privacera.protect;
DROP FUNCTION IF EXISTS hive_metastore.privacera.unprotect;
DROP FUNCTION IF EXISTS hive_metastore.privacera.mask;

-- Create UDFs
CREATE FUNCTION hive_metastore.privacera.protect AS 'com.privacera.encryption.hive.PrivaceraEncryptUDF';
CREATE FUNCTION hive_metastore.privacera.unprotect AS 'com.privacera.encryption.hive.PrivaceraDecryptUDF';
CREATE FUNCTION hive_metastore.privacera.mask AS 'com.privacera.encryption.hive.PrivaceraMaskUDF';
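
The drop-and-create pair follows the same pattern for each function. As a sketch only, the statements can be generated from a name-to-class list, which may help when the setup is scripted; the function name emit_udf_sql is hypothetical, and the database and class names are the ones used above:

```shell
#!/bin/bash
# Hypothetical helper: print DROP and CREATE statements for each UDF
# listed in this guide, one statement per line.
emit_udf_sql() {
  local db="hive_metastore.privacera"
  local pkg="com.privacera.encryption.hive"
  local pair name class
  for pair in protect:PrivaceraEncryptUDF unprotect:PrivaceraDecryptUDF mask:PrivaceraMaskUDF; do
    name="${pair%%:*}"   # text before the colon: SQL function name
    class="${pair##*:}"  # text after the colon: Java class name
    printf 'DROP FUNCTION IF EXISTS %s.%s;\n' "$db" "$name"
    printf "CREATE FUNCTION %s.%s AS '%s.%s';\n" "$db" "$name" "$pkg" "$class"
  done
}
```

Running emit_udf_sql > create_udfs.sql writes six statements that can be pasted into a notebook SQL cell on the cluster.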

Step 4: Create Scheme Policy

Add a scheme policy so that users, groups, or roles can call the encryption UDFs. Without this policy, calls to protect, unprotect, or mask will be denied.

  1. In Privacera Portal, go to Access Management > Scheme Policies
  2. Choose the PEG service context
  3. Click Add New Policy
  4. Enter a policy name and description
  5. Select the target encryption (and optionally presentation or masking) scheme(s)
  6. Assign Protect and Unprotect to the users, groups, or roles that will run the UDFs; if using masking, also assign Mask
  7. Save the policy

Step 5: Use Encryption UDFs

Encrypt Data

SQL
SELECT 
  col1, 
  col2, 
  hive_metastore.privacera.protect(col3, 'SCHEME_NAME') AS encrypted_col
FROM 
  catalog.schema.table;

Decrypt Data

Without presentation scheme:

SQL
SELECT 
  col1, 
  col2, 
  hive_metastore.privacera.unprotect(encrypted_col3, 'SCHEME_NAME') AS decrypted_col3
FROM 
  catalog.schema.table;

With presentation scheme:

SQL
SELECT 
  col1, 
  col2, 
  hive_metastore.privacera.unprotect(encrypted_col3, 'SCHEME_NAME', 'PRESENTATION_SCHEME_NAME') AS decrypted_col3
FROM 
  catalog.schema.table;

Mask Data

SQL
SELECT 
  col1, 
  col2, 
  hive_metastore.privacera.mask(col3, 'MASKING_SCHEME_NAME') AS masked_col3
FROM 
  catalog.schema.table;