User Guide: Privacera Encryption Integration with Databricks Unity Catalog

This user guide explains how to integrate Privacera Encryption with Databricks Unity Catalog to enable encryption, decryption, and data masking. It provides step-by-step instructions to help you configure and use these features effectively.

Prerequisites

  • A Databricks workspace with Unity Catalog enabled.
  • Privacera Encryption for Databricks Unity Catalog enabled and integrated (single-user or shared cluster setup).
  • Encryption, presentation, and masking schemes created in Privacera Portal.
  • Scheme policies that grant users, groups, or roles the Protect, Unprotect, and Mask permissions they need.

For detailed prerequisites and setup by cluster type, see the Databricks Unity Catalog Encryption connector documentation.

1. Create System and Custom Schemes

1.1. Create System Schemes (Encryption, Presentation)

  1. Log in to the Privacera Portal
  2. Navigate to Encryption & Masking > Encryption & Masking
  3. Click Generate System Scheme
  4. Confirm the creation by clicking Yes
  5. System schemes will be created for:
    • Encryption
    • Presentation
  6. Review the list of default system schemes

1.2. Create Custom Schemes (Encryption, Presentation, Masking)

  1. Navigate to Encryption & Masking > Schemes
  2. Click Add Scheme
  3. Select the scheme type: Encryption, Presentation, or Masking
  4. Enter the required details and click Save

2. Create Scheme Policies

A user, group, or role can call the encryption UDFs only if a scheme policy grants it the matching permission: Protect, Unprotect, or Mask.

2.1. Protect Access (for encrypting data)

  1. Navigate to Access Management > Scheme Policies > PEG (privacera_peg)
  2. Under ACCESS, click Add New Policy
  3. Enter a policy name (e.g., Protect Access)
  4. Under Encryption Schemes, select the encryption schemes (e.g., SYSTEM_SSN, SYSTEM_EMAIL, SYSTEM_CREDITCARD, SYSTEM_ADDRESS)
  5. Under Grant Permission(s), grant Protect to the user
  6. Click Save

2.2. Unprotect Access (for decrypting data)

  1. Navigate to Access Management > Scheme Policies > PEG (privacera_peg)
  2. Under ACCESS, click Add New Policy
  3. Enter a policy name (e.g., Unprotect Access)
  4. Under Encryption Schemes, select the same encryption (and optionally presentation) schemes
  5. Under Grant Permission(s), grant Unprotect to the user
  6. Click Save

2.3. Mask Access (for masking data)

  1. Navigate to Access Management > Scheme Policies > PEG (privacera_peg)
  2. Under ACCESS, click Add New Policy
  3. Enter a policy name (e.g., Mask Policy)
  4. Under Protect Scheme, enter the masking schemes (e.g., MASK_SSN, MASK_EMAIL, MASK_ADDRESS)
  5. Under Grant Permission(s), grant Mask to the user
  6. Click Save

3. Encrypt Data

Call the protect UDF from the catalog and schema where the UDFs were created. In the examples that follow, replace <catalog>.<schema> with that location (e.g., hive_metastore.privacera for a single-user cluster, or your Unity Catalog catalog and schema for a shared cluster).
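
Before running the examples, it can help to confirm that the UDFs are visible at the location you plan to use. The following is a minimal sketch using standard Databricks SQL commands. It assumes the UDFs are registered under the names protect, unprotect, and mask, as used in this guide; the exact output depends on how the UDFs were created for your cluster type.

```sql
-- List user-defined functions in the UDF location.
-- <catalog>.<schema> is a placeholder; substitute your own values.
SHOW USER FUNCTIONS IN <catalog>.<schema>;

-- Inspect one UDF to check that it resolves and to see its signature.
DESCRIBE FUNCTION <catalog>.<schema>.protect;
```

If protect does not appear in the list, the UDFs were probably created in a different catalog or schema.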

Example: Create a table with sample data and encrypt a column

SQL
-- Create source table
CREATE TABLE IF NOT EXISTS <catalog>.<schema>.source_data (
  id INT,
  name STRING,
  email STRING
);

-- Insert sample data
INSERT INTO <catalog>.<schema>.source_data VALUES
  (1, 'Alice', 'alice@example.com'),
  (2, 'Bob', 'bob@example.com');

-- Encrypt the email column and create table (CTAS)
CREATE TABLE <catalog>.<schema>.encrypted_data AS
SELECT
  id,
  name,
  <catalog>.<schema>.protect(email, 'SYSTEM_EMAIL') AS encrypted_email
FROM <catalog>.<schema>.source_data;
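
To see the effect of the CTAS statement above, one option is to place each original value next to its encrypted counterpart. This is a sketch that assumes both source_data and encrypted_data exist as created above and that id uniquely identifies a row.

```sql
-- Compare plaintext and encrypted emails side by side.
-- Assumes the source_data and encrypted_data tables from the example above.
SELECT
  s.id,
  s.email           AS original_email,
  e.encrypted_email AS encrypted_email
FROM <catalog>.<schema>.source_data AS s
JOIN <catalog>.<schema>.encrypted_data AS e
  ON s.id = e.id
ORDER BY s.id;
```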

4. Decrypt Data

4.1. Decrypt without presentation scheme

Apply unprotect to the encrypted_data table created in section 3, passing NULL as the presentation scheme:

SQL
CREATE TABLE <catalog>.<schema>.decrypted_data AS
SELECT
  id,
  name,
  <catalog>.<schema>.unprotect(encrypted_email, 'SYSTEM_EMAIL', NULL) AS decrypted_email
FROM <catalog>.<schema>.encrypted_data;
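
A quick way to check the round trip is to count rows whose decrypted value differs from the original. The query below is a sketch that assumes the source_data table from section 3 is still available and that the calling user has Unprotect permission on SYSTEM_EMAIL. A count of 0 would suggest that every value decrypted back to its original form.

```sql
-- Count rows where the decrypted email does not match the original.
-- <=> is Spark SQL's null-safe equality operator, so NULL emails compare as equal.
SELECT COUNT(*) AS mismatched_rows
FROM <catalog>.<schema>.source_data AS s
JOIN <catalog>.<schema>.decrypted_data AS d
  ON s.id = d.id
WHERE NOT (s.email <=> d.decrypted_email);
```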

4.2. Decrypt with presentation scheme (formatted/obfuscated output)

Pass the presentation scheme name as the third argument to unprotect:

SQL
SELECT
  id,
  name,
  <catalog>.<schema>.unprotect(encrypted_email, 'SYSTEM_EMAIL', 'SYSTEM_PRESENTATION_EMAIL') AS decrypted_email
FROM <catalog>.<schema>.encrypted_data;
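
To compare the two forms of unprotect directly, both calls can be placed in one query. This sketch reuses the scheme names from the examples above and assumes the calling user has Unprotect permission on both the encryption scheme and the presentation scheme.

```sql
-- Show plain decryption next to presentation-scheme output for the same rows.
SELECT
  id,
  <catalog>.<schema>.unprotect(encrypted_email, 'SYSTEM_EMAIL', NULL) AS plain_email,
  <catalog>.<schema>.unprotect(encrypted_email, 'SYSTEM_EMAIL', 'SYSTEM_PRESENTATION_EMAIL') AS presented_email
FROM <catalog>.<schema>.encrypted_data
ORDER BY id;
```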

5. Mask Data

Use the mask UDF with a masking scheme on plaintext columns:

SQL
SELECT
  col1,
  col2,
  <catalog>.<schema>.mask(email, 'MASK_EMAIL') AS masked_email
FROM <catalog>.<schema>.<table>
LIMIT 10;
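
As a concrete instance of the template above, the same call can be applied to the source_data table from section 3. This is a sketch that assumes a custom masking scheme named MASK_EMAIL has been created and that the calling user has Mask permission on it. The masked output format depends on how that scheme is defined.

```sql
-- Mask the plaintext email column of the sample table.
-- MASK_EMAIL is assumed to be a custom masking scheme created in Privacera Portal.
SELECT
  id,
  name,
  <catalog>.<schema>.mask(email, 'MASK_EMAIL') AS masked_email
FROM <catalog>.<schema>.source_data
ORDER BY id;
```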

6. Summary

| Operation | UDF | Typical use |
|---|---|---|
| Encrypt | protect | Write sensitive data in encrypted form |
| Decrypt (without presentation) | unprotect | Read encrypted data (plain decryption) |
| Decrypt (with presentation) | unprotect | Read encrypted data with presentation scheme (obfuscated output) |
| Mask | mask | Show masked values on plaintext data |

For cluster-specific setup, UDF creation, and troubleshooting, see the Databricks Unity Catalog Encryption connector documentation.