Privacera Platform master publication

Basic setup for Databricks encryption and masking

This section describes how to install and configure the Privacera Encryption jar file in Databricks, how to create user-defined functions (UDFs) for encryption and masking, and how to create policies for users and groups.

The overall approach is as follows:

  1. Install the Privacera Manager encryption jar in Databricks with the Databricks CLI or UI.

  2. Upload the Privacera Manager configuration files to Databricks.

  3. Define UDFs in Databricks to call the Privacera Manager encryption protect and unprotect methods.

Prerequisites
  1. In Databricks, make sure that the users who will use the UDFs have sufficient access to write to the pertinent tables.

  2. In Privacera Manager, make sure to configure the Databricks datasource: Databricks Spark Plugin (Python/SQL) on AWS, Azure, or GCP.

  3. In Privacera Manager, make sure that Privacera Encryption has been enabled.

  4. In Privacera Manager, make sure that the users who will use the UDFs in Databricks have been given permission to access the encryption scheme policies that are part of the UDF syntax.

  5. In Privacera Manager, make sure that these same users have been given permission to access the encryption keys in the Ranger KMS.

Methods for Installing Encryption jar

You can install the Privacera encryption jar file in either of the following ways: with the Databricks CLI or through the Databricks UI.

After you install the jar file, you need to define some configuration properties and User-Defined Functions (UDFs) to call the Privacera encryption /protect and /unprotect API endpoints.

Install Encryption jar via Databricks CLI
  1. Download the jar to a local machine.

    The value of PRIVACERA_BASE_DOWNLOAD_URL depends on the version of the Privacera software you want. See Configure and Install Core Services.

    export PRIVACERA_BASE_DOWNLOAD_URL=<PRIVACERA_BASE_DOWNLOAD_URL>
    wget ${PRIVACERA_BASE_DOWNLOAD_URL}/privacera-crypto-jar-with-dependencies.jar -O privacera-crypto-jar-with-dependencies.jar
    
  2. Upload the jar file to DBFS or to an S3 location that the Databricks cluster can access.

  3. With the Databricks CLI, upload the jar into DBFS:

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/jars
    databricks fs cp privacera-crypto-jar-with-dependencies.jar dbfs:/privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar
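
The CLI steps above can be chained into one script. The following is a minimal sketch, not a Privacera-supplied tool: the DRY_RUN variable and the run and install_crypto_jar helpers are hypothetical names introduced here for illustration. It assumes that wget and the Databricks CLI are installed and configured. By default it only prints the commands it would run.

```shell
# Sketch only: chains the download and upload steps from this section.
# DRY_RUN, run, and install_crypto_jar are illustrative names, not part of
# Privacera or the Databricks CLI.
set -eu

JAR=privacera-crypto-jar-with-dependencies.jar
DBFS_JAR_DIR=dbfs:/privacera/crypto/jars
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; set to 0 to execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_crypto_jar() {
  # Fail early if the base download URL has not been set.
  : "${PRIVACERA_BASE_DOWNLOAD_URL:?set PRIVACERA_BASE_DOWNLOAD_URL first}"
  # Strip one trailing slash, if present, so the URL is joined cleanly.
  run wget "${PRIVACERA_BASE_DOWNLOAD_URL%/}/${JAR}" -O "${JAR}"
  run databricks fs mkdirs "${DBFS_JAR_DIR}"
  run databricks fs cp "${JAR}" "${DBFS_JAR_DIR}/${JAR}"
  # List the target directory to confirm that the jar arrived.
  run databricks fs ls "${DBFS_JAR_DIR}"
}

# Usage (after setting PRIVACERA_BASE_DOWNLOAD_URL):
#   install_crypto_jar             # prints the four commands
#   DRY_RUN=0; install_crypto_jar  # actually runs them
```

Defaulting to a dry run is a deliberate choice in this sketch, so that you can review the exact commands before anything is downloaded or copied into DBFS.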

Install Encryption jar via Databricks UI
  1. Go to the Databricks cluster details page: Clusters > cluster name > Libraries.

  2. Click Install > New.

  3. Drag and drop or upload the jar file.

    dbfs:/privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar

    Wait until the jar file is installed.

Create and Upload Encryption Configuration Files

The steps here rely on the default location of the Privacera crypto properties file. However, you can change this location to a directory of your choice. Follow the steps here and then see Custom Path to Crypto Properties File in Databricks.

  1. Create the configuration file on your local machine. You upload the file to the Databricks cluster in the next step.

    mkdir -p privacera/crypto/configs
    cd privacera/crypto/configs
    # Open crypto_default.properties in an editor and set the properties listed below.
    vi crypto_default.properties

    # Contents of crypto_default.properties:
    privacera.portal.base.url=http://<APP_HOSTNAME>:6868
    privacera.portal.username=<SOME_USERNAME>
    privacera.portal.password=<SOME_PASSWORD>
    # Mode of encryption/decryption: rpc or native
    privacera.crypto.mode=native
    
  2. Upload the configuration file to DBFS.

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/configs
    databricks fs cp crypto_default.properties dbfs:/privacera/crypto/configs/crypto_default.properties
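
If you prefer to create the configuration file without an interactive editor, for example from a provisioning script, the file can be written with a here-document. The following is a minimal sketch under that assumption. The write_crypto_config function and its argument order are hypothetical and introduced here only for illustration. The property names and the port are the ones shown in step 1.

```shell
# Sketch only: writes crypto_default.properties non-interactively instead of
# editing it with vi. write_crypto_config is an illustrative helper, not
# Privacera tooling.
set -eu

write_crypto_config() {
  # $1 = portal hostname, $2 = portal username,
  # $3 = portal password, $4 = output directory
  mkdir -p "$4"
  cat > "$4/crypto_default.properties" <<EOF
privacera.portal.base.url=http://$1:6868
privacera.portal.username=$2
privacera.portal.password=$3
# Mode of encryption/decryption: rpc or native
privacera.crypto.mode=native
EOF
  # The file contains a password, so restrict local access to the current user.
  chmod 600 "$4/crypto_default.properties"
}

# Usage, with the same placeholders as in step 1:
#   write_crypto_config "<APP_HOSTNAME>" "<SOME_USERNAME>" "<SOME_PASSWORD>" privacera/crypto/configs
#   databricks fs cp privacera/crypto/configs/crypto_default.properties dbfs:/privacera/crypto/configs/crypto_default.properties
```

Passing the values as arguments keeps the password out of the script itself. The chmod call protects only the local copy and does not change permissions in DBFS.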