Advanced Configuration for Access Management for Databricks all-purpose compute clusters with Fine-Grained Access Control (FGAC)

JWT Auth Configuration

By default, Privacera uses the Databricks login user for authorization. However, we also support JWT (JSON Web Token) integration, which uses the user and groups from the JWT payload instead of the Databricks login user.

The following steps describe how to configure JWT token integration.

Configuration

Prerequisites:

  • The username used in the client_id claim and the group names used in the scope claim of the JWT payload must already exist in the Users/Groups/Roles section of the Privacera Access Management Portal (see the example payload below).
  • These users or groups must be granted the required permissions in the Ranger policies for access control.
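
For reference, you can inspect these claims by decoding the token payload. This is a minimal sketch using only the Python standard library; the token value is a hypothetical placeholder, and it assumes userKey=client_id and groupKey=scope as in the configurations below:

Python
import base64
import json

# Hypothetical token; replace with a real JWT issued by your identity provider.
token = "<jwt_token>"

# A JWT is three base64url-encoded parts: header.payload.signature.
payload_b64 = token.split(".")[1]
payload_b64 += "=" * (-len(payload_b64) % 4)        # restore base64 padding
payload = json.loads(base64.urlsafe_b64decode(payload_b64))

# These are the claims Privacera reads when userKey=client_id and groupKey=scope.
print("user  :", payload.get("client_id"))          # must exist as a user in the portal
print("groups:", payload.get("scope", "").split())  # must exist as groups in the portal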

Set the following common properties in the Spark configuration of the Databricks cluster:

Bash
spark.hadoop.privacera.jwt.oauth.enable true
spark.hadoop.privacera.jwt.token /tmp/jwttoken.dat

SSH to the instance where Privacera Manager is installed.

To enable JWT, copy vars.jwt-auth.yaml from sample-vars to custom-vars:

Bash
cd ~/privacera/privacera-manager/config
cp sample-vars/vars.jwt-auth.yaml custom-vars
vi custom-vars/vars.jwt-auth.yaml

Static public key JWT:

  1. Configure static public key
    • Add the following properties to the vars.jwt-auth.yaml file. For a single static key, use only one entry in JWT_CONFIGURATION_LIST; for multiple static keys, add multiple entries:
      YAML
      JWT_CONFIGURATION_LIST:
      - index: 0
        issuer: "https://example.com/issuer"
        # subject: "<PLEASE_CHANGE>"
        # secret: "<PLEASE_CHANGE>"
        userKey: "client_id"
        groupKey: "scope"
        parserType: "PING_IDENTITY"
        publickey: "jwttoken1.pub"
      
      - index: 1
        issuer: "https://example.com/issuer2"
        # subject: "<PLEASE_CHANGE>"
        # secret: "<PLEASE_CHANGE>"
        userKey: "client_id"
        groupKey: "scope"
        parserType: "PING_IDENTITY"
        publickey: "jwttoken2.pub"
      
    • Add the public keys to the JWT token files in config/custom-properties:
      Bash
      cd ~/privacera/privacera-manager/config/custom-properties
      
      vi jwttoken1.pub
      vi jwttoken2.pub
      
    • For a single static key, you only need to create one public key file (for example, jwttoken1.pub). A quick way to check that a token matches its public key is sketched after this list.
    • Once the properties are configured, run the Privacera Manager setup and install actions (refer to the Privacera Manager setup documentation).
    • Use the updated ranger_enable.sh script in Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.
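
If you want to confirm that a token really matches the public key placed in jwttoken1.pub, a minimal offline check can be done with the PyJWT library. This is a sketch, not part of the Privacera setup; the token, key file name, algorithm, and issuer are assumptions based on the example configuration above:

Python
import jwt  # PyJWT

token = "<jwt_token>"  # a token issued by your identity provider

# Load the same PEM-encoded public key that was copied to config/custom-properties.
with open("jwttoken1.pub") as f:
    public_key = f.read()

try:
    claims = jwt.decode(
        token,
        key=public_key,
        algorithms=["RS256"],                 # adjust to your signing algorithm
        issuer="https://example.com/issuer",  # must match the configured issuer
        options={"verify_aud": False},        # audience is not checked in this sketch
    )
    print("Signature OK; user:", claims.get("client_id"))
except jwt.InvalidTokenError as exc:
    print("Token rejected:", exc)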

Dynamic Public Key JWT:

  1. Configure Dynamic Public Key

    • Add the following properties to the vars.jwt-auth.yaml file (a sketch of how the public-key provider endpoint is queried follows this list):

      YAML
      JWT_CONFIGURATION_LIST:
      - index: 0
        issuer: "https://example.com/issuer"
        # subject: "<PLEASE_CHANGE>"
        # secret: "<PLEASE_CHANGE>"
        userKey: "client_id"
        groupKey: "scope"
        parserType: "PING_IDENTITY"
      
        pubKeyProviderEndpoint: "https://<JWKS-provider>/get_public_key?kid="
        pubKeyProviderAuthType: "BASIC"
        pubKeyProviderAuthUserName: "<username>"
        pubKeyProviderAuthTypePassword: "<password>"
        pubKeyProviderJsonResponseKey: "x5c"
        jwtTokenProviderKeyId: "kid"
      

    • Once the properties are configured, run the following commands to generate and upload the configuration:

      Bash
      cd ~/privacera/privacera-manager
      
      ./privacera-manager.sh post-install
      

    • Use the updated ranger_enable.sh script in Databricks cluster creation.

    • Click on Start or, if the cluster is running, click on Confirm and Restart.
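
For context, with a dynamic configuration the signing key is resolved at runtime by calling the configured endpoint with the key id (kid) taken from the token header. The sketch below shows roughly how such a lookup can be reproduced with the requests and PyJWT libraries; the endpoint, credentials, and x5c response key are the placeholder values from the configuration above:

Python
import jwt       # PyJWT
import requests

token = "<jwt_token>"
endpoint = "https://<JWKS-provider>/get_public_key?kid="

# The key id travels in the (unverified) JWT header, as configured via jwtTokenProviderKeyId.
kid = jwt.get_unverified_header(token)["kid"]

# BASIC auth, mirroring pubKeyProviderAuthType, UserName, and Password.
resp = requests.get(endpoint + kid, auth=("<username>", "<password>"), timeout=10)
resp.raise_for_status()

# pubKeyProviderJsonResponseKey names the field that holds the key material.
print(resp.json()["x5c"])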

Static and Dynamic public keys JWT:

  1. Configure Static and Dynamic public keys
    • Add the following properties to the vars.jwt-auth.yaml file:
      YAML
      JWT_CONFIGURATION_LIST:
      - index: 0
        issuer: "https://example.com/issuer"
        # subject: "<PLEASE_CHANGE>"
        # secret: "<PLEASE_CHANGE>"
        userKey: "client_id"
        groupKey: "scope"
        parserType: "PING_IDENTITY"
        publickey: "jwttoken.pub"
      
      - index: 1
        issuer: "https://example.com/issuer"
        # subject: "<PLEASE_CHANGE>"
        # secret: "<PLEASE_CHANGE>"
        userKey: "client_id"
        groupKey: "scope"
        parserType: "PING_IDENTITY"
      
        pubKeyProviderEndpoint: "https://<JWKS-provider>/get_public_key?kid="
        pubKeyProviderAuthType: "BASIC"
        pubKeyProviderAuthUserName: "<username>"
        pubKeyProviderAuthTypePassword: "<password>"
        pubKeyProviderJsonResponseKey: "x5c"
        jwtTokenProviderKeyId: "kid"
      
    • Add the static JWT public key to the jwttoken.pub file in config/custom-properties:
      Bash
      cd ~/privacera/privacera-manager/config/custom-properties
      
      vi jwttoken.pub
      
    • Once the property is configured, run the following commands to generate and upload the configuration:
      Bash
      cd ~/privacera/privacera-manager
      
      ./privacera-manager.sh post-install
      
    • Use the updated ranger_enable.sh script in Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.

Set the following common properties in the Spark configuration of the Databricks cluster:

Bash
spark.hadoop.privacera.jwt.oauth.enable true
spark.hadoop.privacera.jwt.token /tmp/jwttoken.dat

Static public key JWT:

  1. Copy JWT Public Keys to Local Cluster File Path

    • Upload the JWT Public Key:
      • First, upload the jwttoken.pub file containing the JWT public key to the DBFS or workspace location.
      • For example, upload the key to /dbfs/user/jwt/keys.
    • Update the Init Script:
      • To copy the public keys to the local cluster file path, update the init script with the following commands:
        Bash
        export JWT_TOKEN_PUBLIC_KEY_DBFS_PATH="/dbfs/user/jwt/keys/."
        export JWT_TOKEN_PUBLIC_KEY_LOCAL_PATH="/tmp"
        
        cp -r ${JWT_TOKEN_PUBLIC_KEY_DBFS_PATH} ${JWT_TOKEN_PUBLIC_KEY_LOCAL_PATH}
        
      • This script sets the paths for the public keys in DBFS and the local cluster, then copies the keys from DBFS to the local path.
  2. Configure single static public key

    • Add the following properties to the Spark configuration of the Databricks cluster, along with the common properties:
      Bash
      spark.hadoop.privacera.jwt.0.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.0.token.userKey client_id
      spark.hadoop.privacera.jwt.0.token.groupKey scope
      spark.hadoop.privacera.jwt.0.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.0.token.publickey /tmp/jwttoken0.pub
      
    • Save the changes and click on Start or, if the cluster is running, click on Confirm and Restart.
  3. Configure multiple static public keys

    • Add the following properties to the Spark configuration of the Databricks cluster, along with the common properties:
      Bash
      spark.hadoop.privacera.jwt.0.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.0.token.userKey client_id
      spark.hadoop.privacera.jwt.0.token.groupKey scope
      spark.hadoop.privacera.jwt.0.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.0.token.publickey /tmp/jwttoken.pub
      
      spark.hadoop.privacera.jwt.1.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.1.token.userKey client_id
      spark.hadoop.privacera.jwt.1.token.groupKey scope
      spark.hadoop.privacera.jwt.1.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.1.token.publickey /tmp/jwttoken1.pub
      
      spark.hadoop.privacera.jwt.2.token.parserType KEYCLOAK
      spark.hadoop.privacera.jwt.2.token.userKey client_id
      spark.hadoop.privacera.jwt.2.token.groupKey scope
      spark.hadoop.privacera.jwt.2.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.2.token.publickey /tmp/jwttoken2.pub
      
    • Save the changes and click on Start or, if the cluster is running, click on Confirm and Restart.

Dynamic public key JWT:

  1. Configure single dynamic public key

    • Add the following properties to the Spark configuration of the Databricks cluster, along with the common properties:
      Bash
      spark.hadoop.privacera.jwt.0.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.0.token.userKey client_id
      spark.hadoop.privacera.jwt.0.token.groupKey scope
      spark.hadoop.privacera.jwt.0.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.0.token.publickey.provider.url https://<JWKS-provider>/get_public_key?kid=
      spark.hadoop.privacera.jwt.0.token.publickey.provider.auth.type basic
      spark.hadoop.privacera.jwt.0.token.publickey.provider.auth.username <username>
      spark.hadoop.privacera.jwt.0.token.publickey.provider.auth.password <password>
      spark.hadoop.privacera.jwt.0.token.publickey.provider.response.key x5c
      spark.hadoop.privacera.jwt.0.token.publickey.provider.key.id kid
      
    • Save the changes and click on Start or, if the cluster is running, click on Confirm and Restart.
  2. Configure multiple dynamic public keys

    • Add the following properties to the Spark configuration of the Databricks cluster, along with the common properties:
      Bash
      spark.hadoop.privacera.jwt.0.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.0.token.userKey client_id
      spark.hadoop.privacera.jwt.0.token.groupKey scope
      spark.hadoop.privacera.jwt.0.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.0.token.publickey.provider.url https://<JWKS-provider>/get_public_key?kid=
      spark.hadoop.privacera.jwt.0.token.publickey.provider.auth.type basic
      spark.hadoop.privacera.jwt.0.token.publickey.provider.auth.username <username>
      spark.hadoop.privacera.jwt.0.token.publickey.provider.auth.password <password>
      spark.hadoop.privacera.jwt.0.token.publickey.provider.response.key x5c
      spark.hadoop.privacera.jwt.0.token.publickey.provider.key.id kid
      
      spark.hadoop.privacera.jwt.1.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.1.token.userKey client_id
      spark.hadoop.privacera.jwt.1.token.groupKey scope
      spark.hadoop.privacera.jwt.1.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.1.token.publickey.provider.url https://<JWKS-provider>/get_public_key?kid=
      spark.hadoop.privacera.jwt.1.token.publickey.provider.auth.type basic
      spark.hadoop.privacera.jwt.1.token.publickey.provider.auth.username <username>
      spark.hadoop.privacera.jwt.1.token.publickey.provider.auth.password <password>
      spark.hadoop.privacera.jwt.1.token.publickey.provider.response.key x5c
      spark.hadoop.privacera.jwt.1.token.publickey.provider.key.id kid
      
    • Save the changes and click on Start or, if the cluster is running, click on Confirm and Restart.

Static and Dynamic public keys JWT:

  1. Configure static and dynamic public keys
    • Add the following properties to the Spark configuration of the Databricks cluster, along with the common properties:
      Bash
      spark.hadoop.privacera.jwt.0.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.0.token.userKey client_id
      spark.hadoop.privacera.jwt.0.token.groupKey scope
      spark.hadoop.privacera.jwt.0.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.0.token.publickey /tmp/jwttoken0.pub
      
      spark.hadoop.privacera.jwt.1.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.1.token.userKey client_id
      spark.hadoop.privacera.jwt.1.token.groupKey scope
      spark.hadoop.privacera.jwt.1.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.1.token.publickey.provider.url https://<JWKS-provider>/get_public_key?kid=
      spark.hadoop.privacera.jwt.1.token.publickey.provider.auth.type basic
      spark.hadoop.privacera.jwt.1.token.publickey.provider.auth.username <username>
      spark.hadoop.privacera.jwt.1.token.publickey.provider.auth.password <password>
      spark.hadoop.privacera.jwt.1.token.publickey.provider.response.key x5c
      spark.hadoop.privacera.jwt.1.token.publickey.provider.key.id kid
      
      spark.hadoop.privacera.jwt.2.token.parserType PING_IDENTITY
      spark.hadoop.privacera.jwt.2.token.userKey client_id
      spark.hadoop.privacera.jwt.2.token.groupKey scope
      spark.hadoop.privacera.jwt.2.token.issuer https://example.com/issuer
      spark.hadoop.privacera.jwt.2.token.publickey /tmp/jwttoken1.pub
      
    • Save the changes and click on Start or, if the cluster is running, click on Confirm and Restart.

Validation

  1. Prerequisites:
    • A running Databricks cluster secured with the above steps.
  2. Steps to Validate:
    • Login to Databricks.
    • Create or open an existing notebook. Associate the Notebook with the running Databricks cluster.
    • To use JWT in the Privacera Databricks integration, you need to copy the JWT token string to a file on the cluster's local file system. To do this, use the following commands and replace <jwt_token> with your actual JWT value.
      Python
      jwt_file_path = "/tmp/jwttoken.dat"
      token = "<jwt_token>"

      # Write the JWT to the local path configured via spark.hadoop.privacera.jwt.token
      with open(jwt_file_path, "w") as f:
          f.write(token)

      # Check the file content
      with open(jwt_file_path) as f:
          print(f.read())
      
    • Use the following PySpark commands to verify S3 CSV file read access.
      Python
      # Define the S3 path to your file
      s3_path = "s3a://your-bucket-name/path/to/your/file"
      
      # Read the CSV file from the specified S3 path
      df = spark.read.format("csv").option("header", "true").load(s3_path)
      
      # Display the first 5 rows of the dataframe
      df.show(5)
      
    • On the Privacera portal, go to Access Management -> Audits.
    • Check for the user that you specified in the payload when creating the JWT token, e.g., jwt_user.
    • Check for the success or failure of the resource policy. A successful access is indicated as Allowed and a failure is indicated as Denied.

Use Custom Service repo

Creating a Service repo

These custom services must be added outside a security zone; they will not work inside a security zone.

Let’s assume you want to create a new service repo with the prefix “dev”. Perform the following steps to create a custom s3 Ranger policy repo. Follow the same steps to add other custom services for Hive, Files, ADLS, etc.

  1. Login to Privacera portal.
  2. Go to Access Management -> Resource Policies.
  3. Under s3, click the more icon.
  4. Select Add Service.
  5. Under Add Service, provide values for the following fields:
    • Service Name: Provide a name for the service, for example, 'dev_s3'.
    • Click the toggle to turn on the Active Status.
    • Under Select Tag Service, select 'privacera_tag' from the drop-down list.
    • Provide username as 's3'.
    • Provide Common Name for Certificate as 'Ranger'.
  6. Click SAVE.

Updating Custom Repo Name in Databricks

There are two ways to include the custom repository name. You can choose either of the following methods:

  1. Manually update the ranger_enable.sh (init script):

    • Open the ranger_enable.sh script.
    • Update the following property with the prefix you used when creating the new service repo (e.g., dev). By default, it is privacera:
      Bash
      export SERVICE_NAME_PREFIX=dev
      
    • Save the file and use it in the Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.
  2. Update the vars.databricks.plugin.yml file:

    • SSH to the instance where Privacera Manager is installed.
    • Run the following command to navigate to the /custom-vars directory.
      Bash
      cd ~/privacera/privacera-manager/config/custom-vars
      
    • Open the vars.databricks.plugin.yml file.
      Bash
      vi vars.databricks.plugin.yml
      
    • Uncomment the DATABRICKS_SERVICE_NAME_PREFIX property and update it with your custom service name prefix.
      Bash
      DATABRICKS_SERVICE_NAME_PREFIX: "dev"
      
    • Once the property is configured, run the following commands to generate and upload the configuration
      Bash
      cd ~/privacera/privacera-manager
      
      ./privacera-manager.sh post-install
      
    • Use the updated ranger_enable.sh script in the Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.

There are three ways to include the custom repository name. You can choose any one of the following methods:

  1. Update the privacera_databricks.sh (init script):

    • Open the privacera_databricks.sh script.
    • Add the following line after API_SERVER_URL="https://xxxxxxxx/api" to include the custom repository name:
      Bash
      export SERVICE_NAME_PREFIX=dev
      
    • Save the file and use it in Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.
  2. Set an Environment Variable at the Databricks Cluster Level:

    • Log in to the Databricks workspace.
    • Navigate to the cluster configuration.
    • Click on Edit -> Advanced options.
    • Click on the Spark tab and add the following property in Environment variables:
      Bash
      SERVICE_NAME_PREFIX=dev
      
    • Save and click on Start or, if the cluster is running, click on Restart.
  3. Set an Environment Variable in the Databricks Cluster Policy:

    • Create or update an existing Databricks cluster policy using the following json block:
      JSON
      "spark_env_vars.SERVICE_NAME_PREFIX": {
      "type": "fixed",
      "value": "dev"
      }
      
    • Create or update a cluster with the above policy to set the environment variable on the cluster.
    • Set the environment variable as done in step 2.
    • Save and click on Start or, if the cluster is running, click on Confirm and Restart.

!!! note
    When the custom service repo is not defined using any of these methods, the plugin will by default use the service repos starting with “privacera”.

Validation/Verification

To confirm the successful association of the custom S3 service repo, perform the following steps. The steps are similar for other services like Hive, Files, Adls, etc.:

  1. Prerequisites:
    • A running Databricks cluster secured using the above steps.
  2. Steps to Validate:
    • Login to Databricks.
    • Create or open an existing notebook. Associate the Notebook with the running Databricks cluster.
    • Use the following PySpark commands to verify read access to an S3 CSV file.
      Python
      # Define the S3 path to your file
      s3_path = "s3a://your-bucket-name/path/to/your/file"
      
      # Read the CSV file from the specified S3 path
      df = spark.read.format("csv").option("header", "true").load(s3_path)
      
      # Display the first 5 rows of the dataframe
      df.show(5)
      
    • On the Privacera portal, go to Access Management -> Audits.
    • Check for the Service Name that you specified when creating the service repo, e.g., dev_s3.
    • Check for the success or failure of the resource policy. A successful access is indicated as Allowed and a failure is indicated as Denied.

Fallback to Default Service-Def

After using the custom service, you might need to revert to the default service definition. Follow these steps:

  1. Manually update the ranger_enable.sh (init script):

    • Open the ranger_enable.sh script.
    • Update the property with the default prefix:
      Bash
      export SERVICE_NAME_PREFIX=privacera
      
    • Save the file and use it in Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.
  2. Update the vars.databricks.plugin.yml file:

    • SSH to the instance where Privacera Manager is installed.
    • Run the following command to navigate to the /custom-vars directory:
      Bash
      cd ~/privacera/privacera-manager/config/custom-vars
      
    • Open the vars.databricks.plugin.yml file:
      Bash
      vi vars.databricks.plugin.yml
      
    • Comment out the DATABRICKS_SERVICE_NAME_PREFIX property:
      Bash
      # DATABRICKS_SERVICE_NAME_PREFIX: "dev"
      
    • Once the property is configured, run the following commands to generate and upload the configuration:
      Bash
      cd ~/privacera/privacera-manager
      
      ./privacera-manager.sh post-install
      
    • Use the updated ranger_enable.sh script in Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.
  1. Update the privacera_databricks.sh (init script):

    • Open the privacera_databricks.sh script.
    • Remove the following property:
      Bash
      export SERVICE_NAME_PREFIX=dev
      
    • Save the file and use it in Databricks cluster creation.
    • Click on Start or, if the cluster is running, click on Confirm and Restart.
  2. Remove an Environment Variable at the Databricks Cluster Level:

    • Login to Databricks workspace.
    • Navigate to the cluster configuration.
    • Click on Edit -> Advanced options.
    • Click on the Spark tab and remove the following property from Environment variables:
      Bash
      SERVICE_NAME_PREFIX=dev
      
    • Save and click on Start or, if the cluster is running, click on Restart.
  3. Remove the Environment Variable from the Databricks Cluster Policy:

    • Update an existing Databricks cluster policy by removing the below JSON block:
      JSON
      "spark_env_vars.SERVICE_NAME_PREFIX": {
      "type": "fixed",
      "value": "dev"
      }
      
    • Create or update a cluster with the above policy.
    • Remove the environment variable at the Databricks cluster level as done in step 2.
    • Save and click on Start or, if the cluster is running, click on Confirm and Restart.

Use Service Principal id for Authorization

By default, Privacera uses the display name for a Service Principal. If you want to use the Service Principal ID instead, perform the following steps:

  1. Login to Databricks workspace.
  2. In the left-hand sidebar, click on Compute.
  3. Choose the cluster where you want to configure the Service Principal Id.
  4. Click on Edit -> Advanced options.
  5. Click on the Spark tab.
  6. Add the following property to the Spark config:
    Bash
    spark.hadoop.privacera.fgac.use.displayname false
    
  7. Click on Confirm.
  8. Click on Start, or if the cluster is running, click on Restart.

Whitelist py4j Security Manager via S3 or DBFS

To uphold security measures, certain Python methods are blacklisted by Databricks. However, Privacera employs some of these methods. If you wish to access these classes or methods, you may add them to a whitelist file.

  1. Create the whitelist.txt File:

    • This file should contain a list of packages, class constructors, or methods that you intend to whitelist.

    • Example:

      Python
      # Whitelist an entire package (including all its classes) 
      org.apache.spark.api.python.*
      
      # Whitelist specific constructors
      org.apache.spark.api.python.PythonRDD
      
      # Whitelist specific methods
      org.apache.spark.api.python.PythonRDD.runJobToPythonFile
      org.apache.spark.api.python.SerDeUtil.pythonToJava
      

  2. Upload the whitelist.txt File:

    • To DBFS, run the following command:

      Text Only
      dbfs cp whitelist.txt dbfs:/privacera/whitelist.txt
      

    • To S3, use the S3 console to upload the file to the desired location.

  3. Update Databricks Spark Configuration:

    • In Databricks, navigate to the Spark Configuration and specify the location of the whitelisting file:

    • For DBFS:

      Text Only
      spark.hadoop.privacera.whitelist dbfs:/privacera/whitelist.txt
      

    • For S3:

      Text Only
      spark.hadoop.privacera.whitelist s3://your-bucket/whitelist.txt
      

  4. Restart Your Cluster:

    • After making these changes, please restart your Databricks cluster for the new whitelist to take effect.

Whitelisting alters Databricks' default security. Ensure this is aligned with your security policies.
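
As a quick notebook sanity check after the restart, you can exercise a code path that goes through the org.apache.spark.api.python classes listed in the example whitelist. This is a minimal sketch; whether a particular call is blocked without whitelisting depends on your Databricks runtime and cluster access mode:

Python
# Runs in a Databricks notebook where `spark` is already defined.
rdd = spark.sparkContext.parallelize([("alice", 1), ("bob", 2)])

# Converting an RDD to a DataFrame exercises PythonRDD/SerDeUtil on the JVM side.
df = rdd.toDF(["name", "value"])
df.show()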

Managing Init Scripts manually

If the flag DATABRICKS_INIT_SCRIPT_WORKSPACE_FLAG_ENABLE is set to false, you need to manually upload the init script to your Databricks Workspace. A scripted alternative is sketched after the steps below.

  1. Copy the initialization script from the Privacera SM host location ~/privacera/privacera-manager/output/databricks/ranger_enable.sh to your local machine.
  2. Log in to your Databricks account and click on the "Workspace" tab located on the left-hand side of the screen.
  3. Click on the "Workspace" folder. Here, create a new folder named "privacera."
  4. Within the "privacera" folder, create another folder named DEPLOYMENT_ENV_NAME, replacing DEPLOYMENT_ENV_NAME with the corresponding value specified in vars.privacera.yml.
  5. Enter the DEPLOYMENT_ENV_NAME folder. Right-click on the screen and select "Import." Choose to upload the initialization script ranger_enable.sh.
  6. Once the script is uploaded, it will appear in the DEPLOYMENT_ENV_NAME folder under the "privacera" folder.
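
Alternatively, the upload can be scripted. The following is a minimal sketch using the Databricks SDK for Python, which is assumed to be installed and already authenticated against your workspace; the local script path and the DEPLOYMENT_ENV_NAME value are placeholders you must adjust:

Python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ImportFormat

# Credentials are picked up from the environment (e.g. DATABRICKS_HOST / DATABRICKS_TOKEN).
w = WorkspaceClient()

deployment_env_name = "<DEPLOYMENT_ENV_NAME>"  # value from vars.privacera.yml
target_dir = f"/privacera/{deployment_env_name}"

# Create the folder structure and upload the init script copied from the Privacera Manager host.
w.workspace.mkdirs(target_dir)
with open("ranger_enable.sh", "rb") as f:
    w.workspace.upload(f"{target_dir}/ranger_enable.sh", f, format=ImportFormat.AUTO, overwrite=True)

print("Uploaded to", f"{target_dir}/ranger_enable.sh")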

Setting Up Multiple Databricks Workspaces

To set up multiple Databricks Workspaces, perform the following steps:

  1. SSH to the instance where Privacera Manager is installed.
  2. Open the vars.databricks.plugin.yml file:
    Bash
    cd ~/privacera/privacera-manager/config/custom-vars
    vi vars.databricks.plugin.yml
    
  3. Add or update the following properties in the file, ensuring that the databricks_host_url and token values are updated accordingly for each workspace:
    YAML
    #Update databricks url and token
    DATABRICKS_HOST_URL: "https://<workspace>.cloud.databricks.com"
    DATABRICKS_TOKEN: "<workspace_token>"
    
    #Add your new workspace with example below
    #databricks_host_url, token will be set by the above parameters DATABRICKS_HOST_URL, DATABRICKS_TOKEN.
    DATABRICKS_WORKSPACES_LIST:
      - alias: "DEFAULT"
        databricks_host_url: "{{DATABRICKS_HOST_URL}}"
        token: "{{DATABRICKS_TOKEN}}"
    
      - alias: "WORKSPACE1"
        databricks_host_url: "https://<workspace1>.cloud.databricks.com"
        token: "<workspace1_token>"
    
      - alias: "WORKSPACE2"
        databricks_host_url: "https://<workspace2>.cloud.databricks.com"
        token: "<workspace2_token>"
    
  4. Once the properties are configured, run the following commands to generate and upload the configuration:
    Bash
    cd ~/privacera/privacera-manager
    
    ./privacera-manager.sh post-install
    
  5. Use the updated ranger_enable.sh script in Databricks cluster creation.
  6. Click on Start or, if the cluster is running, click on Confirm and Restart.

| Feature | Description | Default Value | Possible Values |
|---|---|---|---|
| spark.hadoop.privacera.custom.current_user.udf.names | Maps the logged-in user to the Ranger user for row-filter policies. | current_user() | Any valid function name; it must be in sync with the row-filter current_user condition. |
| spark.hadoop.privacera.spark.view.levelmaskingrowfilter.extension.enable | Enables View Level Access Control (using the Data_admin feature), View Level Column Masking, and View Level Row Filtering. | false | true/false |
| spark.hadoop.privacera.spark.rowfilter.extension.enable | Enables/disables Row Filtering on tables. | true | true/false |
| spark.hadoop.privacera.spark.masking.extension.enable | Enables/disables Column Masking on tables. | true | true/false |
| privacera.fgac.file.ignore.path | Comma-separated list of paths that are ignored during access checks, only for the file:/ protocol. | /tmp/tmp/* | /tmp/tmp*/tmp, /tmp/data1 |
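
On Databricks, these properties belong in the cluster's Spark configuration as shown in the earlier sections. The snippet below simply illustrates the property names being set on a SparkSession builder, for example when experimenting locally; the values are purely illustrative:

Python
from pyspark.sql import SparkSession

# Illustrative values only; see the table above for defaults and possible values.
spark = (
    SparkSession.builder
    .appName("privacera-fgac-config-example")
    .config("spark.hadoop.privacera.spark.rowfilter.extension.enable", "true")
    .config("spark.hadoop.privacera.spark.masking.extension.enable", "true")
    .config("spark.hadoop.privacera.spark.view.levelmaskingrowfilter.extension.enable", "false")
    .getOrCreate()
)

print(spark.conf.get("spark.hadoop.privacera.spark.rowfilter.extension.enable"))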
