PrivaceraCloud Documentation

About applications

This section describes how to connect, edit, and delete applications.

Terminology

A datasource is a collection of data stored in a third-party application such as Microsoft SQL, AWS S3, or Databricks. PrivaceraCloud integrates with your datasource to control access or scan for sensitive data.

Datasources are organized into applications.

An application is a configuration for a data resource or authentication resource to be linked to your PrivaceraCloud account.

For each application, you provide the target and the type-specific properties of the target resource, such as its location address (URL) and the authentication credentials for that resource.

For some applications, you can add custom properties using a key/value pair syntax.

The properties can be exported to a JSON properties file. This file can then be reimported at a later time or can be used as a template for other applications.

An authentication resource can be a connection to a directory service for data access users or for portal users.

Note

You can only use one dataserver setup per account for Privacera Access Management.

Connect an application
  1. Go to Settings > Applications.

  2. In the Applications section, select the application you wish to connect.

  3. Enter the application Name and Description.

  4. Click SAVE to save the changes or CANCEL to discard them.

View connection status

To view the status of the connection to an application:

  1. Go to Settings > Applications.

  2. Click the name of the previously connected application.

  3. Look under the Access Management column:

    • Red: There is a problem with the connection.

    • Green: The connection was established successfully.

Edit application name and description
  1. Go to Settings > Applications.

  2. Select the application you wish to edit.

  3. Under the Action column, click the pen icon.

  4. Change the Name or Description.

  5. Click SAVE to save the changes or CANCEL to discard them.

Delete application
  1. Go to Settings > Applications.

  2. Select the application you wish to delete.

  3. Under the Action column, click the trash can icon.

  4. Carefully read the warning in the popup.

  5. To verify that you want to delete the application, type delete in the text box.

  6. Click DELETE to delete the application or CANCEL to discard the changes.

Azure Data Lake Storage Gen 2 (ADLS)

This topic describes how to connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud.

Prerequisites

Before connecting the Azure Data Lake Storage Gen 2 (ADLS) application, make sure you have the following information available:

  • Azure Data Lake Storage Gen 2 (ADLS) Storage Account ID

  • Azure Data Lake Storage Gen 2 (ADLS) Account Storage Key

Note

You can only use one Azure Data Lake Storage Gen 2 (ADLS) setup per PrivaceraCloud account for Privacera Access Management.
Connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud

To connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud:

  1. Go to Settings > Applications.

  2. In the Applications screen, select Azure Data Lake Storage Gen 2 (ADLS).

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle to enable Access Management for Azure Data Lake Storage Gen 2 (ADLS).

  5. On the BASIC tab, enter the values in the following fields:

    • Azure Data Lake Storage Gen 2 (ADLS) Storage Account

    • Azure Data Lake Storage Gen 2 (ADLS) Storage Key

  6. In the ADVANCED tab, you can add custom properties.

  7. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  8. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

After the service is established, you can configure your local Azure CLI to redirect requests to the PrivaceraCloud Azure ADLS Data Server proxy. For more information, see Scripts for AWS CLI or Azure CLI for managing connected applications.

Athena

This topic describes how to connect Athena to PrivaceraCloud.

Prerequisites in AWS console

Before connecting Athena to PrivaceraCloud for Privacera Access Management, make sure that only one Privacera dataserver is set up for your account.

In your AWS console:

  1. Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy.

  2. Configure a trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM Role.

  3. Save the ARN, which you need to set in PrivaceraCloud in the following steps.

To verify the connection to Athena, Privacera recommends that you install the AWS CLI. Install and configure the AWS CLI on your system so that it uses the PrivaceraCloud S3 Data Server proxy.

Connect Athena with IAM role and trust relationship
  1. Go to Settings > Applications.

  2. Select Athena.

  3. Enter the application Name and Description.

  4. Click Save.

  5. Click the toggle to enable Access Management for the application.

    On the BASIC tab, enter values in the following fields.

    • With Use IAM Role disabled:

      1. AWS Access Key: AWS data repository host account Access Key

      2. AWS Account Secret Key: AWS data repository host account Secret Key

      3. AWS_ATHENA_RESULT_STORAGE_URL: Query results storage bucket URL

      4. Click Save.

    • With Use IAM Role enabled, enter values for the following fields:

      1. AWS IAM Role

      2. AWS IAM Role External Id

      3. AWS_ATHENA_RESULT_STORAGE_URL: Query results storage bucket URL

      4. Click Save.

  6. In the ADVANCED tab, you can add custom properties.

  7. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  8. Recommended: Validate connectivity by running the AWS CLI for Athena queries such as the following:

    aws athena start-query-execution --query-string "SHOW DATABASES"
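
The same check can be scripted. The following is a minimal Python sketch, assuming Boto3 is installed and credentials are already configured. The helper names, the result bucket placeholder, and the optional endpoint_url argument (for pointing the client at a proxy) are illustrative assumptions, not part of PrivaceraCloud.

```python
def build_athena_query_request(query, result_storage_url):
    """Build the keyword arguments for Athena's start_query_execution call."""
    return {
        "QueryString": query,
        "ResultConfiguration": {"OutputLocation": result_storage_url},
    }


def run_athena_query(query, result_storage_url, endpoint_url=None, region_name="us-east-1"):
    """Submit the query and return its execution ID (needs network access)."""
    import boto3  # imported here so the request builder works without Boto3

    client = boto3.client("athena", endpoint_url=endpoint_url, region_name=region_name)
    response = client.start_query_execution(
        **build_athena_query_request(query, result_storage_url)
    )
    return response["QueryExecutionId"]


# Placeholder bucket: replace with your query results storage bucket URL.
request = build_athena_query_request("SHOW DATABASES", "s3://<results-bucket>/athena/")
print(request)
```

Calling run_athena_query with the same arguments submits the query. A corresponding event should then appear in the PrivaceraCloud audits if the request was routed through Privacera.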
    
Privacera Discovery with Cassandra

This topic describes how to connect Cassandra to the PrivaceraCloud Discovery service.

Prerequisites

Before connecting the Cassandra application, make sure you have the following information available:

  • JDBC URL

  • JDBC Username 

  • JDBC Password 

Connect application
  1. Go to Settings > Applications.

  2. Select Cassandra.

  3. Enter the application Name and Description.

  4. Click Save.

  5. Click the toggle button to enable Data Discovery for Cassandra.

  6. In the BASIC tab, enter the values in the following fields:

    • JDBC URL

    • JDBC Username 

    • JDBC Password 

  7. On the ADVANCED tab, you can add custom properties.

  8. Click IMPORT PROPERTIES to browse and import application properties.

  9. Click TEST CONNECTION to check if the connection is successful.

  10. Click Save.

Define scan targets

To define Privacera Discovery scan targets for this application, see Privacera Discovery scan targets.

Databricks

This topic describes how to connect the Databricks application to PrivaceraCloud on the AWS and Azure platforms. Privacera provides two plug-in solutions for access control in Databricks clusters: the Spark Fine-Grained Access Control plug-in [FGAC] and the Spark Object-Level Access Control plug-in [OLAC]. The two plug-ins are mutually exclusive and cannot be enabled on the same cluster.

  1. Go to Settings > Applications.

  2. In the Applications screen, select Databricks.

  3. Select the platform type (AWS or Azure) on which you want to configure the Databricks application.

  4. Enter the application Name and Description, and then click Save.

  5. Click the toggle button to enable Access Management for Databricks.

Databricks Spark Fine-Grained Access Control plug-in [FGAC]

PrivaceraCloud integrates with Databricks SQL using the plug-in integration method with an account-specific, cluster-scoped initialization script. The script installs Privacera's Spark plug-in on the Databricks cluster, enabling Fine-Grained Access Control. You add the script to your cluster as an init script that runs at cluster startup. When the cluster restarts, it runs the init script and connects to PrivaceraCloud.

Note

Accounts upgrading from PrivaceraCloud 2.0 to PrivaceraCloud 2.1 and intending to use Privacera Encryption with Databricks must re-install the init script to Databricks.

Prerequisites

Ensure that the following prerequisites are met:

  • You must have an existing Databricks account and login credentials with sufficient privileges to manage your Databricks cluster.

  • PrivaceraCloud portal admin user access.

This setup is recommended for SQL, Python, and R language notebooks.

  • It provides FGAC on databases with row filtering and column masking features.

  • It uses privacera_hive, privacera_s3, privacera_adls, privacera_files services for resource-based access control, and privacera_tag service for tag-based access control.

  • It uses the plugin implementation from Privacera.

Steps
  1. Log in to the PrivaceraCloud portal as an admin user (role ROLE_ACCOUNT_ADMIN).

  2. Generate the new API key and init script. For more information, see API Key.

  3. On the Databricks Init Script section, click DOWNLOAD SCRIPT.

    By default, this script is named privacera_databricks.sh. Save it to a local filesystem or shared storage.

  4. Log in to your Databricks account using credentials with sufficient account management privileges.

  5. Copy the Init script to your Databricks cluster. This can be done via the UI or using the Databricks CLI.

    1. Using the Databricks UI:

      1. On the left navigation, click the Data icon.

      2. Click the Add Data button from the upper right corner.

      3. In the Create New Table dialog, select Upload File, and then click browse.

      4. Select privacera_databricks.sh, and then click Open to upload it.

        Once the file is uploaded, the dialog displays the uploaded file path. You need this file path in a later step.

        The file will be uploaded to /FileStore/tables/privacera_databricks.sh path, or similar.

    2. Using the Databricks CLI, copy the script to a location in DBFS:

      databricks fs cp ~/<sourcepath_privacera_databricks.sh> dbfs:/<destination_path>
                  

      For example:

      databricks fs cp ~/Downloads/privacera_databricks.sh dbfs:/FileStore/tables/
                  
  6. You can add PrivaceraCloud to an existing cluster, or create a new cluster and attach PrivaceraCloud to that cluster.

    a. In the Databricks navigation panel select Clusters.

    b. Choose a cluster name from the list provided and click Edit to open the configuration dialog page.

    c. Open Advanced Options and select the Init Scripts tab.

    d. Enter the DBFS init script path name you copied earlier.

    e. Click Add.

    f. From Advanced Options, select the Spark tab. Add the following Spark configuration content to the Spark Config edit window. For more information on the properties, see Spark Configuration Table Properties.

    New Properties:

    spark.databricks.isv.product privacera
    spark.databricks.cluster.profile serverless
    spark.databricks.delta.formatCheck.enabled false
    spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
    spark.databricks.repl.allowedLanguages sql,python,r

    Old Properties:

    spark.databricks.isv.product privacera
    spark.databricks.cluster.profile serverless
    spark.databricks.delta.formatCheck.enabled false
    spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
    spark.databricks.repl.allowedLanguages sql,python,r
            

    Note

    • From PrivaceraCloud release 4.1.0.1 onward, Privacera recommends replacing the Old Properties with the New Properties. The Old Properties continue to work.

    • For Databricks versions 8.2 and earlier, use only the Old Properties, because those versions are in extended support.

    • If you are upgrading the Databricks Runtime from an existing version (6.4-8.2) to version 8.3 or higher, contact your Privacera technical sales representative for assistance.

  7. Restart the Databricks cluster.
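
As a rough illustration of the Spark Config format used in step 6, each property occupies one line, with the key and the value separated by a space. The Python sketch below renders the New Properties that way and groups them with the init script path in a plain dictionary. The helper names are hypothetical, and the init_scripts and spark_conf field names are assumed to match the Databricks Clusters API, so verify them against your workspace before relying on them.

```python
# New Properties for FGAC, expressed as key/value pairs.
FGAC_SPARK_CONF = {
    "spark.databricks.isv.product": "privacera",
    "spark.databricks.cluster.profile": "serverless",
    "spark.databricks.delta.formatCheck.enabled": "false",
    "spark.driver.extraJavaOptions": "-javaagent:/databricks/jars/privacera-agent.jar",
    "spark.databricks.repl.allowedLanguages": "sql,python,r",
}


def render_spark_config(conf):
    """Render one 'key value' line per property for the Spark Config edit window."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


def cluster_settings(init_script_path, conf):
    """Group the init script path and Spark properties in one dictionary."""
    return {
        "init_scripts": [{"dbfs": {"destination": init_script_path}}],
        "spark_conf": dict(conf),
    }


print(render_spark_config(FGAC_SPARK_CONF))
settings = cluster_settings("dbfs:/FileStore/tables/privacera_databricks.sh", FGAC_SPARK_CONF)
```

Keeping the properties in a dictionary makes it harder to accidentally fuse a key with its value when copying them between clusters.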

Notice

To enable View Level Access Control, View Level Column Masking, and View Level Row Filtering, refer to ??? By default these features are disabled.

Validate installation

Confirm connectivity by executing a simple data access sequence and then examining the PrivaceraCloud audit stream.

You will see corresponding events in the Access Manager > Audits.

Example data access sequence:

  1. Create or open an existing Notebook. Associate the Notebook with the Databricks cluster you secured in the steps above.

  2. Run an SQL show tables command in the Notebook:

    sql
    
    show tables ;
  3. On PrivaceraCloud, go to Access Manager > Audits to view the monitored data access.

  4. Create a Deny policy, run this same SQL access sequence a second time, and confirm corresponding Denied events.

Databricks Spark Object-Level Access Control plug-in [OLAC]

This section outlines the steps needed to set up Object-Level Access Control (OLAC) in Databricks clusters. This setup is recommended for Scala language notebooks.

  • It provides OLAC on S3 locations accessed via Spark.

  • It uses privacera_s3 service for resource-based access control and privacera_tag service for tag-based access control.

  • It uses the signed-authorization implementation from Privacera.

    Note

    • If you are using SQL, Python, or R language notebooks, Privacera recommends FGAC. See the Databricks Spark Fine-Grained Access Control plug-in [FGAC] section above.

    • The OLAC and FGAC methods are mutually exclusive and cannot be enabled on the same cluster.

    • The OLAC plug-in was introduced as an alternative solution for Scala language clusters, because using the Scala language on Databricks Spark raises some security concerns.

Prerequisites

Ensure that the following prerequisites are met:

  • You must have an existing Databricks account and login credentials with sufficient privileges to manage your Databricks cluster.

  • PrivaceraCloud portal admin user access.

Steps

Note

For working with Delta format files, configure the AWS S3 application using IAM role permissions.

  1. Create a new AWS S3 Databricks connection. For more information, see Create S3 application.

    After creating the S3 application:

    1. In the BASIC tab, provide Access Key, Secret Key, or an IAM Role. For more information, see Create S3 application.

    2. In the ADVANCED tab, add the following property:

      dataserver.databricks.allowed.urls=<DATABRICKS_URL_LIST>

      where <DATABRICKS_URL_LIST>: Comma-separated list of the target Databricks cluster URLs.

      For example:

      dataserver.databricks.allowed.urls=https://dbc-yyyyyyyy-xxxx.cloud.databricks.com/

    3. Click Save.

  2. If you are updating an S3 application:

    1. Go to Settings > Applications > S3, and click the pen icon to edit properties.

    2. Click the toggle button of a service you wish to enable.

    3. In the ADVANCED tab, add the following property:

      dataserver.databricks.allowed.urls=<DATABRICKS_URL_LIST>

      where <DATABRICKS_URL_LIST>: Comma-separated list of the target Databricks cluster URLs. For example,

      dataserver.databricks.allowed.urls=https://dbc-yyyyyyyy-xxxx.cloud.databricks.com/

    4. Save your configuration.

  3. Download the Databricks init script.

    1. Log in to the PrivaceraCloud portal.

    2. Generate the new API key and init script. For more information, refer to the topic API Key.

    3. On the Databricks Init Script section, click the DOWNLOAD SCRIPT button.

      By default, this script is named privacera_databricks.sh. Save it to a local filesystem or shared storage.

  4. Upload the Databricks init script to your Databricks clusters.

    1. Log in to your Databricks cluster using administrator privileges.

    2. On the left navigation, click the Data icon.

    3. Click Add Data from the upper right corner.

    4. From the Create New Table dialog box select Upload File, then select and open privacera_databricks.sh.

    5. Copy the full storage path onto your clipboard.

  5. Add the Databricks init script to your target Databricks clusters:

    1. In the Databricks navigation panel select Clusters.

    2. Choose a cluster name from the list provided and click Edit to open the configuration dialog page.

    3. Open Advanced Options and select the Init Scripts tab.

    4. Enter the DBFS init script path name you copied earlier.

    5. Click Add.

    6. From Advanced Options, select the Spark tab. Add the following Spark configuration content to the Spark Config edit window. For more information on the properties, see Spark Configuration Table Properties.

      New Properties

      spark.databricks.isv.product privacera
      spark.databricks.repl.allowedLanguages sql,python,r,scala
      spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
      spark.executor.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
      spark.databricks.delta.formatCheck.enabled false

      Add the following property in the Environment Variables text box:

      PRIVACERA_PLUGIN_TYPE=OLAC

      Old Properties

      spark.databricks.isv.product privacera
      spark.databricks.repl.allowedLanguages sql,python,r,scala
      spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
      spark.hadoop.fs.s3.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
      spark.hadoop.fs.s3n.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
      spark.hadoop.fs.s3a.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
      spark.executor.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
      spark.hadoop.signed.url.enable true
    7. Save and close.

    8. Restart the Databricks cluster.

    Note

    • From PrivaceraCloud release 4.1.0.1 onward, Privacera recommends replacing the Old Properties with the New Properties. The Old Properties continue to work.

    • For Databricks versions 8.2 and earlier, use only the Old Properties, because those versions are in extended support.

    • If you are upgrading the Databricks Runtime from an existing version (6.4-8.2) to version 8.3 or higher, contact your Privacera technical sales representative for assistance.

Your S3 Databricks cluster data resource is now available for Access Manager Policy Management, under Access Manager > Resource Policies, Service "privacera_s3".
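
The dataserver.databricks.allowed.urls property used in the steps above takes a comma-separated list of cluster URLs. The following Python sketch shows one way to assemble that property line from a list. The function name is hypothetical, the second URL is a made-up placeholder, and rejecting non-https URLs is an assumption based on the example in this topic rather than a documented requirement.

```python
from urllib.parse import urlparse


def build_allowed_urls_property(urls):
    """Join Databricks cluster URLs into the allowed-URLs property line."""
    cleaned = []
    for url in urls:
        url = url.strip()
        parsed = urlparse(url)
        # Assumption: only well-formed https URLs are accepted.
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError(f"Not an https URL: {url!r}")
        cleaned.append(url)
    return "dataserver.databricks.allowed.urls=" + ",".join(cleaned)


property_line = build_allowed_urls_property([
    "https://dbc-yyyyyyyy-xxxx.cloud.databricks.com/",
    "https://dbc-zzzzzzzz-xxxx.cloud.databricks.com/",  # hypothetical second cluster
])
print(property_line)
```

Paste the printed line into the ADVANCED tab of the S3 application.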

Databricks cluster deployment matrix with Privacera plugin:

Job/Workflow use-case for automated cluster:

Run-Now creates a new cluster based on the definition in the job description.

Table 2.

| Job Type | Languages | FGAC/DBX version | OLAC/DBX version |
|---|---|---|---|
| Notebook | Python/R/SQL | Supported [7.3, 9.1, 10.4] | |
| JAR | Java/Scala | Not supported | Supported [7.3, 9.1, 10.4] |
| spark-submit | Java/Scala/Python | Not supported | Supported [7.3, 9.1, 10.4] |
| Python | Python | Supported [7.3, 9.1, 10.4] | |
| Python wheel | Python | Supported [9.1, 10.4] | |
| Delta Live Tables pipeline | | Not supported | Not supported |



Job on existing cluster:

Run-Now uses the existing cluster specified in the job description.

Table 3.

| Job Type | Languages | FGAC/DBX version | OLAC |
|---|---|---|---|
| Notebook | Python/R/SQL | Supported [7.3, 9.1, 10.4] | Not supported |
| JAR | Java/Scala | Not supported | Not supported |
| spark-submit | Java/Scala/Python | Not supported | Not supported |
| Python | Python | Not supported | Not supported |
| Python wheel | Python | Supported [9.1, 10.4] | Not supported |
| Delta Live Tables pipeline | | Not supported | Not supported |



Interactive use-case

The interactive use case is running a SQL or Python notebook on an interactive cluster.

Table 4.

| Cluster Type | Languages | FGAC | OLAC |
|---|---|---|---|
| Standard clusters | Scala/Python/R/SQL | Not supported | Supported [7.3, 9.1, 10.4] |
| High Concurrency clusters | Python/R/SQL | Supported [7.3, 9.1, 10.4] | Supported [7.3, 9.1, 10.4] |
| Single Node | Scala/Python/R/SQL | Not supported | Supported [7.3, 9.1, 10.4] |
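
For scripted checks, the interactive matrix in Table 4 can be expressed as a small lookup. This Python sketch is illustrative only: the names are hypothetical, and it assumes the bracketed numbers in the table are the supported Databricks Runtime versions.

```python
# Table 4 as data: cluster type -> plug-in -> supported runtime versions.
# An empty list stands for "Not supported".
INTERACTIVE_SUPPORT = {
    "Standard clusters": {"FGAC": [], "OLAC": ["7.3", "9.1", "10.4"]},
    "High Concurrency clusters": {
        "FGAC": ["7.3", "9.1", "10.4"],
        "OLAC": ["7.3", "9.1", "10.4"],
    },
    "Single Node": {"FGAC": [], "OLAC": ["7.3", "9.1", "10.4"]},
}


def is_supported(cluster_type, plugin, runtime_version):
    """Return True if the plug-in is listed for this cluster type and version."""
    return runtime_version in INTERACTIVE_SUPPORT[cluster_type][plugin]


print(is_supported("High Concurrency clusters", "FGAC", "9.1"))
print(is_supported("Standard clusters", "FGAC", "9.1"))
```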



Access AWS S3 using Boto3 from Databricks

This section describes how to use the AWS SDK (Boto3) for PrivaceraCloud to access AWS S3 file data through a Privacera DataServer proxy.

The following commands must be run in a notebook for Databricks:

  1. Install the AWS Boto3 libraries

    pip install boto3
  2. Import the required libraries

    import boto3
  3. Access the AWS S3 files

    def check_s3_file_exists(bucket, key, access_key, secret_key, endpoint_url, dataserver_cert, region_name):
        """Return True if the object can be read through the DataServer proxy."""
        # dataserver_cert is kept in the signature but is not used by this function.
        try:
            # endpoint_url points Boto3 at the Privacera DataServer proxy instead of AWS.
            s3 = boto3.resource(service_name='s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key, endpoint_url=endpoint_url, region_name=region_name)
            print(s3.Object(bucket_name=bucket, key=key).get()['Body'].read().decode('utf-8'))
            return True
        except Exception as e:
            print("Got error: {}".format(e))
            return False

    def read_s3_file(bucket, key, access_key, secret_key, endpoint_url, dataserver_cert, region_name):
        """Print the object's contents using the low-level client; return True on success."""
        # dataserver_cert is kept in the signature but is not used by this function.
        try:
            s3 = boto3.client(service_name='s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key, endpoint_url=endpoint_url, region_name=region_name)
            obj = s3.get_object(Bucket=bucket, Key=key)
            print(obj['Body'].read().decode('utf-8'))
            return True
        except Exception as e:
            print("Got error: {}".format(e))
            return False

    readFilePath = "file data/data/format=txt/sample/sample_small.txt"
    bucket = "infraqa-test"
    # saas
    access_key = "${privacera_access_key}"
    secret_key = "${privacera_secret_key}"
    endpoint_url = "https://ds.privaceracloud.com"
    dataserver_cert = ""
    region_name = "us-east-1"
    print(f"got file===== {readFilePath} ============= bucket= {bucket}")
    status = check_s3_file_exists(bucket, readFilePath, access_key, secret_key, endpoint_url, dataserver_cert, region_name)
    
    
Access Azure file using Azure SDK from Databricks

This section describes how to use the Azure SDK for PrivaceraCloud to access Azure DataStorage/Datalake file data through a Privacera DataServer proxy.

The following commands must be run in a notebook for Databricks:

  1. Install the Azure SDK libraries

    pip install azure-storage-file-datalake
  2. Import the required libraries

    # Only the service client is needed by the steps below.
    from azure.storage.filedatalake import DataLakeServiceClient
  3. Initialize the account storage through connection string method

    def initialize_storage_account_connect_str(my_connection_string):
        # The client is stored in a global so that later steps can reuse it.
        global service_client
        try:
            # Avoid printing the connection string: it contains the account key.
            service_client = DataLakeServiceClient.from_connection_string(conn_str=my_connection_string, headers={'x-ms-version': '2020-02-10'})
        except Exception as e:
            print(e)
  4. Prepare the connection string

    def prepare_connect_str():
        try:
            
            connect_str = "DefaultEndpointsProtocol=https;AccountName=${privacera_access_key}-{storage_account_name};AccountKey=${base64_encoded_value_of(privacera_access_key|privacera_secret_key)};BlobEndpoint=https://ds.privaceracloud.com;"
            
           # sample value is shown below
           #connect_str = "DefaultEndpointsProtocol=https;AccountName=MMTTU5Njg4Njk0MDAwA6amFpLnBhdGVsOjE6MTY1MTU5Njg4Njk0MDAw==-pqadatastorage;AccountKey=TVRVNUTU5Njg4Njk0MDAwTURBd01UQTZhbUZwTG5CaGRHVnNPakU2TVRZMU1URTJOVGcyTnpVMTU5Njg4Njk0MDAwVZwLzNFbXBCVEZOQWpkRUNxNmpYcjTU5Njg4Njk0MDAwR3Q4N29UNFFmZWpMOTlBN1M4RkIrSjdzSE5IMFZic0phUUcyVHTU5Njg4Njk0MDAwUxnPT0=;BlobEndpoint=https://ds.privaceracloud.com;"
    
            return connect_str
        except Exception as e:
          print(e)
  5. Define a sample access method to get Azure file and directories

    def list_directory_contents(connect_str):
        try:
            initialize_storage_account_connect_str(connect_str)
            
            file_system_client = service_client.get_file_system_client(file_system="{storage_container_name}")
            #sample values as shown below
            #file_system_client = service_client.get_file_system_client(file_system="infraqa-test")
    
            paths = file_system_client.get_paths(path="{directory_path}")
            #sample values as shown below
            #paths = file_system_client.get_paths(path="file data/data/format=csv/sample/")
    
            for path in paths:
                print(path.name + '\n')
    
        except Exception as e:
          print(e)
  6. To verify that the proxy is functioning, call the access methods

    connect_str = prepare_connect_str()
    list_directory_contents(connect_str)
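
The connection string in step 4 can also be assembled programmatically. The sketch below is a hedged reading of that template: it assumes the account key is the Base64 encoding of the access key and secret key joined by a pipe character, as the placeholder base64_encoded_value_of(privacera_access_key|privacera_secret_key) suggests. The function name and the sample values are hypothetical.

```python
import base64


def build_privacera_adls_connection_string(access_key, secret_key, storage_account_name,
                                            endpoint="https://ds.privaceracloud.com"):
    """Assemble a connection string that routes requests through the DataServer proxy."""
    # Assumption: AccountKey is base64("<access_key>|<secret_key>") encoded as UTF-8.
    token = base64.b64encode(f"{access_key}|{secret_key}".encode("utf-8")).decode("ascii")
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={access_key}-{storage_account_name};"
        f"AccountKey={token};"
        f"BlobEndpoint={endpoint};"
    )


# Dummy values for illustration only; never hard-code real keys in a notebook.
connect_str = build_privacera_adls_connection_string("ak", "sk", "mystorage")
print(connect_str.split(";")[1])  # prints only the AccountName part
```

The result can be passed to list_directory_contents in place of the value returned by prepare_connect_str.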
Databricks SQL

Databricks SQL Overview and Configuration

One purpose of PolicySync for Databricks SQL is to limit users' access to your entire Databricks data source or to portions of it, such as Delta external tables, views, entire tables, or only certain columns or rows.

Planning and general process

The general process for connecting with JDBC to a Databricks SQL data source, creating a policy, and limiting user access is as follows. Gather the necessary information before you begin the specific steps described here.

  1. Add the privacera_tag service.

  2. Create an endpoint in Databricks SQL for PrivaceraCloud to connect to, with JDBC username, password, and URL.

  3. Add Databricks SQL as a service in PrivaceraCloud.

  4. Define a data source for the Databricks SQL endpoint in PrivaceraCloud using the values from step 2 and other required fields.

  5. Define the Databricks SQL service.

  6. Determine the users, groups, or roles who need access from PrivaceraCloud to your Databricks SQL.

    1. Ensure that all users in PrivaceraCloud who will access Databricks SQL have an email address in their PrivaceraCloud account.

    2. Define those users with appropriate permissions in Databricks.

    3. Create a resource policy to assign users, groups, or roles the necessary permissions to access the Databricks SQL data source at the appropriate depth.

    4. Decide the depth of the data access you will give to users: views, source tables, columns, or rows. See Allowable Privileges.

Prerequisites

Make sure the Privacera Tag Service and Databricks SQL Endpoint configuration are updated before you configure Databricks SQL PolicySync.

Enable PrivaceraCloud tag service

In PrivaceraCloud, the administrator must add the privacera_tag service to enable PolicySync with Databricks SQL.

See the steps in Adding the privacera_tag Service.

Create endpoint in Databricks SQL

In Databricks SQL, an administrator must create a Databricks SQL endpoint for connecting from PrivaceraCloud. This process is described in Create an Endpoint in Databricks SQL.

Make note of the following values for entering into the fields in PrivaceraCloud as detailed in Connect Application and Databricks SQL PolicySync Fields:

  • The email address of the user defined in the endpoint. This is the value of the JDBC username (Service jdbc username) in PrivaceraCloud.

  • The Databricks generated access token. This is the value of the JDBC password (Service jdbc password) for the defined JDBC username in PrivaceraCloud.

  • The JDBC URL (Service jdbc url) defined for the endpoint.
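
To illustrate the JDBC URL format given under Databricks SQL Fields, the following Python sketch fills in that template. The function name is hypothetical, the database name default is an assumed example, and the host and endpoint ID are placeholders taken from examples elsewhere in this document.

```python
def build_databricks_sql_jdbc_url(workspace_host, database, endpoint_id):
    """Fill in the documented JDBC URL template for a Databricks SQL endpoint."""
    return (
        f"jdbc:spark://{workspace_host}:443/{database}"
        ";transportMode=http;ssl=1;AuthMech=3"
        f";httpPath=/sql/1.0/endpoints/{endpoint_id}"
    )


jdbc_url = build_databricks_sql_jdbc_url(
    "dbc-yyyyyyyy-xxxx.cloud.databricks.com",  # workspace host, without https://
    "default",                                 # assumed database name
    "1234567890",                              # endpoint ID from the endpoint details
)
print(jdbc_url)
```

In practice, copy the JDBC URL from the endpoint's connection details and use a sketch like this only to check its shape.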

Databricks SQL with Privacera Hive

To use Databricks SQL with Privacera Hive, see Databricks SQL Hive Service Def.

Connect application

With the values for the JDBC username, JDBC password, and JDBC URL that you noted in Create endpoint in Databricks SQL, define the data source connection in PrivaceraCloud to the Databricks SQL endpoint.

Follow these steps to connect the Databricks SQL application to the PrivaceraCloud:

  1. Go to Settings > Applications.

  2. In the Applications screen, select Databricks SQL.

  3. Select the platform type (AWS or Azure) on which you want to configure the Databricks application.

  4. Enter the application Name and Description, and then click Save.

  5. Click the toggle button to enable either Access Management or Data Discovery for Databricks SQL.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery.

  6. In the BASIC tab, enter values in the fields. For more information on the fields and their values, see Databricks SQL PolicySync Fields.

  7. Click Save.

  8. In the ADVANCED tab, you can add custom properties.

  9. Using the IMPORT PROPERTIES button, you can browse and import application properties.

Grant Databricks SQL permissions to PrivaceraCloud users

For each PrivaceraCloud user that needs access to Databricks SQL, the administrator needs to define that user with appropriate access permissions in Databricks.

Ensure all PrivaceraCloud users have an email address

All PrivaceraCloud users who will access Databricks SQL must have an email address in their user account on PrivaceraCloud. This email address is required to log in to Databricks SQL.

Grant Databricks SQL access

In your Databricks account:

  1. Navigate to Data science and engineering.

  2. Click Workspace on the top right.

  3. To open the Admin Console, go to the top right of the Workspace, click the user account icon, and select Admin Console.

  4. In the Databricks SQL access column, select the checkbox for the user.

Grant Databricks SQL endpoint access

In the Databricks SQL Dashboard:

  1. Navigate to SQL > Endpoints.

  2. Click the name of the Endpoint for which you want to add user permission.

  3. In the top right, click Permissions.

  4. In the SQL Endpoint Permissions dialog, select the intended user from the drop-down list.

  5. Give the user Can Use permission.

  6. Click Add.

  7. Click Save.

Define a resource policy

In PrivaceraCloud, define a resource policy to grant access to the Databricks SQL data source to users, groups, or roles.

Follow the steps in Resource Policies and the details about allowed privileges described here.

Allowable privileges

The following privileges can be specified for a Databricks SQL resource policy:

  • SELECT: Allows read access to an object.

  • CREATE: Provides ability to create an object (for example, a table in a database).

  • MODIFY: Provides ability to add, delete, and modify data to or from an object.

  • USAGE: An additional requirement to perform any action on a database object.

  • READ_METADATA: Provides ability to view an object and its metadata.

  • CREATE_NAMED_FUNCTION: Provides ability to create a named UDF in an existing catalog or database.

  • ALL PRIVILEGES: Gives all privileges, equivalent to all the above privileges.

  • Data_Admin Privilege for Secure Views: With the Data_Admin privilege, access policies are applied to source tables. If you want to restrict the access policies only to the views and not to the source tables, enable the following property in the PolicySync configuration, as detailed in Connect Application and Databricks SQL PolicySync Fields:

    Secure view Access by Table policies: true

Test the policy

To assign privileges to users, groups, or roles, follow the steps in Resource Policies.

This can be tested with a non-administrator user.

Databricks SQL PolicySync fields

For a description of all fields that must or can be set for resource policy, see Databricks SQL PolicySync Fields.

Configuring column-level access control

To enable column-level access control, set the following fields when you define the PolicySync fields:

  • Column Level Access Control: true.

  • In custom fields, add the following, where # REDACTED # is any string of your choice:

    ranger.policysync.connector.4.access.control.number.value=0
    ranger.policysync.connector.4.access.control.double.value=0
    ranger.policysync.connector.4.access.control.text.value='# REDACTED #'      
View-based masking functions and row-level filtering

For supported masking functions and supported row-level filtering, see Databricks SQL Masking Functions.

Create an endpoint in Databricks SQL
  1. Log in to your Databricks account as a user with administrative privileges.

  2. Go to SQL Analytics.

  3. Go to Endpoints and click New SQL Endpoint.

  4. Create the endpoint according to your requirements.

  5. Click the endpoint connection details and note the JDBC URL for configuration with PolicySync.

  6. Click the personal access token to create a token.

  7. Click Generate New Token.

  8. Enter the name of the token, specify its validity, and click Generate.

  9. Copy the generated token. This is the JDBC password of the user when connecting from PolicySync, and the email ID of the user is the JDBC username.

Databricks SQL Fields

Basic fields

Table 5. Basic fields

Field name

Type

Default

Required

Description

Databricks SQL jdbc url

string

Yes

Specifies the JDBC URL for the Databricks SQL connector.

Use the following format for the JDBC URL:

jdbc:spark://<WORKSPACE_URL>:443/<DATABASE>;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/endpoints/1234567890

The workspace URL and the database name are derived from your Databricks SQL configuration.

Databricks SQL jdbc username

string

Yes

Specifies the JDBC username to use.

Databricks SQL jdbc password

string

Yes

Specifies the access token of the SQL endpoint to use.

Databricks SQL default database

string

Yes

Specifies the name of the JDBC database to use.

Databricks SQL resource owner

string

No

Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user.

  • If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.

  • If a value is specified, the owner of the resource will be changed to the specified value.

The following resource types are supported:

  • Database

  • Schemas

  • Tables

  • Views

Databricks SQL workspace URL

string

Yes

Specifies the base URL for the Databricks SQL instance.

Databases to set access control policies

string

No

Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of databases might resemble the following: testdb1,testdb2,sales_db*.

If specified, Databases to ignore while setting access control policies takes precedence over this setting.

Enable policy enforcements and user/group/role management

boolean

true

Yes

Specifies whether PolicySync performs grants and revokes for access control and runs create, update, and delete queries for users, groups, and roles. The default value is true.

Enable access audits

boolean

true

Yes

Specifies whether Privacera fetches access audit data from the data source.
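
The JDBC URL format given for the Databricks SQL jdbc url field can be assembled from its parts. The following is a minimal sketch, not part of PrivaceraCloud: the function name build_jdbc_url and the example host, database, and endpoint ID are hypothetical placeholders, and the fixed options are copied from the format shown in Table 5.

```python
def build_jdbc_url(workspace_host: str, database: str, endpoint_id: str) -> str:
    """Assemble a JDBC URL in the format shown for the Databricks SQL jdbc url field."""
    return (
        f"jdbc:spark://{workspace_host}:443/{database}"
        ";transportMode=http;ssl=1;AuthMech=3"
        f";httpPath=/sql/1.0/endpoints/{endpoint_id}"
    )


# Hypothetical values; substitute the ones from your endpoint connection details.
print(build_jdbc_url("example.cloud.databricks.com", "default", "1234567890"))
```

Copy the actual values from the endpoint connection details in Databricks rather than constructing them by hand where possible.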



Advanced fields

Table 6. Advanced fields

Field name

Type

Default

Required

Description

Tables to set access control policies

string

No

Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

Use the following format when specifying a table:

<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>

If specified, Tables to ignore while setting access control policies takes precedence over this setting.

If you specify a wildcard, such as in the following example, all matched tables are managed:

<DATABASE_NAME>.<SCHEMA_NAME>.*

The specified value, if any, is interpreted in the following ways:

  • If unset, access control is managed for all tables.

  • If set to none, no tables are managed.

Databases to ignore while setting access control policies

string

No

Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.

For example:

testdb1,testdb2,sales_db*

This setting supersedes any values specified by Databases to set access control policies.

Tables to ignore while setting access control policies

string

No

Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:

<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>

This setting supersedes any values specified by Tables to set access control policies.

Regex to find special characters in names

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a user name and replaces each matching character with the value specified by the String to replace with the special characters found in names setting.

If not specified, no find and replace operation is performed.

String to replace with the special characters found in names

string

_

No

Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in names setting.

If not specified, no find and replace operation is performed.

Regex to find special characters in user names

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the String to replace with the special characters found in user names setting.

If not specified, no find and replace operation is performed.

String to replace with the special characters found in user names

string

_

No

Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.

If not specified, no find and replace operation is performed.

Regex to find special characters in group names

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the String to replace with the special characters found in group names setting.

If not specified, no find and replace operation is performed.

String to replace with the special characters found in group names

string

_

No

Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.

If not specified, no find and replace operation is performed.

Regex to find special characters in role names

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the String to replace with the special characters found in role names setting.

If not specified, no find and replace operation is performed.

String to replace with the special characters found in role names

string

_

No

Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.

If not specified, no find and replace operation is performed.

Persist case sensitivity of user names

boolean

false

No

Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.

Persist case sensitivity of group names

boolean

false

No

Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.

Persist case sensitivity of role names

boolean

false

No

Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.

Create users in Databricks SQL Endpoint by policysync

boolean

true

No

Specifies whether PolicySync creates local users for each user in Privacera.

Manage users from portal

boolean

true

No

Specifies whether PolicySync maintains user membership in roles in the Databricks SQL data source.

Manage groups from portal

boolean

true

No

Specifies whether PolicySync creates groups from Privacera in the Databricks SQL data source.

Manage roles from portal

boolean

true

No

Specifies whether PolicySync creates roles from Privacera in the Databricks SQL data source.

Users to set access control policies

string

No

Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

If not specified, PolicySync manages access control for all users.

If specified, Users to be ignored by access control policies takes precedence over this setting.

An example user list might resemble the following: user1,user2,dev_user*.

Groups to set access control policies

string

No

Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of groups might resemble the following: group1,group2,dev_group*.

If specified, Groups be ignored by access control policies takes precedence over this setting.

Roles to set access control policies

string

No

Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of roles might resemble the following: role1,role2,dev_role*.

If specified, Roles be ignored by access control policies takes precedence over this setting.

Users to be ignored by access control policies

string

No

Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

This setting supersedes any values specified by Users to set access control policies.

Groups be ignored by access control policies

string

No

Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

This setting supersedes any values specified by Groups to set access control policies.

Roles be ignored by access control policies

string

No

Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.

This setting supersedes any values specified by Roles to set access control policies.

Prefix of Databricks SQL Endpoint roles for portal groups

string

priv_group_

No

Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.

Prefix of Databricks SQL Endpoint roles for portal roles

string

priv_role_

No

Specifies the prefix that PolicySync uses when creating roles from Privacera in the Databricks SQL data source.

For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.

Use Databricks SQL Endpoint native public group for public group access policies

boolean

true

No

Specifies whether PolicySync uses the Databricks SQL native public group for access grants whenever a policy refers to a public group. The default value is true.

Set access control policies only on the users from managed groups

boolean

false

No

Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.

Set access control policies only on the users/groups from managed roles

boolean

false

No

Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false.

Use email as service name

boolean

true

No

Specifies whether PolicySync maps the user name to the email address when granting or revoking access.

Enforce masking policies using secure views

boolean

true

No

Specifies whether to use secure view based masking. The default value is true.

Enforce row filter policies using secure views

boolean

true

No

Specifies whether to use secure view based row filtering. The default value is true.

While Databricks SQL supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

Create secure view for all tables/views

boolean

true

No

Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

Default masked value for numeric datatype columns

integer

0

No

Specifies the default masking value for numeric column types.

Default masked value for text/varchar/string datatype columns

string

<MASKED>

No

Specifies the default masking value for text and string column types.

Secure view name prefix

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

Secure view name postfix

string

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

Secure view database name prefix

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.

For example, if the prefix is priv_, then the secure view name for a database named example1 is priv_example1.

Secure view database name postfix

string

_secure

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.

For example, if the postfix is _sec, then the secure view name for a database named example1 is example1_sec.

Enable dataadmin

boolean

true

No

Specifies whether to enable the data admin feature. When this feature is enabled, you create all policies on native tables and views, and the corresponding grants are made on the secure views of those tables and views. These secure views support row filtering and masking. To grant permissions on the native tables or views as well, select the required permissions plus data admin in the policy. The permissions are then granted on both the native table or view and its secure view.

Users to exclude when fetching access audits

string

{{DATABRICKS_SQL_ANALYTICS_JDBC_USERNAME}}

No

Specifies a comma-separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".
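
The interaction between the "to set access control policies" lists and the corresponding ignore lists can be sketched as follows. This is an illustration only, not PolicySync's implementation: it assumes that the wildcards behave like shell-style patterns (approximated here with Python's fnmatchcase), and the helper names parse_list and is_managed are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_list(value: str) -> list:
    """Split a comma-separated setting into trimmed, non-empty patterns."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_managed(name: str, include: str = "", ignore: str = "") -> bool:
    """Return True if a name would be managed under the documented precedence."""
    # The ignore list supersedes the include list.
    if any(fnmatchcase(name, pattern) for pattern in parse_list(ignore)):
        return False
    include_patterns = parse_list(include)
    # An unset include list means that everything is managed.
    if not include_patterns:
        return True
    # Matching is case-sensitive, as the field descriptions state.
    return any(fnmatchcase(name, pattern) for pattern in include_patterns)


print(is_managed("sales_db_eu", include="testdb1,testdb2,sales_db*"))
print(is_managed("testdb1", include="testdb1,testdb2", ignore="testdb1"))
```

The first call matches the sales_db* wildcard, while the second shows the ignore list taking precedence over the include list.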



Custom fields

Table 7. Custom fields

Canonical name

Type

Default

Description

load.resources

string

load_like

Specifies how PolicySync loads resources from Databricks SQL. The following values are allowed:

  • load_like: Default value for loading resources, to be used in production.

  • load: Load resources from Databricks SQL with a top-down resources approach, that is, it first loads the database and then the schemas followed by tables and its columns. This mode is only for development purposes.

sync.interval.sec

integer

60

Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

sync.serviceuser.interval.sec

integer

420

Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

sync.servicepolicy.interval.sec

integer

540

Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

audit.interval.sec

integer

30

Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.

user.name.case.conversion

string

lower

Specifies how user name conversions are performed. The following options are valid:

  • lower: Convert to lowercase

  • upper: Convert to uppercase

  • none: Preserve case

This setting applies only if Persist case sensitivity of user names is set to true.

group.name.case.conversion

string

lower

Specifies how group name conversions are performed. The following options are valid:

  • lower: Convert to lowercase

  • upper: Convert to uppercase

  • none: Preserve case

This setting applies only if Persist case sensitivity of group names is set to true.

role.name.case.conversion

string

lower

Specifies how role name conversions are performed. The following options are valid:

  • lower: Convert to lowercase

  • upper: Convert to uppercase

  • none: Preserve case

This setting applies only if Persist case sensitivity of role names is set to true.

secure.view.name.remove.suffix.list

string

Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma-separated list of suffixes.

secure.view.database.name.remove.suffix.list

string

Specifies a suffix to remove from a database name. For example, if the database is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma-separated list of suffixes.

audit.init.starttime.offset.minutes

integer

30

Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Databricks SQL.
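
The secure view naming settings combine in a fixed order: suffix removal is applied first, and then any custom prefix or postfix is added. The following sketch illustrates that order under stated assumptions. The function name secure_view_name is hypothetical, and removing only the first matching suffix is an assumption, not documented PolicySync behavior.

```python
def secure_view_name(name: str, remove_suffixes: str = "",
                     prefix: str = "", postfix: str = "") -> str:
    """Derive a secure view name: strip a listed suffix, then add prefix and postfix."""
    suffixes = [s.strip() for s in remove_suffixes.split(",") if s.strip()]
    for suffix in suffixes:
        # Assumption: only the first matching suffix is removed,
        # and a name is never reduced to an empty string.
        if name.endswith(suffix) and len(name) > len(suffix):
            name = name[: -len(suffix)]
            break
    return f"{prefix}{name}{postfix}"


print(secure_view_name("example_suffix", remove_suffixes="_suffix", prefix="dev_"))
print(secure_view_name("example1", postfix="_dev"))
```

With these inputs, the table example_suffix loses its _suffix string before the dev_ prefix is added, and example1 receives the _dev postfix as in the field description.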



Databricks SQL Hive Service Definition

Using Databricks SQL with Privacera Hive requires the following Hive-specific configuration:

  1. Connect the Databricks application, enable access, and save it. Connecting the Databricks application internally creates the privacera_hive service.

  2. Additionally, configure the following properties for Hive when you connect the application.

    • In the System config field, add the following value:

      privacera-databricks_sql_analytics-hive-system-config.json
                
    • In the ADVANCED tab, add the following properties. This example uses the number 4 as the connector key.

      ranger.policysync.connector.4.ranger.service.appid=privacera_hive
      ranger.policysync.connector.4.ranger.service.name=privacera_hive
                

Note

If, prior to PrivaceraCloud version 4.2, PolicySync with the databricks_sql_analytics or hive service did not handle Ranger user, group, or role updates, add the following property, where the number 4 is the connector key. This forces PolicySync to push the new users to the Databricks workspace.

ranger.policysync.connector.4.force.update.principal=true

Hive-to-Databricks SQL Permission Mapping

Hive Permission          Databricks SQL Permission

Select                   Usage, ReadMetadata, Select

Update                   Usage, Modify

Create in the database   Usage, Create in the database

Create on the UDF        Usage, CreateNamedFunction

Drop                     No equivalent

Alter                    No equivalent
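
The Hive-to-Databricks SQL permission mapping can also be expressed as a lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the privilege spellings follow the list of allowable privileges for a Databricks SQL resource policy.

```python
# Keys are the Hive permissions from the mapping; values are the
# Databricks SQL privileges they translate to.
HIVE_TO_DATABRICKS_SQL = {
    "Select": ["USAGE", "READ_METADATA", "SELECT"],
    "Update": ["USAGE", "MODIFY"],
    "Create in the database": ["USAGE", "CREATE"],
    "Create on the UDF": ["USAGE", "CREATE_NAMED_FUNCTION"],
    "Drop": [],   # no equivalent
    "Alter": [],  # no equivalent
}


def databricks_privileges(hive_permissions):
    """Return the de-duplicated Databricks SQL privileges for a set of Hive permissions."""
    result = []
    for permission in hive_permissions:
        for privilege in HIVE_TO_DATABRICKS_SQL[permission]:
            if privilege not in result:
                result.append(privilege)
    return result


print(databricks_privileges(["Select", "Update"]))
```

Every mapped Hive permission includes USAGE, which matches the description of USAGE as an additional requirement for any action on a database object.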

Databricks SQL Masking Functions

Masking Function

Scope in Databricks SQL

Default

Value: Default values given as masked properties

Data type: All

Null

Value: Null

Data type: All

Unmasked

Value: Actual value

Data type: All

Hash DBX

Value: Hashed value

Data type: text/varchar

MASK_MD5

Value: Hashed value

Data type: text/string

Regex

Value: Replace value

Data type: text/string

Literal Mask

Value: Replace value

Data type: text/string

Partial last 4 characters

Value: Replace value

Data type: text/string

Partial first 4 characters

Value: Replace value

Data type: text/string

Custom

Value: The UDF given as the input.

Data type: All. For example, repeat('xy', 5)

Databricks SQL Encryption

The following steps enable use of Privacera encryption services in a Databricks SQL notebook:

  • Create a secret shared by Privacera Encryption Gateway (PEG) and Databricks.

  • Create Resource Policies in Privacera for data access to Databricks SQL resources.

  • Create Privacera encryption and decryption User-Defined Functions (UDFs) in Databricks.

For more information about Privacera encryption schemes, see the Privacera Encryption Guide.

Prerequisites
Grant permission in encryption scheme policy

To use Databricks SQL encryption, you must create a scheme policy for a user that will use the Databricks UDF. This scheme policy must grant the getSchemes permission. See Create Scheme Policies on PrivaceraCloud to learn more.

Configure Databricks
Create Databricks secrets

With the Databricks CLI:

  1. Create a secret scope called privaceracloud:

    databricks secrets create-scope --scope privaceracloud
  2. Add secrets to this scope:

    • peg_username, peg_password, and peg_secret are literals and should be entered exactly as shown.

    • The <username>, <password>, and <sharedsecret> values below are the same as what you entered in PrivaceraCloud when adding the PEG service. See API Key to learn more.

      databricks secrets put --scope privaceracloud --key peg_username --string-value <username>
      databricks secrets put --scope privaceracloud --key peg_password --string-value <password> 
      databricks secrets put --scope privaceracloud --key peg_secret --string-value <sharedsecret>
Add Privacera environment variables to Databricks cluster

Add the following environment variables in your Databricks cluster:

PEG_SECRET={{secrets/privaceracloud/peg_secret}}
PEG_PASSWORD={{secrets/privaceracloud/peg_password}}
PEG_USERNAME={{secrets/privaceracloud/peg_username}}

Caution

Note that there can be existing environment variables. Do not remove these.
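
After the cluster restarts, you may want to confirm that the three variables are visible to it. The following check is a sketch that could be run in a Python notebook cell attached to the cluster. The helper name missing_peg_variables is hypothetical, and the check reports only variable names, never the secret values.

```python
import os

REQUIRED_VARIABLES = ("PEG_USERNAME", "PEG_PASSWORD", "PEG_SECRET")


def missing_peg_variables(environ=os.environ):
    """Return the names of required PEG variables that are unset or empty."""
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


missing = missing_peg_variables()
print("Missing variables:", ", ".join(missing) if missing else "none")
```

An empty result suggests that the variables were added to the cluster configuration; it does not verify that the credentials themselves are correct.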

Create Privacera protect and unprotect User-Defined Functions (UDFs)

First log into Databricks, create a notebook, and set the language to SQL.

Run the following SQL commands in Databricks to create UDFs for Privacera encryption services, named protect and unprotect.

Note

com.privacera.crypto functions enable use of encryption schemes, but do not accept presentation schemes.

  • Create Privacera protect UDF:

    create database if not exists privacera;
    use privacera;
    drop function if exists privacera.protect;
    CREATE FUNCTION privacera.protect AS 'com.privacera.crypto.PrivaceraEncryptUDF';
  • Create Privacera unprotect UDF:

    use privacera;
    drop function if exists privacera.unprotect;
    CREATE FUNCTION privacera.unprotect AS 'com.privacera.crypto.PrivaceraDecryptUDF';
Configure Privacera resource policies

Databricks SQL resources are managed under Access Manager > Resource Policies > privacera_hive.

To add resource policies to allow access to selected resources:

  1. Create a policy to give data access users, groups, or roles the select privilege to target database resources. On the Add Policy page, under Allow Conditions use Select Role, Select Group and/or Select User then under Permissions choose select.

  2. Create a policy to grant data access users, groups, or roles the select privilege to the protect and unprotect UDFs. On the Add Policy page, under Allow Conditions use Select Role, Select Group and/or Select User then under Permissions choose select.

How to use UDFs in SQL to encrypt and decrypt

The following are SQL command examples for privacera.protect (encrypt) and privacera.unprotect (decrypt) UDFs:

privacera.protect
select privacera.protect(<COLNAME>,'<ENCRYPTION_SCHEME_NAME>') from <DB_NAME>.<TABLE_NAME>;
  • <COLNAME> is the identifier of the column to encrypt.

  • <ENCRYPTION_SCHEME_NAME> is the name of the chosen Privacera encryption scheme.

  • <DB_NAME>.<TABLE_NAME> are the names of the database and table in that database.

Example

In this example, the email column of the bigdatabase.customer_data table is encrypted with the SYSTEM_EMAIL encryption scheme.

select privacera.protect(email, 'SYSTEM_EMAIL') from bigdatabase.customer_data;
privacera.unprotect
select privacera.unprotect(<COLNAME>,'<ENCRYPTION_SCHEME_NAME>') from <DB_NAME>.<TABLE_NAME>;
  • <COLNAME> is the identifier of the column to decrypt.

  • <ENCRYPTION_SCHEME_NAME> is the name of the chosen Privacera encryption scheme, which must be the same encryption scheme used to originally encrypt.

  • <DB_NAME>.<TABLE_NAME> are the names of the database and table in that database.

Example

In this example, the email column of the bigdatabase.customer_data table is decrypted with the SYSTEM_EMAIL encryption scheme.

select privacera.unprotect(email, 'SYSTEM_EMAIL') from bigdatabase.customer_data;
privacera.unprotect with optional presentation scheme

The unprotect UDF supports an optional specification of a presentation scheme that further obfuscates the decrypted data.

For an example of data transformation with the optional presentation scheme, see Example of Data Transformation with /unprotect and Presentation Scheme.

Example query:

select id, privacera.unprotect(<COLUMN_NAME>, '<ENCRYPTION_SCHEME_NAME>', '<PRESENTATION_SCHEME_NAME>') <OPTIONAL_NAME_FOR_COLUMN_TO_WRITE_OBFUSCATED_OUTPUT> from <DB_NAME>.<TABLE_NAME>;
  • <PRESENTATION_SCHEME_NAME> is the name of the chosen Privacera presentation scheme with which to further obfuscate the decrypted data.

  • <OPTIONAL_NAME_FOR_COLUMN_TO_WRITE_OBFUSCATED_OUTPUT> is a "pretty" name for the column that the obfuscated data is written to.

  • Other arguments are the same as in the preceding unprotect example.
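
If you generate these queries from a script, the unprotect variants can be assembled with a small helper. This is a sketch only: the function name unprotect_query and the example presentation scheme and alias are hypothetical, and it assumes that scheme names are passed as quoted string literals, as in the protect and unprotect examples above.

```python
def unprotect_query(column, scheme, table, presentation_scheme=None, alias=None):
    """Build a privacera.unprotect SELECT statement as a string."""
    args = [column, f"'{scheme}'"]
    if presentation_scheme is not None:
        # The presentation scheme is the optional third argument.
        args.append(f"'{presentation_scheme}'")
    expression = f"privacera.unprotect({', '.join(args)})"
    if alias:
        expression += f" {alias}"
    return f"select {expression} from {table};"


print(unprotect_query("email", "SYSTEM_EMAIL", "bigdatabase.customer_data"))
# EXAMPLE_PRESENTATION and masked_email are placeholders, not real scheme or column names.
print(unprotect_query("email", "SYSTEM_EMAIL", "bigdatabase.customer_data",
                      presentation_scheme="EXAMPLE_PRESENTATION", alias="masked_email"))
```

The helper uses plain string formatting, so pass it only trusted identifiers and scheme names.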

Dremio

This topic describes how to connect a Dremio application to PrivaceraCloud.

Prerequisite

You must have a Dremio host with Dremio Enterprise Edition installed.

Note

Community Edition is not supported.

Connect Application
  1. Go to Settings > Applications.

  2. In the Applications screen, select dremio.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Access Management for your application.

  5. Click Download Script to download the privacera_dremio_plugin.sh file.

  6. Click Save.

    Note

    If required, you can download privacera_dremio_plugin.sh again by using the edit application option.

Configure Privacera plugin

Configure the Privacera plugin according to how Dremio is installed on your instance.

Note

For a new or existing data source configured in Dremio Data Lake, ensure that the Enable external authorization plugin checkbox under Settings > Advanced Options of the data source is selected in the Dremio UI. Then, restart the Dremio service.

RPM
  1. SSH to the instance where the Dremio RPM is installed.

  2. Copy the downloaded privacera_dremio_plugin.sh file to the Home folder in your Dremio instance.

  3. Run the following commands:

    mkdir -p ~/privacera/install
    mv privacera_dremio_plugin.sh ~/privacera/install
  4. Launch the privacera_dremio_plugin.sh script.

    cd ~/privacera/install
    chmod +x privacera_dremio_plugin.sh
    sudo ./privacera_dremio_plugin.sh
  5. Update the Dremio environment file to add the Privacera jars and configuration to the Dremio classpath.

    vi ${DREMIO_HOME}/conf/dremio-env
  6. Update the following variable if it exists or add it.

    DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
  7. Restart Dremio.

    sudo service dremio restart
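
The "update the variable if it exists or add it" step for dremio-env can be scripted instead of edited by hand. The following is a minimal sketch that operates on the file contents as text. The function name set_extra_classpath is hypothetical, and replacing any existing value outright, rather than merging it with the Privacera paths, is a simplifying assumption.

```python
CLASSPATH_LINE = (
    "DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*"
)


def set_extra_classpath(env_text: str) -> str:
    """Replace an active DREMIO_EXTRA_CLASSPATH line, or append one if none exists."""
    lines = env_text.splitlines()
    updated = False
    for index, line in enumerate(lines):
        # Commented-out lines start with '#', so they are left untouched.
        if line.strip().startswith("DREMIO_EXTRA_CLASSPATH="):
            lines[index] = CLASSPATH_LINE
            updated = True
    if not updated:
        lines.append(CLASSPATH_LINE)
    return "\n".join(lines) + "\n"


print(set_extra_classpath("DREMIO_MAX_MEMORY_SIZE_MB=4096\n"), end="")
```

If your existing DREMIO_EXTRA_CLASSPATH already contains other entries, merge them manually rather than relying on this replacement.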
Kubernetes

Depending on your cloud provider, set up Dremio in a Kubernetes environment.

See the following links for deployment:

After setting up Dremio, perform the following steps to deploy the Privacera plugin.

  1. SSH to the instance that contains the Dremio Kubernetes artifacts and change to the dremio-cloud-tools/charts/dremio_v2/ directory.

  2. Copy the downloaded privacera_dremio_plugin.sh file to the dremio_v2 folder in your Dremio Kubernetes instance.

  3. Run the following commands:

  4. Update dremio-configmap.yaml to add a new ConfigMap for the Privacera configuration.

    vi templates/dremio-configmap.yaml

    Add the following configuration at the start of the file.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: dremio-privacera-install
    data:
      privacera_dremio_plugin.sh: |- {{ .Files.Get "privacera_config/privacera_dremio_plugin.sh" | nindent 4 }}
    ---
  5. Update dremio-env to add Privacera jars and configuration in the Dremio classpath.

    vi config/dremio-env

    Update the following variable if it exists or add it.

    DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
  6. Update values.yaml.

    vi values.yaml

    Add the following configuration for extraInitContainers inside the coordinator section.

    extraInitContainers:  |
        - name: install-privacera-dremio-plugin
          image: {{.Values.image}}:{{.Values.imageTag}}
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsUser: 0
          volumeMounts:
          - name: dremio-privacera-plugin-volume
            mountPath: /opt/dremio/plugins/authorizer
          - name: dremio-ext-jars-volume
            mountPath: /opt/privacera/dremio-ext-jars
          - name: dremio-privacera-config
            mountPath: /opt/privacera/conf/
          - name: dremio-privacera-install
            mountPath: /opt/privacera/mount
          command:
            - "bash"
            - "-c"
            - "cd /opt/privacera/mount/ && cp * /tmp/ && cd /tmp && ./privacera_dremio_plugin.sh"

    Update or uncomment the extraVolumes section inside the coordinator section and add the following configuration:

    extraVolumes:
        - name: dremio-privacera-install
          configMap:
            name: dremio-privacera-install
            defaultMode: 0777
        - name: dremio-privacera-plugin-volume
          emptyDir: {}
        - name: dremio-ext-jars-volume
          emptyDir: {}
        - name: dremio-privacera-config
          emptyDir: {}

    Update or uncomment the extraVolumeMounts section inside the coordinator section and add the following configuration:

    extraVolumeMounts:
        - name: dremio-ext-jars-volume
          mountPath: /opt/privacera/dremio-ext-jars
        - name: dremio-privacera-plugin-volume
          mountPath: /opt/dremio/plugins/authorizer
        - name: dremio-privacera-config
          mountPath: /opt/privacera/conf
  7. Upgrade your Helm release. To find the release name, run the helm list command. The release name is the value under the Name column.

    helm upgrade -f values.yaml <release-name> .
DynamoDB

This topic describes how to connect a DynamoDB application to PrivaceraCloud.

Connecting to an AWS hosted data source requires authentication or a Trust relation with those resources. You will provide this information as one step in the AWS Data resource connection. You will also need to specify your AWS Account Region.

Prerequisites in AWS console

The following prerequisites must be met:

  1. Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy in the AWS Console.

  2. Configure a Trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM Role.

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select DynamoDB.

  3. Enter the application Name and Description, and then click Save.

    The Privacera Access Management option is displayed with a toggle button.

Enable Privacera Access Management
  1. Click the toggle button to enable Privacera Access Management for your application.

  2. On the BASIC tab, enter values in the following fields.

    • With Use IAM Role disabled:

      1. AWS Access Key: AWS data repository host account Access Key.

      2. AWS Secret Key: AWS data repository host account Secret Key.

      3. AWS Region: AWS S3 bucket region.

    • With Use IAM Role enabled:

      1. AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.

      2. AWS IAM Role External Id: For additional security, an external ID can be attached to your configured IAM role. This ensures that PrivaceraCloud can assume your IAM role only when the configured external ID is passed.

        Note

        The external ID is stored encrypted. It is never displayed in the UI or otherwise made visible.

      3. AWS Region: AWS S3 bucket region.

  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  5. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

  6. Recommended: Install the AWS CLI.

    Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.

  7. Recommended: Validate connectivity by running an AWS CLI command for DynamoDB, such as:

    aws dynamodb list-tables
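
Before you enter a value in the AWS IAM Role field, it can help to confirm that the string has the general shape of a full role ARN. The following is a minimal sketch and not part of PrivaceraCloud: the helper name looks_like_role_arn and the sample account ID are hypothetical, and the check covers only the format, not whether the role exists or can be assumed.

```python
import re

# General shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account ID>:role/<optional path/><role name>
_ROLE_ARN = re.compile(r"^arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+$")


def looks_like_role_arn(value):
    """Return True if value has the general shape of a full IAM role ARN."""
    return _ROLE_ARN.fullmatch(value.strip()) is not None


# Hypothetical sample values, for illustration only.
print(looks_like_role_arn("arn:aws:iam::123456789012:role/privacera-access"))
print(looks_like_role_arn("privacera-access"))
```

A bare role name such as the second sample fails the check because the field expects the full ARN.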
Elastic MapReduce from Amazon
EMR: Hive, PrestoDB, PrestoSQL

This topic describes how to connect an EMR application to PrivaceraCloud.

Note

PrivaceraCloud supports EMR versions 6.x and higher with Kerberos enabled.

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select EMR.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Access Management for your application.

Obtain installation script
  1. In the Edit Application screen, click the Copy URL button to obtain the installation script.

    Save this value. You will need it later as the <emr-script-download-url>.

    EMR clusters can be connected to PrivaceraCloud in two ways:

    • Attach PrivaceraCloud authorization in new EMR clusters.

    • Attach PrivaceraCloud authorization in an existing EMR cluster.

    Both methods start with obtaining an account-specific script from your PrivaceraCloud account, followed by adding a startup step to your EMR cluster.

    Notice

    By default, PrestoDB blocks a few operations on the Hive catalog. These operations can be enabled by updating hive.properties.

  2. Click Save.

You can now use PrivaceraCloud to define fine-grained policies and control access to Hive and Presto resources within the EMR cluster.

Configure EMR cluster

From your AWS EMR web console:

  1. Open your AWS EMR cluster, then:

    1. For new EMR clusters, go to Create EMR > Advanced Options and click Go to advanced options.

    2. For existing EMR clusters, locate and open the existing cluster to update its configuration. Open the Steps tab and click Add Step.

  2. In the Add Step dialog, complete the fields as follows:

    Step type: Custom JAR

    Name: Install PrivaceraCloud Plugin

    JAR location: command-runner.jar

    Arguments:

    bash -c "wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh"

    Action on failure: Terminate cluster

  • The EMR Hive plug-in supports view-level access management through the Data_admin feature. By default, it also supports view-based row-level filtering and column masking.

  • By default, the PrestoSQL plug-in on EMR uses policies from the privacera-hive repository for Access Management.

Validate installation

In PrivaceraCloud, open Access Manager: Audit, and click the Plugin tab. Look for audit items reporting the status "Policies synced to plugin". This indicates that your EMR Hive, Presto, or Spark data resource is connected.

EMR Spark (Fine-Grained Access Control)

These instructions enable Fine-Grained Access Control (FGAC) for an existing connected AWS S3 data resource. FGAC enables policies at the database, table, and column level to be defined in service "privacera_hive" in Access Manager: Resource Policies. Either Object Level Access Control (OLAC) or Fine-Grained Access Control (FGAC) can be added to an existing AWS S3 configuration, but not both.

Once installed and enabled, each data user query is first parsed by Spark and authenticated by the PrivaceraCloud Spark plug-in. The requesting user must have authenticated access to all resources referenced by the query for it to be allowed.

  1. In PrivaceraCloud, obtain the <emr-script-download-url> that is unique to your account. The EMR cluster uses it to obtain additional scripts and setup.

    1. Open Settings > API Key.

    2. Use an existing active API Key or generate a new one.

      Caution

      Make sure the Expiry column is set to "Never Expires".

    3. Click the i icon to get the scripts.

    4. Under AWS EMR Setup Script, click Copy Url. Save this value. It will be used as the <emr-script-download-url> in the following instructions.

  2. From the AWS EMR web console:

    • For new EMR clusters, go to Create EMR > Advanced Options and click Go to advanced options.

    • For existing EMR clusters, locate and open the existing cluster to update its configuration. Open the Steps tab and click Add Step.

    Note

    To add multiple JWT configurations, see How to configure multiple JSON Web Tokens (JWTs) for EMR.

  3. Install the Privacera Spark FGAC Plugin:

    1. In a new cluster: select Configure Step > Custom JAR at the bottom of the configuration page.

      For an existing cluster: in Steps, select Custom Jar and click Add Step.

    2. Add the given values in the following fields and click Add.

      • Name: Install PrivaceraCloud Spark Plugin

      • JAR location: command-runner.jar

      • Arguments: add the following command:

        bash -c "wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh spark-fgac"
      • Action on failure: Terminate cluster

    3. (Optional) To specify the custom policy name for hive, spark, or trino services, export the following variable in arguments:

      bash -c "export EMR_HIVE_SERVICE_NAME=<hive_repo_name>; export EMR_TRINO_HIVE_SERVICE_NAME=<trino_hive_repo_name>; export EMR_SPARK_HIVE_SERVICE_NAME=<spark_hive_repo_name>; wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo -E ./privacera_emr.sh spark-fgac"

      where:

      hive_repo_name is a custom Hive service name for the Hive application in EMR.

      spark_hive_repo_name is a custom Hive service name for Spark applications in EMR.

      trino_hive_repo_name is a custom Hive service name for the Trino application in EMR.
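
The Arguments value is a single bash -c string, so a missing separator or quote is easy to introduce when editing it by hand. The following minimal sketch shows one way to assemble that string. The function name build_step_argument and its parameters are hypothetical and not part of PrivaceraCloud or AWS, and the sketch assumes that the values contain no spaces or quotation marks, because it does no escaping.

```python
def build_step_argument(script_url, mode="spark-fgac", service_names=None):
    """Compose the single-line Arguments value for the Custom JAR step.

    service_names is an optional dict such as
    {"EMR_HIVE_SERVICE_NAME": "my_hive_repo"}. When it is given, "sudo -E" is
    used so that the exported variables are passed through to the script.
    """
    exports = [f"export {name}={value}" for name, value in (service_names or {}).items()]
    sudo = "sudo -E" if exports else "sudo"
    commands = exports + [
        f"wget {script_url}",
        "chmod +x ./privacera_emr.sh",
        f"{sudo} ./privacera_emr.sh {mode}".strip(),
    ]
    return 'bash -c "' + " ; ".join(commands) + '"'


# Placeholder URL: substitute the value copied from PrivaceraCloud.
print(build_step_argument("<emr-script-download-url>"))
print(build_step_argument("<emr-script-download-url>",
                          service_names={"EMR_HIVE_SERVICE_NAME": "<hive_repo_name>"}))
```

Passing mode="" yields the argument without a mode, as used for the Hive and Presto plug-in step.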

Notice

The Privacera plugin also supports view-level access control using the Data admin feature, as well as view-based row-level filtering and column masking.

EMR Spark (Object Level Access Control)

These instructions enable Object Level Access Control (OLAC) for existing connected AWS S3 resources. If AWS S3 is not already configured, do so by following the instructions here, then follow these additional configuration steps.

Either Object Level Access Control (OLAC) or Fine-Grained Access Control (FGAC) can be added to an existing AWS S3 configuration, but not both.

Two subcomponents are installed:

  • Privacera Credential Token Service (P-CTS) is installed to the targeted AWS EMR master node. P-CTS is a secure service running on an EMR master node which provides encrypted access tokens to the requesting user. Tokens are encrypted using a shared secret key with the Privacera Cloud Signing Server.

  • Privacera Signing Agent (P-SA) is installed to the targeted AWS EMR worker nodes. P-SA redirects Spark S3 requests to the Privacera Cloud Signing Server with a P-CTS access token in the request. P-SA then provides the appropriate signed response to Spark for accessing the S3 data if:

    (a) The incoming request has a valid P-CTS token;

    and (b) The requesting user has permissions on the S3 resource as defined in the "privacera_s3" service in Access Manager: Resource Policies.

These steps will:

  1. Create an AWS Kerberos-based Security Configuration.

  2. Establish a shared secret between PrivaceraCloud and the AWS EMR Kerberos based Security Configuration.

  3. Create a new AWS cluster configured to use that Security Configuration. That cluster will link back to the Privacera Signing Agent (P-SA) and Privacera Credential Token Service (P-CTS).

Prerequisites
  1. Obtain or determine a character string to serve as a "shared key" between PrivaceraCloud and the AWS EMR cluster. We'll refer to this as <SHARED_KEY> in the configuration steps below.

  2. Obtain the <emr-script-download-url> that is unique to your account. The EMR cluster uses it to obtain additional scripts and setup from PrivaceraCloud:

    1. Open Settings > API Key.

    2. Use an existing active API Key or create a new one. Set Expiry = Never Expires.

    3. Open the API Key Info box (click the (i) in the key row).

    4. Copy and store as <emr-script-download-url> using the Copy Url link found under AWS EMR Setup Script.
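
If you do not already have a string to use as <SHARED_KEY>, one option is to generate a random one. The sketch below uses Python's standard secrets module. The function name generate_shared_key is hypothetical, and whether PrivaceraCloud restricts the length or character set of the shared key is not stated here, so treat the default size as an assumption.

```python
import secrets


def generate_shared_key(num_bytes=32):
    """Return a random URL-safe text string to use as <SHARED_KEY>."""
    # token_urlsafe draws from the operating system's secure random source.
    return secrets.token_urlsafe(num_bytes)


shared_key = generate_shared_key()
# Printed only so the value can be copied. Store it somewhere safe.
print(shared_key)
```

Use the same generated value everywhere <SHARED_KEY> appears in the configuration steps.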

PrivaceraCloud configuration steps
  1. In the PrivaceraCloud console, go to Settings > Applications, select the existing AWS Data Server application (S3 or Athena), and click the edit (pen) icon.

  2. In the ADVANCED tab, add the following property:

    dataserver.shared.secret=<SHARED_KEY>
  3. Click Save.

AWS configuration steps
  1. Create an EMR Security Configuration for Kerberos Authentication:

    1. Open your AWS EMR web console.

    2. Click Security Configurations, then Create.

    3. Provide a name for this Security Configuration such as PRIVACERA_KDC. We'll refer to this same Security Configuration later.

    4. Under Authentication, select Enable Kerberos authentication and complete the fields as appropriate for your environment.

  2. Create a new EMR cluster and assign to it the new Security Configuration.

    1. In the AWS EMR Console, create a new cluster.

    2. In Advanced Options, click Go to advanced options.

    3. In the Software Configuration, select the appropriate EMR release and any associated applications.

    4. In Edit Software Settings, select Enter configuration, and add the following properties:

      [
        {
          "classification": "spark-defaults",
          "properties": {
            "spark.driver.extraJavaOptions": "-javaagent:/usr/lib/spark/jars/privacera-signing-agent.jar",
            "spark.executor.extraJavaOptions": "-javaagent:/usr/lib/spark/jars/privacera-signing-agent.jar"
          }
        }
      ]
    5. In Steps, select Custom Jar and click Add Step.

      Add code to download and install the Privacera Credential Token Service. Complete the fields as shown below, substituting your <emr-script-download-url> value in the wget command. Click Add when all fields are complete.

      • Name: Install Privacera CTS

      • JAR location: command-runner.jar

      • Arguments: bash -c "wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh priv-cts"

      • Action on failure: Continue

      Click Next.

  3. Configure hardware by selecting Networking, Node, and Instance values as appropriate for your environment.

  4. Configure general cluster settings by adding two scripts that install the Privacera Signing Agent on master and worker nodes.

    1. Assign Cluster name, Logging, Debugging, and Termination protection as appropriate for your environment.

    2. Install the Master signing agent:

      1. Go to Additional Options > Bootstrap Actions and select bootstrap action "Run if" and click Configure and add to open the Add Bootstrap Action dialog.

      2. In this dialog, set the name to Privacera Signing Agent for Master, copy the following script into Optional Arguments, and click Add when done. Replace <emr-script-download-url> with your own value.

        instance.isMaster=true "wget <emr-script-download-url>; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh spark-fbac"
      3. Install the Worker signing agent in the same way. Under Additional Options, expand Bootstrap Actions, select bootstrap action "Run if", and click Configure and add to open the Add Bootstrap Action dialog. In this dialog, set the name to Privacera Signing Agent for Worker and copy the following script into Optional Arguments. Replace <emr-script-download-url> with your own value.

        instance.isMaster=false "wget <emr-script-download-url>; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh spark-fbac"
  5. Configure security options.

    1. Complete Security Options as appropriate for your environment.

    2. Open Security Configuration, and select the configuration you created earlier, for example "PRIVACERA_KDC". Then, in the following fields, enter values:

      • Realm

      • KDC admin password

  6. Click Create cluster to complete.
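
Hand-edited JSON for the Enter configuration field is easy to break, for example with a trailing comma, which strict JSON parsers reject. As a minimal sketch, the spark-defaults configuration from the steps above can be generated and verified with Python's standard json module. The variable names are illustrative only; the classification, property names, and agent path are the ones given in this topic.

```python
import json

# Java agent option used for both the Spark driver and the executors.
AGENT_OPTION = "-javaagent:/usr/lib/spark/jars/privacera-signing-agent.jar"

configuration = [
    {
        "classification": "spark-defaults",
        "properties": {
            "spark.driver.extraJavaOptions": AGENT_OPTION,
            "spark.executor.extraJavaOptions": AGENT_OPTION,
        },
    }
]

# json.dumps always emits strictly valid JSON.
config_text = json.dumps(configuration, indent=2)

# Parsing the text back confirms it is well formed before it is pasted.
assert json.loads(config_text) == configuration
print(config_text)
```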

EMRFS S3

This topic describes how to connect an EMRFS S3 application to PrivaceraCloud. You only need to enable Access Management to start controlling access on EMRFS S3.

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select EMRFS S3.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Access Management for EMRFS S3.

    The following message is displayed: "Save the setting to start controlling access on EMRFS S3."

  5. Click Save.

Files

This topic describes how to connect Files to PrivaceraCloud. You only need to enable Access Management to start controlling access on Files.

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select Files.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Access Management for Files.

    The following message is displayed: "Save the setting to start controlling access on Files."

  5. Click Save.

File Explorer for Google Cloud Storage

This topic describes how to connect Google Cloud Storage (GCS) to PrivaceraCloud. You only need to enable Access Management to control access to data on GCS and enable the File Explorer.

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select GCS.

  3. Enter the application Name and Description, and then click Save.

  4. On the BASIC tab, enter the following JSON for the Google Cloud Storage Account Credential.

     {
      "type": "service_account",
      "project_id": "MyProjectID",
      "private_key_id": "c97****b5",
      "private_key": "-----BEGIN PRIVATE KEY-----\nMII***r\nJA4RFEHkNOwuQ****FM\n-----END PRIVATE KEY-----\n",
      "client_email": "abc@developer.gserviceaccount.com",
      "client_id": "1**8372",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://oauth2.googleapis.com/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/5**7-compute%40developer.gserviceaccount.com"
    }
            
  5. To validate the credentials, click Test Connection.

  6. Click Save.
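
Before pasting a service account key into the BASIC tab, you may want to confirm that it is well-formed JSON and contains the keys shown in the sample above. The following is a minimal sketch using only the Python standard library. The helper name check_credential is hypothetical, and the check does not contact Google or verify that the key is valid; use Test Connection for that.

```python
import json

# Keys that appear in the sample credential shown in this topic.
EXPECTED_KEYS = {
    "type", "project_id", "private_key_id", "private_key", "client_email",
    "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
}


def check_credential(text):
    """Return a list of problems found in the credential text (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing key: {key}" for key in sorted(EXPECTED_KEYS - set(data))]
    if data.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    return problems


# Deliberately incomplete sample, to show how problems are reported.
sample = json.dumps({"type": "service_account", "project_id": "MyProjectID"})
for problem in check_credential(sample):
    print(problem)
```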

Using File Explorer with GCS

Go to Data Inventory > File Explorer and select your GCS data.

Glue

This topic describes how to connect the Glue application to PrivaceraCloud. You only need to enable Access Management to start controlling access on Glue.

Prerequisites

Connect the S3 application to PrivaceraCloud before connecting the Glue application.

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select Glue.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Access Management for Glue.

    The following message is displayed: "Save the setting to start controlling access on Glue."

  5. Click Save.

Enable Privacera Access Management
  1. Click the toggle button to enable Privacera Access Management for your application.

  2. On the BASIC tab, enter values in the following fields.

    • With Use IAM Role disabled:

      1. AWS Access Key: AWS data repository host account Access Key.

      2. AWS Secret Key: AWS data repository host account Secret Key.

      3. AWS Region: AWS S3 bucket region.

    • With Use IAM Role enabled:

      1. AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.

      2. AWS IAM Role External Id: For additional security, an external ID can be attached to your configured IAM role. This ensures that PrivaceraCloud can assume your IAM role only when the configured external ID is passed.

        Note

        The external ID is stored encrypted. It is never displayed in the UI or otherwise made visible.

      3. AWS Region: AWS S3 bucket region.

  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  5. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

  6. Recommended: Install the AWS CLI.

    Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.

  7. Recommended: Validate connectivity by running an AWS CLI command for Glue, such as:

    aws glue get-catalog-import-status
Google BigQuery

This topic describes how to connect a Google BigQuery application to PrivaceraCloud.

Connect Application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Google BigQuery.

  3. Enter the application Name and Description, and then click SAVE.

  4. Click the toggle button to enable Access Management for Google BigQuery.

  5. In the BASIC tab, enter the values in the required (*) fields and click SAVE.

  6. In the ADVANCED tab, you can add custom properties.

    Caution

    Advanced properties should be modified in consultation with Privacera.

  7. Click the IMPORT PROPERTIES link to browse and import application properties.

Connector Properties

Basic fields

Table 8. Basic fields

Field name

Type

Default

Required

Description

BigQuery project location

string

us

Yes

Specifies the geographical region where the taxonomy for PolicySync should be created.

BigQuery project id

string

Yes

Specifies the Google project ID where your Google BigQuery data source resides. For example: privacera-demo-project.

Service account email

string

Yes

Specifies the service account email address that PolicySync uses. You must specify this value if you are not using a Google Cloud Platform (GCP) virtual machine attached service account.

BigQuery private key content

string

No

Specifies the Google Cloud Platform (GCP) account credential key JSON content. PolicySync uses this data to connect to Google BigQuery.

Projects to set access control policies

string

Yes

Specifies a comma-separated list of project names for which PolicySync manages access control. If unset, PolicySync manages all projects. You can use wildcards. Names are case-sensitive.

The list of projects to ignore takes precedence over any projects specified by this setting.

An example list of projects might resemble the following: testproject1,testproject2,sales_project*.

Native public group identity name

string

Yes

Specifies the native public group that PolicySync uses for access grants whenever a policy refers to the public group. The following values are allowed:

  • ALL_AUTHENTICATED_USERS: All GCP project authenticated users.

  • ALL_USERS: All Google authenticated users.

Enable audit

boolean

false

Yes

Specifies whether Privacera fetches access audit data from the data source.



Advanced fields

Table 9. Advanced fields

Field name

Type

Default

Required

Description

Create custom iam roles in gcp

boolean

true

No

Specifies whether PolicySync automatically creates custom IAM roles in your Google Cloud Platform project or organization for fine-grained access control (FGAC). If set to false, you must create all required custom IAM roles manually in your GCP project or organization. The default value is true.

GCP custom iam roles scope

string

project

No

Specifies whether PolicySync creates and uses custom IAM roles at the project or organizational level in Google Cloud Platform (GCP). The following values are allowed:

  • project: Create and use custom IAM roles from each individual project level.

  • org: Create and use custom IAM roles at the organizational level.

GCP organization id

string

No

Specifies the Google Cloud Platform (GCP) organizational ID. Specify this only if you configured PolicySync to use custom IAM roles at the organizational level.

Datasets to set access control policies

string

Yes

Specifies a comma-separated list of datasets for which PolicySync manages access control. You can use wildcards in the value. Names are case-sensitive. If you want to manage all datasets, do not set a value. For example:

testproject1.dataset1,testproject2.dataset2,sales_project*.sales*

You can configure the postfix by specifying Secure view dataset name postfix.

If specified, the Datasets to ignore while setting access control policies setting takes precedence over this setting.

Tables to set access control policies

string

No

Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards.

Use the following format when specifying a table:

<PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME>

If specified, Tables to ignore while setting access control policies takes precedence over this setting.

If you specify a wildcard, such as in the following example, all matched tables are managed:

<PROJECT_NAME>.<DATASET_NAME>.*

The specified value, if any, is interpreted in the following ways:

  • If unset, access control is managed for all datasets.

  • If set to none, no datasets are managed.

Projects to ignore while setting access control policies

string

No

Specifies a comma-separated list of project names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all projects are subject to access control.

For example: testproject1,testproject2,sales_project*.

This setting supersedes any values specified by Projects to set access control policies.

Datasets to ignore while setting access control policies

string

No

Specifies a comma-separated list of dataset names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all datasets are subject to access control.

For example: testproject1.dataset1,testproject2.dataset2,sales_project*.sales*.

This setting supersedes any values specified by Datasets to set access control policies.

Tables to ignore while setting access control policies

string

No

Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all tables are subject to access control. Specify tables using the following format:

<PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME>

This setting supersedes any values specified by Tables to set access control policies.

Users to set access control policies

string

No

Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

If not specified, PolicySync manages access control for all users.

If specified, Users to be ignored by access control policies takes precedence over this setting.

An example user list might resemble the following: user1,user2,dev_user*.

Groups to set access control policies

string

No

Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. You can use wildcards. Names are case-sensitive.

An example list of groups might resemble the following: group1,group2,dev_group*.

If specified, Groups to be ignored by access control policies takes precedence over this setting.

Users to be ignored by access control policies

string

No

Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

This setting supersedes any values specified by Users to set access control policies.

Groups to be ignored by access control policies

string

No

Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

This setting supersedes any values specified by Groups to set access control policies.

Set access control policies only on the users from managed groups

boolean

false

No

Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.

Enforce bigquery native row filter

boolean

false

No

Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.

Enforce masking policies using secure views

boolean

true

No

Specifies whether to use secure view based masking. The default value is true.

Enforce row filter policies using secure views

boolean

true

No

Specifies whether to use secure view based row filtering. The default value is true.

While Google BigQuery supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

Create secure view for all tables/views

boolean

true

No

Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

Default masking value for numeric datatype

integer

0

No

Specifies the masking value used for numeric data types.

Default masking value for text/string datatype

string

<MASKED>

No

Specifies the masking value used for text or string data types.

Secure view name prefix

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

Secure view name postfix

string

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

Secure view dataset name prefix

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a dataset named example1 is dev_example1.

Secure view dataset name postfix

string

_secure

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a dataset named example1 is example1_dev.

Enable this for policy enforcements and user/group/role management.

boolean

true

Yes

Specifies whether PolicySync performs grants and revokes for access control and runs create, update, and delete queries for users, groups, and roles. The default value is true.

Enable to use data admin functionality.

boolean

true

No

Enables the data admin feature. With this feature enabled, you create all policies on native tables and views, and the corresponding grants are made on the secure views of those tables and views. These secure views provide row filter and masking capability. If you need to grant a permission on the native table or view itself, select that permission plus data admin in the policy. The permission is then granted on both the native table or view and its secure view.

ignore audit for users

string

No

Specifies a comma-separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".

project id used to fetch BigQuery audits

string

No

Specifies the project ID where Google BigQuery stores audit log data.

dataset used to fetch BigQuery audits

string

No

Specifies the name of the dataset where Google BigQuery logs audit data. Privacera uses this data for running audit queries.
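
Several of the settings above pair a list of names to manage with a list of names to ignore, where the ignore list takes precedence, an unset list means that everything is managed, wildcards are allowed, and names are case-sensitive. The following minimal sketch models those rules. It is not PolicySync's implementation: the function names are hypothetical, and using Python's fnmatch-style matching for the wildcards is an assumption (fnmatch also treats ? and [...] as special, which PolicySync may not). The special value none described for tables is not modeled.

```python
from fnmatch import fnmatchcase


def parse_list(value):
    """Split a comma-separated setting into trimmed, non-empty patterns."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_managed(name, include="", ignore=""):
    """Return True if name would be managed under the two lists."""
    if any(fnmatchcase(name, pattern) for pattern in parse_list(ignore)):
        return False  # the ignore list takes precedence
    patterns = parse_list(include)
    if not patterns:
        return True  # an unset include list means everything is managed
    return any(fnmatchcase(name, pattern) for pattern in patterns)


include = "testproject1,testproject2,sales_project*"
print(is_managed("sales_project_eu", include))                      # True
print(is_managed("sales_project_eu", include, "sales_project_eu"))  # False
print(is_managed("Sales_Project_eu", include))                      # False (case-sensitive)
```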



Custom fields

Table 10. Custom fields

Canonical name

Type

Default

Description

use.vm.credentials

boolean

false

Specifies whether PolicySync uses the service account attached to your virtual machine for the credentials to connect to the data source.

custom.iam.roles.name.mapping

string

Specifies a list of mappings between PolicySync custom IAM role names and your custom role names. Use the following format when specifying your custom role names:

<PRIVACERA_DEFAULT_ROLE_NAME_1>:<CUSTOM_ROLE_NAME_1>
<PRIVACERA_DEFAULT_ROLE_NAME_2>:<CUSTOM_ROLE_NAME_2>

The following is a list of the default custom role names:

  • PrivaceraGBQProjectListRole

  • PrivaceraGBQJobListRole

  • PrivaceraGBQJobListAllRole

  • PrivaceraGBQJobCreateRole

  • PrivaceraGBQJobGetRole

  • PrivaceraGBQJobUpdateRole

  • PrivaceraGBQJobDeleteRole

  • PrivaceraGBQDatasetCreateRole

  • PrivaceraGBQDatasetGetMetadataRole

  • PrivaceraGBQDatasetUpdateRole

  • PrivaceraGBQDatasetDeleteRole

  • PrivaceraGBQTableListRole

  • PrivaceraGBQTableCreateRole

  • PrivaceraGBQTableGetMetadataRole

  • PrivaceraGBQTableQueryRole

  • PrivaceraGBQTableExportRole

  • PrivaceraGBQTableUpdateMetadataRole

  • PrivaceraGBQTableUpdateRole

  • PrivaceraGBQTableSetCategoryRole

  • PrivaceraGBQTableDeleteRole

  • PrivaceraGBQTransferUpdateRole

  • PrivaceraGBQTransferGetRole
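The mapping format described for custom.iam.roles.name.mapping can be checked before it is entered. The following is a minimal sketch, assuming one mapping per line as displayed above; the function name, the custom role names, and the shortened set of default names are hypothetical and only illustrate the format.

```python
# Subset of the default custom role names listed above (shortened for brevity).
DEFAULT_ROLES = {
    "PrivaceraGBQProjectListRole",
    "PrivaceraGBQJobListRole",
    "PrivaceraGBQTableQueryRole",
}


def parse_role_mappings(text):
    """Parse '<DEFAULT_ROLE>:<CUSTOM_ROLE>' lines into a dict.

    Raises ValueError for malformed lines or unknown default role names.
    """
    mappings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        default_name, sep, custom_name = line.partition(":")
        if not sep or not custom_name:
            raise ValueError(f"expected <DEFAULT>:<CUSTOM>, got {line!r}")
        if default_name not in DEFAULT_ROLES:
            raise ValueError(f"unknown default role name: {default_name!r}")
        mappings[default_name] = custom_name
    return mappings


# Hypothetical custom role names, used only for illustration.
example = """
PrivaceraGBQProjectListRole:AcmeProjectLister
PrivaceraGBQTableQueryRole:AcmeTableReader
"""
print(parse_role_mappings(example))
```

Validating against the default names catches typos early, since a misspelled default name would otherwise leave the intended role unmapped.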

load.resources

string

load_from_dataset_columns

Specifies how PolicySync loads resources from Google BigQuery. The following values are allowed:

  • load_md: Load resources from Google BigQuery using a top-down approach: PolicySync first loads the project, then the dataset, followed by the tables and their columns.

  • load_from_dataset_columns: Load resources one resource type at a time: PolicySync loads all projects first, then all datasets in all projects, followed by all tables in all datasets and their columns. This mode is recommended because it is faster than the load_md mode.

sync.interval.sec

integer

60

Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

sync.serviceuser.interval.sec

integer

420

Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

sync.servicepolicy.interval.sec

integer

540

Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

audit.interval.sec

integer

30

Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.

user.name.replace.from.regex

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

Specifies a regular expression to apply to a user name. Each matching character is replaced with the value specified by the user.name.replace.to.string setting.

If not specified, no find and replace operation is performed.

user.name.replace.to.string

string

_

Specifies a string to replace the characters matched by the regex specified by the user.name.replace.from.regex setting.

If not specified, no find and replace operation is performed.

group.name.replace.from.regex

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

Specifies a regular expression to apply to a group name. Each matching character is replaced with the value specified by the group.name.replace.to.string setting.

If not specified, no find and replace operation is performed.

group.name.replace.to.string

string

_

Specifies a string to replace the characters matched by the regex specified by the group.name.replace.from.regex setting.

If not specified, no find and replace operation is performed.
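The effect of the name replacement settings can be previewed with a short sketch. It assumes a simplified form of the default character class, written as a Python raw string without the extra escaping shown in the documented value, and the default replacement string _. The helper name is hypothetical, and PolicySync's actual matching may differ.

```python
import re

# Simplified form of the default character class, as a Python raw string.
SPECIAL_CHARS = re.compile(r"[~`$&+:;=?@#|'<>.^*()_%\[\]!\-/\\{}]")


def sanitize_name(name, replacement="_"):
    """Replace every character matched by the regex with the replacement."""
    return SPECIAL_CHARS.sub(replacement, name)


print(sanitize_name("john.doe@example.com"))  # john_doe_example_com
print(sanitize_name("dev-team#1"))            # dev_team_1
```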

column.access.control.type

string

view

Specifies how PolicySync manages column-level access control. The following values are allowed:

  • view: Use view-based column-level access control. Any columns that a user cannot access appear as null in the secure view of the table or the secure view of the native view.

policy.name.separator

string

_

Specifies a string to use as part of the name of native row filter and masking policies.

row.filter.policy.name.template

string

row_filter_item_

Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table data from the ds dataset that resides in the proj project, the row filter policy name might resemble the following:

proj_priv_ds_priv_data_<ROW_FILTER_ITEM_NUMBER>

masking.functions.dataset.name

string

privacera_dataset

Specifies the name of the dataset where PolicySync creates custom masking functions.

secure.view.name.remove.suffix.list

string

Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes.

secure.view.dataset.name.remove.suffix.list

string

Specifies a suffix to remove from a secure view dataset name. For example, if the dataset is named some_name_ds you can remove the _ds string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes, such as _raw,_qa,_prod.
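The suffix removal described for the two remove.suffix.list settings can be sketched as follows. This is an illustration only: it assumes that the first listed suffix that matches is removed and that at most one suffix is stripped. The function name is hypothetical.

```python
def remove_suffix(name, suffix_list):
    """Strip the first configured suffix that the name ends with, if any."""
    for suffix in (s.strip() for s in suffix_list.split(",")):
        if suffix and name.endswith(suffix):
            return name[: -len(suffix)]
    return name  # no configured suffix matched


print(remove_suffix("some_name_ds", "_ds"))            # some_name
print(remove_suffix("sales_prod", "_raw,_qa,_prod"))   # sales
print(remove_suffix("sales", "_raw,_qa,_prod"))        # sales
```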

authorized.view.acl.updater.interval.sec

integer

10

Specifies the interval, in seconds, at which the authorized view ACLs updater thread updates the permissions in the dataset if any permission updates are pending.

perform.grant.updates.max.retry.attempts

integer

2

Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.

perform.grant.updates.batch

boolean

true

Specifies whether PolicySync applies grants and revokes in batches. If enabled, this behavior improves overall performance of applying permission changes.

audit.log.load.max.interval.minutes

integer

30

Specifies the maximum interval, in minutes, of the time window that SQL queries use to retrieve access audit information. If there are a large number of audit records, narrowing the window interval improves performance.

For example, if the interval is set to 30, SQL queries similar to the following are executed:

SELECT * FROM audits where time_from=00:01 and time_to=00:30;
SELECT * FROM audits where time_from=00:31 and time_to=01:00;
SELECT * FROM audits where time_from=01:01 and time_to=01:30;
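The windowing behavior in the queries above can be sketched as a generator that splits a time range into windows no longer than the configured interval. This is a minimal illustration, not PolicySync's implementation: the function name is hypothetical, and the windows here are half-open (each window starts where the previous one ended), which differs slightly from the inclusive bounds shown in the example queries.

```python
from datetime import datetime, timedelta


def audit_windows(start, end, max_minutes=30):
    """Yield (window_start, window_end) pairs that cover [start, end)."""
    step = timedelta(minutes=max_minutes)
    current = start
    while current < end:
        window_end = min(current + step, end)  # last window may be shorter
        yield current, window_end
        current = window_end


start = datetime(2024, 1, 1, 0, 0)
end = datetime(2024, 1, 1, 1, 30)
for w_start, w_end in audit_windows(start, end):
    print(w_start.strftime("%H:%M"), "->", w_end.strftime("%H:%M"))
```

A smaller max_minutes value produces more queries, each covering fewer audit records.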



Kinesis

This topic describes how to connect a Kinesis application to PrivaceraCloud.

Connecting to an AWS hosted data source requires authentication or a Trust relation with those resources. You will provide this information as one step in the AWS Data resource connection. You will also need to specify your AWS Account Region.

Prerequisites

Connect the S3 application to PrivaceraCloud before connecting the Kinesis application.

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Kinesis.

  3. Enter the application Name and Description, and then click Save.

    You can see Privacera Access Management with the toggle buttons.

Enable Privacera Access Management
  1. Click the toggle button to enable Privacera Access Management for your application.

  2. On the BASIC tab, enter values in the following fields.

    • With Use IAM Role disabled:

      1. AWS Access Key: AWS data repository host account Access Key.

      2. AWS Secret Key: AWS data repository host account Secret Key.

      3. AWS Region: AWS S3 bucket region.

    • With Use IAM Role enabled:

      1. AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.

      2. AWS IAM Role External Id: For additional security, you can attach an external ID to the configured IAM role. This ensures that PrivaceraCloud can assume your IAM role only when the configured external ID is passed.

        Note

        The external ID is stored encrypted. It is never reflected back to the UI or otherwise made visible.

      3. AWS Region: AWS S3 bucket region.

  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  5. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

  6. Recommended: Install the AWS CLI.

    Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.

  7. Recommended: Validate connectivity by running an AWS CLI command for Kinesis, such as:

    aws kinesis list-streams
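
The AWS IAM Role field in step 2 expects a full ARN rather than a bare role name. The following is a rough sketch of a format check, assuming the general IAM role ARN layout arn:<partition>:iam::<account-id>:role/<role-name>. The helper name and the example role name are hypothetical, and the check does not confirm that the role exists or can be assumed.

```python
import re

# arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
ROLE_ARN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+$"
)


def is_full_role_arn(value):
    """Return True if the value has the shape of a full IAM role ARN."""
    return ROLE_ARN.match(value) is not None


# Hypothetical account ID and role name, for illustration only.
print(is_full_role_arn("arn:aws:iam::123456789012:role/privacera-kinesis-access"))
print(is_full_role_arn("privacera-kinesis-access"))
```
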
Lambda

This topic describes how to connect a Lambda application to PrivaceraCloud.

Connecting to an AWS hosted data source requires authentication or a Trust relation with those resources. You will provide this information as one step in the AWS Data resource connection. You will also need to specify your AWS Account Region.

Prerequisites in AWS console

The following prerequisites must be met:

  1. Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy in the AWS Console.

  2. Configure a Trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM Role.

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Lambda.

  3. Enter the application Name and Description, and then click Save.

    You can see Privacera Access Management with the toggle buttons.

Enable Privacera Access Management
  1. Click the toggle button to enable Privacera Access Management for your application.

  2. On the BASIC tab, enter values in the following fields.

    • With Use IAM Role disabled:

      1. AWS Access Key: AWS data repository host account Access Key.

      2. AWS Secret Key: AWS data repository host account Secret Key.

      3. AWS Region: AWS S3 bucket region.

    • With Use IAM Role enabled:

      1. AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.

      2. AWS IAM Role External Id: For additional security, you can attach an external ID to the configured IAM role. This ensures that PrivaceraCloud can assume your IAM role only when the configured external ID is passed.

        Note

        The external ID is stored encrypted. It is never reflected back to the UI or otherwise made visible.

      3. AWS Region: AWS S3 bucket region.

  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  5. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

  6. Recommended: Install the AWS CLI.

    Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.

  7. Recommended: Validate connectivity by running an AWS CLI command for Lambda, such as:

    aws lambda list-functions
Microsoft SQL Server

This topic describes how to connect a Microsoft SQL Server (MSSQL) application to PrivaceraCloud.

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select MS SQL.

  3. Enter the application Name and Description, and then click Save.

    You can see Access Management and Data Discovery with toggle buttons.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see Discovery.

Enable Access Management
  1. Click the toggle button to enable Access Management for MS SQL.

  2. In the BASIC tab, enter values in the given fields and click Save. For property details and descriptions, see the tables below:

    Note

    The other properties are advanced and should be modified only in consultation with Privacera.

    Basic fields

    Table 11. Basic fields

    Field name

    Type

    Default

    Required

    Description

    MSSQL JDBC URL

    string

    Yes

    Specifies the JDBC URL for the Microsoft SQL Server connector.

    Use the following format for the JDBC string:

    jdbc:sqlserver://<JDBC_SQLSERVER_URL_WITH_PORT_NUMBER>
    
    

    MSSQL jdbc username

    string

    Yes

    Specifies the JDBC username to use.

    MSSQL jdbc password

    string

    Yes

    Specifies the JDBC password to use.

    MSSQL master database

    string

    master

    Yes

    Specifies the name of the JDBC master database that PolicySync establishes an initial connection to.

    MSSQL authentication type for the database engine

    string

    SqlPassword

    Yes

    Specifies the authentication type for the database engine. The following types are supported:

    • If the user specified by MSSQL jdbc username is a local user, specify: SqlPassword

    • If the user specified by MSSQL jdbc username is a Microsoft Azure Active Directory user, specify: ActiveDirectoryPassword

    Default password for new mssql user

    string

    Yes

    Specifies the password to use when PolicySync creates new users.

    MSSQL resource owner

    string

    No

    Specifies the role that owns the resources managed by PolicySync.

    • If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.

    • If a value is specified, the owner of the resource will be changed to the specified value.

    The following resource types are supported:

    • Database

    • Schemas

    • Tables

    • Views

    Enable policy enforcements and user/group/role management

    boolean

    true

    Yes

    Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.

    Enable access audits

    boolean

    false

    Yes

    Specifies whether Privacera fetches access audit data from the data source.

    If specified, you must specify a value for the MSSQL Audits storage URL setting.

    MSSQL Audits storage URL

    string

    No

    Specifies the URL for the audit logs provided by the Azure SQL Auditing service. For example: https://test.blob.core.windows.net/sqldbauditlogs/test



    Advanced fields

    Table 12. Advanced fields

    Field name

    Type

    Default

    Required

    Description

    Databases to set access control policies

    string

    No

    Specifies a comma-separated list of database names for which PolicySync manages access control. You can use wildcards. If unset, access control is managed for all databases.

    An example list of databases might resemble the following: testdb1,testdb2,sales_db*.

    If specified, Databases to ignore while setting access control policies takes precedence over this setting.

    Schemas to set access control policies

    string

    No

    Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards.

    Use the following format when specifying a schema:

    <DATABASE_NAME>.<SCHEMA_NAME>

    If specified, Schemas to ignore while setting access control policies takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all schemas are managed:

    <DATABASE_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all schemas.

    • If set to none, no schemas are managed.

    Tables to set access control policies

    string

    No

    Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards.

    Use the following format when specifying a table:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    If specified, ignore.table.list takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all matched tables are managed:

    <DATABASE_NAME>.<SCHEMA_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all tables.

    • If set to none, no tables are managed.

    Databases to ignore while setting access control policies

    string

    No

    Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all databases are subject to access control.

    For example:

    testdb1,testdb2,sales_db*
    
    

    This setting supersedes any values specified by Databases to set access control policies.

    Schemas to ignore while setting access control policies

    string

    No

    Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all schemas are subject to access control.

    For example:

    testdb1.schema1,testdb2.schema2,sales_db*.sales*
    
    

    This setting supersedes any values specified by Schemas to set access control policies.

    Regex to find special characters in user names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a user name. Each matching character is replaced with the value specified by the String to replace with the special characters found in user names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in user names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in group names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a group name. Each matching character is replaced with the value specified by the String to replace with the special characters found in group names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in group names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in role names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a role name. Each matching character is replaced with the value specified by the String to replace with the special characters found in role names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in role names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.

    If not specified, no find and replace operation is performed.

    Persist case sensitivity of user names

    boolean

    false

    No

    Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.

    Persist case sensitivity of group names

    boolean

    false

    No

    Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.

    Persist case sensitivity of role names

    boolean

    false

    No

    Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.

    Manage user from portal

    boolean

    false

    No

    Specifies whether PolicySync maintains user membership in roles in the Microsoft SQL Server data source.

    Manage group from portal

    boolean

    false

    No

    Specifies whether PolicySync creates groups from Privacera in the Microsoft SQL Server data source.

    Manage role from portal

    boolean

    false

    No

    Specifies whether PolicySync creates roles from Privacera in the Microsoft SQL Server data source.

    Users to set access control policies

    string

    No

    Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards.

    If not specified, PolicySync manages access control for all users.

    If specified, Users to be ignored by access control policies takes precedence over this setting.

    An example user list might resemble the following: user1,user2,dev_user*.

    Groups to set access control policies

    string

    No

    Specifies a comma-separated list of group names for which PolicySync manages access control. You can use wildcards. If unset, access control is managed for all groups.

    An example list of groups might resemble the following: group1,group2,dev_group*.

    If specified, Groups be ignored by access control policies takes precedence over this setting.

    Roles to set access control policies

    string

    No

    Specifies a comma-separated list of role names for which PolicySync manages access control. You can use wildcards. If unset, access control is managed for all roles.

    An example list of roles might resemble the following: role1,role2,dev_role*.

    If specified, Roles be ignored by access control policies takes precedence over this setting.

    Users to be ignored by access control policies

    string

    No

    Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all users are subject to access control.

    This setting supersedes any values specified by Users to set access control policies.

    Groups be ignored by access control policies

    string

    No

    Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all groups are subject to access control.

    This setting supersedes any values specified by Groups to set access control policies.

    Roles be ignored by access control policies

    string

    No

    Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all roles are subject to access control.

    This setting supersedes any values specified by Roles to set access control policies.

    Prefix of mssql roles for portal users

    string

    priv_user_

    No

    Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.

    Prefix of postgres roles for portal group

    string

    priv_group_

    No

    Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.

    Prefix of postgres roles for portal role

    string

    priv_role_

    No

    Specifies the prefix that PolicySync uses when creating roles from Privacera in the Microsoft SQL Server data source.

    For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.

    Use mssql native public group for public group access policies

    boolean

    false

    No

    Specifies whether PolicySync uses the Microsoft SQL Server native public group for access grants whenever a policy refers to a public group. The default value is false.

    Set access control policies only on the users from managed groups

    boolean

    false

    No

    Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.

    Set access control policies only on the users/groups from managed roles

    boolean

    false

    No

    Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false.

    Enforce MSSQL native row filter

    boolean

    true

    No

    Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.

    Enforce masking policies using secure views

    boolean

    true

    No

    Specifies whether to use secure view based masking. The default value is true.

    Enforce row filter policies using secure views

    boolean

    false

    No

    Specifies whether to use secure view based row filtering. The default value is false.

    While Microsoft SQL Server supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

    Create secure view for all tables/views

    boolean

    false

    No

    Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

    Default masked value for numeric datatype columns

    integer

    0

    No

    Specifies the default masking value for numeric column types.

    Default masked value for text/varchar datatype columns

    string

    <MASKED>

    No

    Specifies the default masking value for text and string column types.

    Secure view name prefix

    string

    No

    Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

    Secure view name postfix

    string

    _secure

    No

    Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

    Secure view schema name prefix

    string

    No

    Specifies a prefix string to apply to a secure schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view schema name for a schema named example1 is dev_example1.

    Secure view schema name postfix

    string

    No

    Specifies a postfix string to apply to a secure view schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view schema name for a schema named example1 is example1_dev.

    Enable dataadmin

    boolean

    true

    No

    Specifies whether the data admin feature is enabled. With this feature enabled, you can create all policies on native tables and views, and the corresponding grants are made on the secure views of those native tables and views. These secure views support row filtering and masking. To grant a permission on the native table or view itself, select that permission together with data admin in the policy. The permission is then granted on both the native table or view and its secure view.



    Custom fields

    Table 13. Custom fields

    Canonical name

    Type

    Default

    Description

    load.resources

    string

    load_from_database_columns

    Specifies how PolicySync loads resources from Microsoft SQL Server. The following values are allowed:

    • load_md: Load resources from Microsoft SQL Server using a top-down approach: PolicySync first loads the database, then the schemas, followed by the tables and their columns.

    • load_from_database_columns: Load resources one resource type at a time: PolicySync loads all databases first, then all schemas in all databases, followed by all tables in all schemas and their columns. This mode is recommended because it is faster than the load_md mode.

    sync.interval.sec

    integer

    60

    Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

    sync.serviceuser.interval.sec

    integer

    420

    Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

    sync.servicepolicy.interval.sec

    integer

    540

    Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

    audit.interval.sec

    integer

    30

    Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.

    ignore.table.list

    string

    Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Specify tables using the following format:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    This setting supersedes any values specified by Tables to set access control policies.

    user.name.case.conversion

    string

    lower

    Specifies how user name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of user names is set to true.

    group.name.case.conversion

    string

    lower

    Specifies how group name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of group names is set to true.

    role.name.case.conversion

    string

    lower

    Specifies how role name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of role names is set to true.

    user.filter.with.email

    string

    Set this property to true if you only want to manage users who have an email address associated with them in the portal.

    masked.date.value

    string

    null

    Specifies the default masking value for date column types.

    secure.view.name.remove.suffix.list

    string

    Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma separated list of suffixes.

    secure.view.schema.name.remove.suffix.list

    string

    Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma separated list of suffixes.

    perform.grant.updates.max.retry.attempts

    integer

    2

    Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.

    audit.initial.pull.min

    integer

    30

    Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Microsoft SQL Server.

    load.audits

    string

    load

    Specifies the method that PolicySync uses to load access audit information.

    The following values are valid:

    • load: Use SQL queries

    load.users

    string

    load

    Specifies how PolicySync loads users from Microsoft SQL Server. The following values are valid:

    • load

    • load_db

    external.user.as.internal

    boolean

    false

    Specifies whether PolicySync creates local users for external users.

    manage.group.policy.only

    boolean

    false

    Specifies whether access policies apply to only groups. If enabled, any policies that apply to users or roles are ignored.



  3. In the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.
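
The name-handling properties in the table above (the *.name.case.conversion modes and the remove.suffix.list settings) can be illustrated with a short Python sketch. This is only an approximation of the described behavior, not PolicySync code: the function names are hypothetical, and taking the first matching suffix from the list is an assumption.

```python
def convert_case(name: str, mode: str = "lower") -> str:
    """Apply a *.name.case.conversion mode: lower, upper, or none."""
    if mode == "lower":
        return name.lower()
    if mode == "upper":
        return name.upper()
    if mode == "none":
        return name
    raise ValueError(f"unsupported conversion mode: {mode!r}")


def remove_suffix(name: str, suffix_list: str) -> str:
    """Strip one suffix from a name, given a comma-separated suffix list.

    Assumption: the first suffix in the list that matches is removed.
    """
    for suffix in (item.strip() for item in suffix_list.split(",")):
        if suffix and name.endswith(suffix):
            return name[: -len(suffix)]
    return name


print(convert_case("Analyst_Jane", "lower"))            # analyst_jane
print(remove_suffix("example_suffix", "_suffix,_tmp"))  # example
```

A name that matches no suffix in the list is returned unchanged.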

Enable Data Discovery

Click the toggle button to enable Data Discovery for your application.

  1. In the BASIC tab, enter values in the following fields.

    • JDBC URL

    • JDBC Username 

    • JDBC Password

  2. In the ADVANCED tab, you can add custom properties.

  3. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  4. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

Add data source

To add resources using this connection as Discovery targets, see Discovery Scan Topics.

MySQL for Discovery

This topic describes how to connect a MySQL application to the PrivaceraCloud Discovery service.

Prerequisites

Before connecting the MySQL application, make sure you have the following information available:

  • JDBC URL

  • JDBC Username 

  • JDBC Password 

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select MySQL.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Data Discovery for MySQL.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.

  5. In the BASIC tab, enter the values in the following fields:

    • JDBC URL

    • JDBC Username

    • JDBC Password

  6. In the ADVANCED tab, you can add custom properties.

  7. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  8. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
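
The JDBC URL entered in the BASIC tab typically follows the standard MySQL Connector/J form jdbc:mysql://host:port/database. A minimal Python sketch that assembles such a URL; the helper name and the sample host are hypothetical, and your deployment may require additional URL parameters:

```python
def mysql_jdbc_url(host: str, port: int = 3306, database: str = "") -> str:
    """Build a MySQL JDBC URL of the form jdbc:mysql://host:port[/database]."""
    if not host:
        raise ValueError("host is required")
    url = f"jdbc:mysql://{host}:{port}"
    if database:
        url += f"/{database}"
    return url


# Hypothetical host and database name, for illustration only.
print(mysql_jdbc_url("db.example.com", 3306, "sales"))
# jdbc:mysql://db.example.com:3306/sales
```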

Add data source

To add resources using this connection as Discovery targets, see Discovery Scan Topics.

Open Source Spark

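To connect open source Spark, first obtain an account-specific script from your PrivaceraCloud account, and then add a startup step to open source Spark.

Three configurations are available, depending on your requirements. Fine-Grained Access Control (FGAC) and Object-Level Access Control (OLAC) are supported in each configuration:

  • Configure Privacera Plugin on local/virtual machine

  • Configure Privacera Plugin in an Existing Docker File

  • Configure Privacera Plugin using Privacera Scripts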

Obtain installation script

Obtain the account-specific <privacera-plugin-script-download-url>. You run this script and other commands in your Spark command shell to complete the PrivaceraCloud installation.

Steps:

  1. Go to Settings > API Key.

  2. Use an existing active API Key or generate a new one.

    Note

    Make sure the Expiry column is set to Never Expires.

  3. Click the i icon to get the scripts.

  4. On the Plugins Setup Script, click the COPY URL button. Save this value on your Spark server. It is needed as the <privacera-plugin-script-download-url> in the next step.

Configure Privacera Plugin on local/virtual machine
OLAC Setup
  1. OLAC is supported only with JWT token authentication.

    See Data access methods.

  2. Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.

    privacera.jwt.oauth.enable=true
    privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
    privacera.jwt.0.token.subject=<PLEASE_CHANGE>
    privacera.jwt.0.token.secret=<PLEASE_CHANGE>
    privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
    privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.parserType=<PLEASE_CHANGE>
    

    Property

    Description

    Example

    privacera.jwt.oauth.enable

    Property to enable JWT auth in Privacera services.

    true

    privacera.jwt.{index}.token.issuer

    The URL of the identity provider.

    https://your-idp-domain.com

    privacera.jwt.{index}.token.publickey

    The JWT public key as a single-line string (remove all newlines).

    -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----

    privacera.jwt.{index}.token.secret

    [Optional] If the JWT token was encrypted using a secret, use this property to set the secret.

    privacera-api

    privacera.jwt.{index}.token.subject

    [Optional] Add this property if the JWT token has a subject.

    api-token

    privacera.jwt.{index}.token.userKey

    Defines the unique userKey whose value is used as the user in Ranger policies.

    client-id

    privacera.jwt.{index}.token.groupKey

    Defines the unique groupKey whose value is used as the group in Ranger policies.

    scope

    privacera.jwt.{index}.token.parser.type

    JWT parser type. Valid values are PING_IDENTITY and KEYCLOAK.

    PING_IDENTITY: When groupKey is an array

    KEYCLOAK: When groupKey is a space-separated string

    KEYCLOAK

    After adding the properties, run the Dataserver, and then proceed to the next step.

  3. SSH to the instance where Spark is installed and you want to install Privacera Plugin.

  4. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
  5. Create a file privacera_env.sh which will contain the parameters required for your plugin installation.

    vi privacera_env.sh
    

    Add the following properties:

    PLUGIN_TYPE="spark"
    SPARK_PLUGIN_TYPE="OLAC"
    SPARK_HOME="<PLEASE_CHANGE>"
    SPARK_CLUSTER_NAME="privacera-spark"
    

    Property

    Description

    PLUGIN_TYPE

    Type of Privacera Plugin which you want to install.

    SPARK_PLUGIN_TYPE

    Spark Plugin type OLAC. JWT Authentication will be enabled by default.

    SPARK_HOME

    This is the home directory of your Spark installation. For example, the directory path can be /home/user/spark.

    SPARK_CLUSTER_NAME

    Cluster Name which will show up in the Privacera Ranger Audits page.

  6. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh
    

    The script will set up the Privacera Plugin in the OLAC mode.
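
The two parser types listed in the Dataserver property table differ in how the group claim is shaped: an array for PING_IDENTITY and a space-separated string for KEYCLOAK. The following Python sketch illustrates that difference only. It assumes the token has already been verified and decoded into a dictionary; the function name and the sample claims are hypothetical and are not part of the Privacera plugin.

```python
def groups_from_claims(claims: dict, group_key: str, parser_type: str) -> list:
    """Extract group names from decoded JWT claims for a given parser type."""
    value = claims.get(group_key)
    if value is None:
        return []
    if parser_type == "PING_IDENTITY":
        # The groupKey claim holds an array of group names.
        return list(value)
    if parser_type == "KEYCLOAK":
        # The groupKey claim holds one space-separated string.
        return value.split()
    raise ValueError(f"unsupported parser type: {parser_type!r}")


# Hypothetical decoded claims, for illustration only.
keycloak_claims = {"client-id": "svc-spark", "scope": "analysts engineers"}
ping_claims = {"client-id": "svc-spark", "scope": ["analysts", "engineers"]}

print(groups_from_claims(keycloak_claims, "scope", "KEYCLOAK"))
print(groups_from_claims(ping_claims, "scope", "PING_IDENTITY"))
```

Both calls yield the same two group names, which is the point of choosing the parser type that matches your identity provider.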

FGAC Setup
  1. It is recommended to use FGAC with JWT authentication enabled.

    Note

    If JWT authentication is disabled, access control is applied to the system user or proxy user.

  2. SSH to the instance where Spark is installed and you want to install Privacera Plugin.

  3. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
    
  4. Create a file privacera_env.sh which will contain the parameters required for your plugin installation.

    vi privacera_env.sh
    

    Add the following properties:

    PLUGIN_TYPE="spark"
    SPARK_PLUGIN_TYPE="FGAC"
    SPARK_HOME="<PLEASE_CHANGE>"
    SPARK_CLUSTER_NAME="privacera-spark"
    

    Property

    Description

    PLUGIN_TYPE

    Type of Privacera Plugin which you want to install.

    SPARK_PLUGIN_TYPE

    Spark Plugin type FGAC.

    SPARK_HOME

    This is the home directory of your Spark installation. For example, the directory path can be /home/user/spark.

    SPARK_CLUSTER_NAME

    Cluster Name which will show up in the Privacera Ranger Audits page.

    Add the following properties when JWT auth is enabled:

    JWT_OAUTH_ENABLE="true"
    JWT_ISSUER="<PLEASE_CHANGE>"
    JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
    #JWT_SECRET="<PLEASE_CHANGE>"
    #JWT_SUBJECT="<PLEASE_CHANGE>"
    JWT_USERKEY="<PLEASE_CHANGE>"
    JWT_GROUPKEY="<PLEASE_CHANGE>"
    JWT_PARSER_TYPE="<PLEASE_CHANGE>"
    

    Note

    To configure multiple JWTs, refer to FGAC with multiple JWT configurations below.

    Property

    Description

    Example

    JWT_OAUTH_ENABLE

    To enable JWT authentication.

    JWT_OAUTH_ENABLE="true"

    JWT_ISSUER

    The URL of the identity provider.

    JWT_ISSUER="https://your-idp-domain.com"

    JWT_PUBLIC_KEY

    The JWT public key in string format.

    JWT_SECRET

    Uncomment and add a value if the JWT token has been encrypted using a secret.

    JWT_SECRET="privacera-secret"

    JWT_SUBJECT

    Uncomment and add a value if the JWT token has a subject.

    JWT_SUBJECT="api-token"

    JWT_USERKEY

    Defines the unique userKey whose value is used as the user in Ranger policies.

    JWT_USERKEY="client_id"

    JWT_GROUPKEY

    Defines the unique groupKey whose value is used as the group in Ranger policies.

    JWT_GROUPKEY="scope"

    JWT_PARSER_TYPE

    JWT parser type. Valid values are PING_IDENTITY and KEYCLOAK.

    JWT_PARSER_TYPE="KEYCLOAK"

  5. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh
    

    The script will set up the Privacera Plugin in the FGAC mode.
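
The JWT block of privacera_env.sh from step 4 can also be generated rather than typed by hand. The Python sketch below is one way to do that under stated assumptions: the function names are hypothetical, and collapsing the PEM public key to a single line is carried over from the Dataserver property description rather than stated for JWT_PUBLIC_KEY. Optional values stay commented out unless supplied.

```python
def flatten_pem(pem: str) -> str:
    """Join a multi-line PEM public key into one line by removing newlines."""
    return "".join(line.strip() for line in pem.splitlines())


def fgac_jwt_env(issuer, public_key_pem, user_key, group_key, parser_type,
                 secret=None, subject=None) -> str:
    """Render the JWT properties for privacera_env.sh as text."""

    def optional(name, value):
        # Optional settings stay commented out unless a value is supplied.
        if value is None:
            return f'#{name}="<PLEASE_CHANGE>"'
        return f'{name}="{value}"'

    lines = [
        'JWT_OAUTH_ENABLE="true"',
        f'JWT_ISSUER="{issuer}"',
        f'JWT_PUBLIC_KEY="{flatten_pem(public_key_pem)}"',
        optional("JWT_SECRET", secret),
        optional("JWT_SUBJECT", subject),
        f'JWT_USERKEY="{user_key}"',
        f'JWT_GROUPKEY="{group_key}"',
        f'JWT_PARSER_TYPE="{parser_type}"',
    ]
    return "\n".join(lines) + "\n"


# Sample values taken from the example column; the key is a truncated stand-in.
pem = "-----BEGIN PUBLIC KEY-----\nMIIBIjANB\n-----END PUBLIC KEY-----\n"
print(fgac_jwt_env("https://your-idp-domain.com", pem,
                   "client_id", "scope", "KEYCLOAK"))
```

The printed text can be appended to privacera_env.sh before running the script.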

FGAC with multiple JWT configurations

To configure multiple JWTs, add the following index-based properties to the privacera_env.sh file, where {index} runs from 0 to n.

JWT_OAUTH_ENABLE="true"

JWT_{index}_ISSUER="<PLEASE_CHANGE>"
JWT_{index}_PUBLICKEY="<PLEASE_CHANGE>"
JWT_{index}_SUBJECT="<PLEASE_CHANGE>"
JWT_{index}_SECRET="<PLEASE_CHANGE>"
JWT_{index}_USERKEY="<PLEASE_CHANGE>"
JWT_{index}_GROUPKEY="<PLEASE_CHANGE>"
JWT_{index}_PARSER_TYPE="<PLEASE_CHANGE>"

For example, for two configurations (the index starts at 0):

JWT_OAUTH_ENABLE="true"

JWT_0_ISSUER="https://mydomain.com/issuer"
JWT_0_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
JWT_0_SUBJECT="principal1"
JWT_0_SECRET="shkl-XXXX-XXXX-XXXX"
JWT_0_USERKEY="client_id"
JWT_0_GROUPKEY="scope"
JWT_0_PARSER_TYPE="PING_IDENTITY"

JWT_1_ISSUER="https://mydomain.com/issuer"
JWT_1_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
JWT_1_SUBJECT="principal2"
JWT_1_SECRET="suhjk-XXXX-XXXX-XXXX"
JWT_1_USERKEY="client_id"
JWT_1_GROUPKEY="scope"
JWT_1_PARSER_TYPE="KEYCLOAK"
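
Because the index runs from 0 to n, a quick check that the indexes in privacera_env.sh start at 0 and have no gaps can catch copy-and-paste mistakes. The Python sketch below is a hypothetical helper, not part of the Privacera script; whether the plugin itself rejects gaps is an assumption.

```python
import re

# Matches index-based keys such as JWT_0_ISSUER= or JWT_1_PARSER_TYPE=
JWT_KEY = re.compile(r"^JWT_(\d+)_([A-Z_]+)=")


def jwt_indices(env_text: str) -> list:
    """Return the sorted, distinct JWT indexes used in the env file text."""
    indices = set()
    for line in env_text.splitlines():
        match = JWT_KEY.match(line.strip())
        if match:
            indices.add(int(match.group(1)))
    return sorted(indices)


def indices_are_contiguous(indices: list) -> bool:
    """True when the indexes are exactly 0, 1, ..., n with no gaps."""
    return indices == list(range(len(indices)))


sample = '''JWT_OAUTH_ENABLE="true"
JWT_0_ISSUER="https://mydomain.com/issuer"
JWT_0_USERKEY="client_id"
JWT_1_ISSUER="https://mydomain.com/issuer"
JWT_1_USERKEY="client_id"
'''
print(jwt_indices(sample), indices_are_contiguous(jwt_indices(sample)))
```
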
Configure Privacera Plugin in an Existing Docker File

If you have an existing open source Spark setup running on Kubernetes, you can update the Docker file that builds your Spark image to add the steps for installing the Privacera Plugin.

OLAC Setup
  1. OLAC is supported only with JWT token authentication.

    Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist.

    See Data access methods.

  2. Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.

    privacera.jwt.oauth.enable=true
    privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
    privacera.jwt.0.token.subject=<PLEASE_CHANGE>
    privacera.jwt.0.token.secret=<PLEASE_CHANGE>
    privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
    privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.parserType=<PLEASE_CHANGE>
    

    Property

    Description

    Example

    privacera.jwt.oauth.enable

    Property to enable JWT auth in Privacera services.

    true

    privacera.jwt.{index}.token.issuer

    The URL of the identity provider.

    https://your-idp-domain.com

    privacera.jwt.{index}.token.publickey

    The JWT public key as a single-line string (remove all newlines).

    -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----

    privacera.jwt.{index}.token.secret

    [Optional] If the JWT token was encrypted using a secret, use this property to set the secret.

    privacera-api

    privacera.jwt.{index}.token.subject

    [Optional] Add this property if the JWT token has a subject.

    api-token

    privacera.jwt.{index}.token.userKey

    Defines the unique userKey whose value is used as the user in Ranger policies.

    client-id

    privacera.jwt.{index}.token.groupKey

    Defines the unique groupKey whose value is used as the group in Ranger policies.

    scope

    privacera.jwt.{index}.token.parser.type

    JWT parser type. Valid values are PING_IDENTITY and KEYCLOAK.

    PING_IDENTITY: When groupKey is an array

    KEYCLOAK: When groupKey is a space-separated string

    KEYCLOAK

    After adding the properties, run the Dataserver, and then proceed to the next step.

  3. SSH to the instance where Spark is installed and you want to install Privacera Plugin.

  4. Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property.

    ######## Install Privacera Spark Plugin Start ###########
    
    # ENV SPARK_HOME /opt/apache/spark
    RUN apt-get update && apt-get -y install zip unzip wget
    ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
    ENV PLUGIN_TYPE="spark"
    ENV SPARK_PLUGIN_TYPE="OLAC"
    ENV SPARK_CLUSTER_NAME="privacera-spark"
    RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
    RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
    RUN chmod +x privacera_plugin.sh
    RUN ./privacera_plugin.sh
    
    ######## Install Privacera Spark Plugin End ###########
    
  5. Save the Docker file and build the image. You now have a Docker image for open source Spark with the Privacera Plugin enabled.
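
Before building the image, it can help to confirm that no <PLEASE_CHANGE> placeholder is left in the Docker file. The Python sketch below is a hypothetical pre-build check, not a Privacera tool; it assumes lines starting with # are comments and skips them.

```python
PLACEHOLDER = "<PLEASE_CHANGE>"


def unresolved_placeholders(text: str) -> list:
    """Return (line_number, line) pairs that still contain the placeholder."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PLACEHOLDER in line and not line.lstrip().startswith("#")
    ]


dockerfile = '''# ENV SPARK_HOME /opt/apache/spark
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="OLAC"
'''
for number, line in unresolved_placeholders(dockerfile):
    print(f"line {number}: {line}")
```

An empty result means every placeholder in the checked text has been replaced.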

FGAC Setup
  1. It is recommended to use FGAC with JWT authentication enabled.

    Note

    If JWT authentication is disabled, access control is applied to the system user or proxy user.

  2. SSH to the instance where Spark is installed and you want to install Privacera Plugin.

  3. Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property. For the JWT properties, refer to the table below.

    ######## Install Privacera Spark Plugin Start ###########
    
    # ENV SPARK_HOME /opt/apache/spark
    RUN apt-get update && apt-get -y install zip unzip wget
    ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
    ENV PLUGIN_TYPE="spark"
    ENV SPARK_PLUGIN_TYPE="FGAC"
    ENV SPARK_CLUSTER_NAME="privacera-spark"
    ENV JWT_OAUTH_ENABLE="true"
    ENV JWT_ISSUER="<PLEASE_CHANGE>"
    ENV JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
    ENV JWT_SECRET="<PLEASE_CHANGE>"
    ENV JWT_SUBJECT="<PLEASE_CHANGE>"
    ENV JWT_USERKEY="<PLEASE_CHANGE>"
    ENV JWT_GROUPKEY="<PLEASE_CHANGE>"
    ENV JWT_PARSER_TYPE="<PLEASE_CHANGE>"
    RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
    RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
    RUN chmod +x privacera_plugin.sh
    RUN ./privacera_plugin.sh
    
    ######## Install Privacera Spark Plugin End ###########
    

    Note

    To configure multiple JWTs, refer to FGAC with Multiple JWT Configuration in an Existing Docker File below.

    Property

    Description

    Example

    JWT_OAUTH_ENABLE

    To enable JWT authentication.

    JWT_OAUTH_ENABLE="true"

    JWT_ISSUER

    The URL of the identity provider.

    JWT_ISSUER="https://your-idp-domain.com"

    JWT_PUBLIC_KEY

    The JWT public key in string format.

    JWT_SECRET

    Set a value if the JWT token has been encrypted using a secret.

    JWT_SECRET="privacera-secret"

    JWT_SUBJECT

    Set a value if the JWT token has a subject.

    JWT_SUBJECT="api-token"

    JWT_USERKEY

    Defines the unique userKey whose value is used as the user in Ranger policies.

    JWT_USERKEY="client_id"

    JWT_GROUPKEY

    Defines the unique groupKey whose value is used as the group in Ranger policies.

    JWT_GROUPKEY="scope"

    JWT_PARSER_TYPE

    JWT parser type. Valid values are PING_IDENTITY and KEYCLOAK.

    JWT_PARSER_TYPE="KEYCLOAK"

  4. Save the Docker file and build the image. You now have a Docker image for open source Spark with the Privacera Plugin enabled.

FGAC with Multiple JWT Configuration in an Existing Docker File

To configure multiple JWTs, add the following index-based environment variables to the Docker file, where {index} runs from 0 to n.

ENV JWT_OAUTH_ENABLE="true"
ENV JWT_{index}_ISSUER="<PLEASE_CHANGE>"
ENV JWT_{index}_PUBLICKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_SUBJECT="<PLEASE_CHANGE>"
ENV JWT_{index}_SECRET="<PLEASE_CHANGE>"
ENV JWT_{index}_USERKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_GROUPKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_PARSER_TYPE="<PLEASE_CHANGE>"

For example, for two configurations (the index starts at 0):

######## Install Privacera Spark Plugin Start ############
ENV SPARK_HOME=/opt/apache/spark
RUN apt-get update && apt-get -y install zip unzip wget
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="FGAC"
ENV SPARK_CLUSTER_NAME="privacera-spark"

ENV JWT_OAUTH_ENABLE="true"
ENV JWT_0_ISSUER="https://mydomain.com/issuer"
ENV JWT_0_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
ENV JWT_0_SUBJECT="principal1"
ENV JWT_0_SECRET="shkl-XXXX-XXXX-XXXX"
ENV JWT_0_USERKEY="client_id"
ENV JWT_0_GROUPKEY="scope"
ENV JWT_0_PARSER_TYPE="PING_IDENTITY"

ENV JWT_1_ISSUER="https://mydomain.com/issuer"
ENV JWT_1_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
ENV JWT_1_SUBJECT="principal2"
ENV JWT_1_SECRET="suhjk-XXXX-XXXX-XXXX"
ENV JWT_1_USERKEY="client_id"
ENV JWT_1_GROUPKEY="scope"
ENV JWT_1_PARSER_TYPE="KEYCLOAK"
Configure Privacera Plugin using Privacera Scripts

The scripts help you create an open source Spark image with the Privacera Plugin and push it to the specified Docker hub. You can then use this image to run Spark with Privacera.

OLAC Setup
  1. OLAC is supported only with JWT token authentication.

    Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist.

    See Data access methods.

  2. Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.

    privacera.jwt.oauth.enable=true
    privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
    privacera.jwt.0.token.subject=<PLEASE_CHANGE>
    privacera.jwt.0.token.secret=<PLEASE_CHANGE>
    privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
    privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.parserType=<PLEASE_CHANGE>
    
    

    Property

    Description

    Example

    privacera.jwt.oauth.enable

    Property to enable JWT auth in Privacera services.

    true

    privacera.jwt.{index}.token.issuer

    The URL of the identity provider.

    https://your-idp-domain.com

    privacera.jwt.{index}.token.publickey

    The JWT public key as a single-line string (remove all newlines).

    -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----

    privacera.jwt.{index}.token.secret

    [Optional] If the JWT token was encrypted using a secret, use this property to set the secret.

    privacera-api

    privacera.jwt.{index}.token.subject

    [Optional] Add this property if the JWT token has a subject.

    api-token

    privacera.jwt.{index}.token.userKey

    Defines the unique userKey whose value is used as the user in Ranger policies.

    client-id

    privacera.jwt.{index}.token.groupKey

    Defines the unique groupKey whose value is used as the group in Ranger policies.

    scope

    privacera.jwt.{index}.token.parser.type

    JWT parser type. Valid values are PING_IDENTITY and KEYCLOAK.

    PING_IDENTITY: When groupKey is an array

    KEYCLOAK: When groupKey is a space-separated string

    KEYCLOAK

    After adding the properties, run the Dataserver, and then proceed to the next step.

  3. SSH to the instance where you want to install Privacera Plugin.

  4. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
    
  5. Create a file privacera_env.sh which will contain the parameters required for your plugin installation.

    vi privacera_env.sh
    

    Add the following properties:

    PLUGIN_TYPE="spark_k8s"
    SPARK_PLUGIN_TYPE="OLAC"
    HUB="<PLEASE_CHANGE>"
    HUB_USERNAME="<PLEASE_CHANGE>"
    HUB_PASSWORD="<PLEASE_CHANGE>"
    ENV_TAG="<PLEASE_CHANGE>"
    

    Property

    Description

    PLUGIN_TYPE

    Type of Privacera Plugin which you want to install.

    SPARK_PLUGIN_TYPE

    Spark Plugin type OLAC. JWT Authentication will be enabled by default.

    HUB

    The Docker hub URL where you want the image to be pushed.

    HUB_USERNAME

    Docker hub username.

    HUB_PASSWORD

    Docker hub password.

    ENV_TAG

    Docker image tag.

  6. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh
    

    The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.

FGAC Setup
  1. It is recommended to use FGAC with JWT authentication enabled.

    Note

    If JWT authentication is disabled, access control is applied to the system user or proxy user.

  2. SSH to the instance where you want to install Privacera Plugin.

  3. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
    
  4. Create a file privacera_env.sh which will contain the parameters required for your plugin installation.

    vi privacera_env.sh
    

    Add the following properties:

    PLUGIN_TYPE="spark_k8s"
    SPARK_PLUGIN_TYPE="FGAC"
    SPARK_HOME="<PLEASE_CHANGE>"
    SPARK_CLUSTER_NAME="privacera-spark"
    

    Property

    Description

    PLUGIN_TYPE

    Type of Privacera Plugin which you want to install.

    SPARK_PLUGIN_TYPE

    Spark Plugin type FGAC.

    SPARK_HOME

    This is the home directory of your Spark installation. For example, the directory path can be /home/user/spark.

    SPARK_CLUSTER_NAME

    Cluster Name which will show up in the Privacera Ranger Audits page.

    Add the following properties when JWT auth is enabled:

    JWT_OAUTH_ENABLE="true"
    JWT_ISSUER="<PLEASE_CHANGE>"
    JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
    #JWT_SECRET="<PLEASE_CHANGE>"
    #JWT_SUBJECT="<PLEASE_CHANGE>"
    JWT_USERKEY="<PLEASE_CHANGE>"
    JWT_GROUPKEY="<PLEASE_CHANGE>"
    JWT_PARSER_TYPE="<PLEASE_CHANGE>"
    

    Property

    Description

    Example

    JWT_OAUTH_ENABLE

    To enable JWT authentication.

    JWT_OAUTH_ENABLE="true"

    JWT_ISSUER

    The URL of the identity provider.

    JWT_ISSUER="https://your-idp-domain.com"

    JWT_PUBLIC_KEY

    The JWT public key in string format.

    JWT_SECRET

    Uncomment and add a value if the JWT token has been encrypted using a secret.

    JWT_SECRET="privacera-secret"

    JWT_SUBJECT

    Uncomment and add a value if the JWT token has a subject.

    JWT_SUBJECT="api-token"

    JWT_USERKEY

    Defines the unique userKey whose value is used as the user in Ranger policies.

    JWT_USERKEY="client_id"

    JWT_GROUPKEY

    Defines the unique groupKey whose value is used as the group in Ranger policies.

    JWT_GROUPKEY="scope"

    JWT_PARSER_TYPE

    JWT parser type. Valid values are PING_IDENTITY and KEYCLOAK.

    JWT_PARSER_TYPE="KEYCLOAK"

    Add the following Docker Hub properties:

    HUB="<PLEASE_CHANGE>"
    HUB_USERNAME="<PLEASE_CHANGE>"
    HUB_PASSWORD="<PLEASE_CHANGE>"
    ENV_TAG="<PLEASE_CHANGE>"
    

    Property

    Description

    HUB

    The Docker hub URL where you want the image to be pushed.

    HUB_USERNAME

    Docker hub username.

    HUB_PASSWORD

    Docker hub password.

    ENV_TAG

    Docker image tag.

  5. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh
    

    The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.

Deploy Spark on EKS Cluster
  1. SSH to the instance where you want to deploy Spark on the EKS cluster.

  2. Get the Privacera Plugin download URL and set it in the following property. See Obtain installation script.

    export PRIVACERA_DOWNLOAD_URL="<PLEASE_CHANGE>"
    
  3. Create the spark-k8s-artifacts folder.

    mkdir -p ~/privacera/spark-k8s-artifacts
    cd ~/privacera/spark-k8s-artifacts
    
  4. Download and extract packages.

    wget ${PRIVACERA_DOWNLOAD_URL}/plugin/spark/k8s-spark-deploy.tar.gz -O k8s-spark-deploy.tar.gz
    tar xzf k8s-spark-deploy.tar.gz
    rm -r k8s-spark-deploy.tar.gz
    cd k8s-spark-deploy/
    
  5. Open the penv.sh file and set the values of the following properties, as described in the table below:

    Property

    Description

    Example

    SPARK_NAME_SPACE

    Kubernetes namespace

    privacera-spark-plugin-test

    SPARK_PLUGIN_IMAGE

    Docker image with hub

    ${HUB}/privacera-spark-plugin:${ENV_TAG}

    SPARK_DOCKER_PULL_SECRET

    Secret for docker-registry

    spark-plugin-docker-hub

    SPARK_PLUGIN_ROLE_BINDING

    Spark role binding

    privacera-sa-spark-plugin-role-binding

    SPARK_PLUGIN_SERVICE_ACCOUNT

    Spark service account

    privacera-sa-spark-plugin

    SPARK_PLUGIN_ROLE

    Spark service account role

    privacera-sa-spark-plugin-role

    SPARK_PLUGIN_APP_NAME

    Spark plugin application name

    privacera-spark-examples

  6. Run the following commands to back up the EKS deployment YAML files and replace the property values in them.

    mkdir -p backup
    cp *.yml backup/
    ./replace.sh
    
  7. Run the following command to create EKS resources.

    kubectl apply -f namespace.yml 
    kubectl apply -f service-account.yml 
    kubectl apply -f role.yml
    kubectl apply -f role-binding.yml
    
  8. Run the following command to create the docker-registry secret.

    kubectl create secret docker-registry spark-plugin-docker-hub --docker-server=<PLEASE_CHANGE> --docker-username=<PLEASE_CHANGE>  --docker-password='<PLEASE_CHANGE>' --namespace=<PLEASE_CHANGE>
    
  9. Run the following command to deploy a sample Spark application. Replace ${SPARK_NAME_SPACE} with the Kubernetes namespace.

    kubectl apply -f privacera-spark-examples.yml -n ${SPARK_NAME_SPACE}
    

    Note

    This is a sample file used for deployment. As per your use case, you can create a Spark deployment file and deploy a Docker image.

    This deploys a Spark application with the Privacera plugin in an EKS pod and keeps the pod running so that you can use it in interactive mode.
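
The image reference from the penv.sh table and the docker-registry secret command from step 8 can be assembled programmatically, which avoids shell quoting mistakes in passwords. The Python sketch below is illustrative only: the function names and all sample values are hypothetical, and it only builds the command as an argument list without running kubectl.

```python
import shlex


def plugin_image(hub: str, env_tag: str) -> str:
    """Compose the image reference in the form HUB/privacera-spark-plugin:ENV_TAG."""
    return f"{hub}/privacera-spark-plugin:{env_tag}"


def docker_registry_secret_command(secret_name, server, username,
                                   password, namespace) -> list:
    """Build the kubectl argument list for the docker-registry secret."""
    return [
        "kubectl", "create", "secret", "docker-registry", secret_name,
        f"--docker-server={server}",
        f"--docker-username={username}",
        f"--docker-password={password}",
        f"--namespace={namespace}",
    ]


# Hypothetical values, for illustration only.
cmd = docker_registry_secret_command(
    "spark-plugin-docker-hub", "hub.example.com", "deployer",
    "s3cr3t pass", "privacera-spark-plugin-test",
)
print(plugin_image("hub.example.com/team", "1.0"))
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

Passing the list to subprocess.run would execute the command without any shell quoting concerns.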

Oracle for Discovery
 

This topic describes how to connect an Oracle application to the PrivaceraCloud Discovery service.

Prerequisites

Before connecting the Oracle application, make sure you have the following information available:

  • JDBC URL

  • JDBC Username 

  • JDBC Password 

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select Oracle.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Data Discovery for Oracle.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.

  5. In the BASIC tab, enter the values in the following fields:

    • JDBC URL

    • JDBC Username

    • JDBC Password

  6. In the ADVANCED tab, you can add custom properties.

  7. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  8. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

Add data source

To add resources using this connection as Discovery targets, see Privacera Discovery scan targets.

PostgreSQL

This topic describes how to connect a PostgreSQL application to PrivaceraCloud.

Prerequisites
  • Create a database in PostgreSQL and get the database name and its URL.

  • Create a database user granting all privileges to fully access the database, and then get the user credentials to connect to the database.

If you choose to enable audits for PolicySync, ensure the following prerequisites are met:

Connect application
  1. Go to Settings > Applications.

  2. In the Applications screen, select PostgreSQL.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Access Management for PostgreSQL.

  5. In the BASIC tab, enter the values in the given fields and click Save. For property details and description, see table below:

    Note

    The other properties are advanced and should be modified only in consultation with Privacera.

    Basic fields

    Table 14. Basic fields

    Field name

    Type

    Default

    Required

    Description

    Postgres JDBC URL

    string

    Yes

    Specifies the JDBC URL for the PostgreSQL connector.

    Use the following format for the JDBC string:

    jdbc:postgresql://<PG_SERVER_HOST>:<PG_SERVER_PORT>
    
    

    Postgres jdbc username

    string

    Yes

    Specifies the JDBC username to use.

    Postgres jdbc password

    string

    Yes

    Specifies the JDBC password to use.

    Postgres default database

    string

    privacera_db

    Yes

    Specifies the name of the JDBC database to use.

    Default password for new postgres user

    string

    Yes

    Specifies the password to use when PolicySync creates new users.

    Postgres resource owner

    string

    No

    Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user.

    • If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.

    • If a value is specified, the owner of the resource will be changed to the specified value.

    The following resource types are supported:

    • Database

    • Schemas

    • Tables

    • Views

    Databases to set access control policies

    string

    No

    Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of databases might resemble the following: testdb1,testdb2,sales db*.

    If specified, Databases to ignore while setting access control policies takes precedence over this setting.

    Enable policy enforcements and user/group/role management

    boolean

    true

    No

    Specifies whether PolicySync performs grants and revokes for access control and runs create, update, and delete queries for users, groups, and roles. The default value is true.

    Enable access audits

    boolean

    false

    Yes

    Specifies whether Privacera fetches access audit data from the data source.

    Audit source for postgres

    string

    sqs

    No

    Specifies the source for audit information. The following values are supported:

    • sqs

    • gcp_pgaudit

    The default value is: sqs

    AWS access key to connect to sqs queue for access audits

    string

    No

    Specifies the Amazon Web Services (AWS) access key that PolicySync uses to create an IAM client role to access the SQS queue to retrieve access audit information.

    Specify this only if your deployment machine lacks an IAM role with the necessary permissions.

    AWS secret access key to connect to sqs queue for access audits

    string

    No

    Specifies the Amazon Web Services (AWS) secret key that PolicySync uses to create an IAM client role to access the SQS queue to retrieve access audit information.

    Specify this only if your deployment machine lacks an IAM role with the necessary permissions.

    AWS region of sqs queue

    string

    POSTGRES_AUDIT_SQS_QUEUE_REGION

    No

    Specifies the Amazon Web Services (AWS) SQS queue region.

    AWS sqs queue name

    string

    POSTGRES_AUDIT_SQS_QUEUE_NAME

    No

    Specifies the Amazon Web Services (AWS) SQS queue name that PolicySync uses to retrieve access audit information.

    GCP CloudSQL postgres instance id

    string

    No

    Specifies the Google Cloud Platform SQL instance ID for the PostgreSQL server. PolicySync uses this instance ID for retrieving access audit information.

    The instance ID must be provided in the following format:

    <PROJECT_ID>:<DB_INSTANCE_ID>
    
    


    Advanced fields

    Table 15. Advanced fields

    Field name

    Type

    Default

    Required

    Description

    Schemas to set access control policies

    string

    No

    Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    Use the following format when specifying a schema:

    <DATABASE_NAME>.<SCHEMA_NAME>

    If specified, Schemas to ignore while setting access control policies takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all schemas are managed:

    <DATABASE_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all schemas.

    • If set to none, no schemas are managed.

    Tables to set access control policies

    string

    No

    Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    Use the following format when specifying a table:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    If specified, ignore.table.list takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all matched tables are managed:

    <DATABASE_NAME>.<SCHEMA_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all tables.

    • If set to none, no tables are managed.

    Databases to ignore while setting access control policies

    string

    No

    Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.

    For example:

    testdb1,testdb2,sales_db*
    
    

    This setting supersedes any values specified by Databases to set access control policies.

    Schemas to ignore while setting access control policies

    string

    No

    Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all schemas are subject to access control.

    For example:

    testdb1.schema1,testdb2.schema2,sales_db*.sales*
    
    

    This setting supersedes any values specified by Schemas to set access control policies.

    Regex to find special characters in user names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a user name. Each matching character is replaced with the value specified by the String to replace with the special characters found in user names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in user names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in group names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a group name. Each matching character is replaced with the value specified by the String to replace with the special characters found in group names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in group names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in role names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a role name. Each matching character is replaced with the value specified by the String to replace with the special characters found in role names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in role names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.

    If not specified, no find and replace operation is performed.

    Persist case sensitivity of user names

    boolean

    false

    No

    Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.

    Persist case sensitivity of group names

    boolean

    false

    No

    Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.

    Persist case sensitivity of role names

    boolean

    false

    No

    Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.

    Create users in postgres by policysync

    boolean

    true

    No

    Specifies whether PolicySync creates local users for each user in Privacera.

    Create user roles in postgres by policysync

    boolean

    true

    No

    Specifies whether PolicySync creates local roles for each user in Privacera.

    Manage users from portal

    boolean

    true

    No

    Specifies whether PolicySync maintains user membership in roles in the PostgreSQL data source.

    Manage groups from portal

    boolean

    true

    No

    Specifies whether PolicySync creates groups from Privacera in the PostgreSQL data source.

    Manage roles from portal

    boolean

    true

    No

    Specifies whether PolicySync creates roles from Privacera in the PostgreSQL data source.

    Users to set access control policies

    string

    No

    Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    If not specified, PolicySync manages access control for all users.

    If specified, Users to be ignored by access control policies takes precedence over this setting.

    An example user list might resemble the following: user1,user2,dev_user*.

    Groups to set access control policies

    string

    No

    Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. You can use wildcards. Names are case-sensitive.

    An example list of groups might resemble the following: group1,group2,dev_group*.

    If specified, Groups be ignored by access control policies takes precedence over this setting.

    Roles to set access control policies

    string

    No

    Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. You can use wildcards. Names are case-sensitive.

    An example list of roles might resemble the following: role1,role2,dev_role*.

    If specified, Roles be ignored by access control policies takes precedence over this setting.

    Users to be ignored by access control policies

    string

    No

    Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

    This setting supersedes any values specified by Users to set access control policies.

    Groups be ignored by access control policies

    string

    No

    Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

    This setting supersedes any values specified by Groups to set access control policies.

    Roles be ignored by access control policies

    string

    No

    Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.

    This setting supersedes any values specified by Roles to set access control policies.

    Prefix of postgres roles for portal users

    string

    priv_user_

    No

    Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.

    Prefix of postgres roles for portal groups

    string

    priv_group_

    No

    Specifies the prefix that PolicySync uses when creating local roles for groups. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.

    Prefix of postgres roles for portal roles

    string

    priv_role_

    No

    Specifies the prefix that PolicySync uses when creating roles from Privacera in the PostgreSQL data source.

    For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.

    Use postgres native public group for public group access policies

    boolean

    true

    No

    Specifies whether PolicySync uses the PostgreSQL native public group for access grants whenever a policy refers to a public group. The default value is true.

    Set access control policies only on the users from managed groups

    boolean

    false

    No

    Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.

    Set access control policies only on the users/groups from managed roles

    boolean

    false

    No

    Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false.

    Enforce postgres native row filter

    boolean

    false

    No

    Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.

    Enforce masking policies using secure views

    boolean

    true

    No

    Specifies whether to use secure view based masking. The default value is true.

    Because PolicySync does not support native masking for PostgreSQL, enabling this setting is recommended.

    Enforce row filter policies using secure views

    boolean

    true

    No

    Specifies whether to use secure view based row filtering. The default value is true.

    While PostgreSQL supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

    Create secure view for all tables/views

    boolean

    true

    No

    Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

    Default masked value for numeric datatype columns

    integer

    0

    No

    Specifies the default masking value for numeric column types.

    Default masked value for text/varchar datatype columns

    string

    <MASKED>

    No

    Specifies the default masking value for text and string column types.

    Secure view name prefix

    string

    No

    Specifies a prefix string for secure view names. By default, view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

    Secure view name postfix

    string

    _secure

    No

    Specifies a postfix string for secure view names. By default, view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

    Secure view schema name prefix

    string

    No

    Specifies a prefix string to apply to a secure schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view schema name for a schema named example1 is dev_example1.

    Secure view schema name postfix

    string

    No

    Specifies a postfix string to apply to a secure view schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view schema name for a schema named example1 is example1_dev.

    Enable dataadmin

    boolean

    true

    No

    Specifies whether to enable the data admin feature. When enabled, you create all policies on native tables and views, and PolicySync makes the corresponding grants on the secure views of those tables and views. The secure views provide row filter and masking capability. To also grant a permission on the native table or view, select that permission together with data admin in the policy. The permission is then granted on both the native table or view and its secure view.

    Users to exclude when fetching access audits

    string

    POSTGRES_JDBC_USERNAME

    No

    Specifies a comma-separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".
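
The naming settings in the table above (special-character regex, replacement string, case persistence, and role prefix) can be pictured as a small pipeline. The following is a minimal sketch, not PolicySync's implementation. The function name localUserRole, the order of the steps, and the simplified character class are assumptions made for illustration; the documented default regex is written with extra escaping and is not reproduced exactly here.

```javascript
// Readable approximation of the default special-character class
// (an assumption: the documented default uses additional escaping).
const SPECIAL_CHARS = /[~`$&+:;=?@#|'<>.^*()_%\[\]!\-\/\\{}]/g;

// Derive the local role name for a portal user (hypothetical helper).
// Assumed order: case conversion, then character replacement, then prefix.
function localUserRole(userName, options) {
  const settings = Object.assign(
    { persistCase: false, replacement: '_', prefix: 'priv_user_' },
    options
  );
  // With case persistence off, names are converted to lowercase.
  const cased = settings.persistCase ? userName : userName.toLowerCase();
  // Replace every character that matches the regex with the replacement string.
  const cleaned = cased.replace(SPECIAL_CHARS, settings.replacement);
  return settings.prefix + cleaned;
}

console.log(localUserRole('Jane.Doe@example.com'));
console.log(localUserRole('Jane.Doe', { persistCase: true }));
```

With the defaults assumed above, the first call yields priv_user_jane_doe_example_com and the second yields priv_user_Jane_Doe.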



    Custom fields

    Table 16. Custom fields

    Canonical name

    Type

    Default

    Description

    load.resources

    string

    load_from_database_columns

    Specifies how PolicySync loads resources from PostgreSQL. The following values are allowed:

    • load_md: Loads resources from PostgreSQL top-down: first the databases, then the schemas, followed by the tables and their columns.

    • load_from_database_columns: Loads resources one resource type at a time: first all databases, then all schemas in all databases, followed by all tables in all schemas and their columns. This mode is recommended because it is faster than the load_md mode.

    sync.interval.sec

    integer

    60

    Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

    sync.serviceuser.interval.sec

    integer

    420

    Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

    sync.servicepolicy.interval.sec

    integer

    540

    Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

    audit.interval.sec

    integer

    30

    Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.

    ignore.table.list

    string

    Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    This setting supersedes any values specified by Tables to set access control policies.

    user.name.case.conversion

    string

    lower

    Specifies how user name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of user names is set to true.

    group.name.case.conversion

    string

    lower

    Specifies how group name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of group names is set to true.

    role.name.case.conversion

    string

    lower

    Specifies how role name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of role names is set to true.

    policy.name.separator

    string

    _priv_

    Specifies a string to use as part of the name of native row filter and masking policies.

    row.filter.policy.name.template

    string

    {database}{separator}{schema}{separator}{table}

    Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table data from the schema schema that resides in the db database, the row filter policy name might resemble the following:

    db_priv_schema_priv_data_<ROW_FILTER_ITEM_NUMBER>
    
    

    secure.view.name.remove.suffix.list

    string

    Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma-separated list of suffixes.

    secure.view.schema.name.remove.suffix.list

    string

    Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma-separated list of suffixes.

    perform.grant.updates.max.retry.attempts

    integer

    2

    Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.

    aws.sqs.queue.endpoint

    string

    Specifies the SQS endpoint URL on Amazon Web Services (AWS). You must specify this value if you use a private VPC in your AWS account that is not available on the Internet.

    aws.sqs.queue.max.poll.messages

    integer

    100

    Specifies the number of messages to retrieve from the SQS queue at one time for audit information.
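
The row.filter.policy.name.template and policy.name.separator settings in the table above combine into a policy name by placeholder substitution. The sketch below illustrates that substitution under stated assumptions: renderPolicyName is a hypothetical helper, and the trailing _<ROW_FILTER_ITEM_NUMBER> part shown in the documented example is not modeled.

```javascript
// Replace each {placeholder} in the template with the matching value.
// Unknown placeholders are left untouched (hypothetical helper).
function renderPolicyName(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
  );
}

const policyName = renderPolicyName(
  '{database}{separator}{schema}{separator}{table}',
  { database: 'db', schema: 'schema', table: 'data', separator: '_priv_' }
);

console.log(policyName); // db_priv_schema_priv_data
```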



  6. In the ADVANCED tab, you can add custom properties.

  7. Using the IMPORT PROPERTIES button, you can browse and import application properties.
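
The include and ignore list settings described in the tables above, such as Databases to set access control policies and Databases to ignore while setting access control policies, follow the same pattern: an unset include list means everything is managed, and the ignore list takes precedence. The following is a minimal sketch of that pattern, not PolicySync's implementation. It assumes that * matches any run of characters and that all other characters are literal; the function names are hypothetical.

```javascript
// Convert a wildcard pattern such as "sales_db*" into an anchored RegExp.
// Assumption: "*" matches any run of characters; everything else is literal.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp('^' + escaped + '$');
}

// Split a comma-separated setting into trimmed, non-empty patterns.
function parseList(value) {
  return (value || '')
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Decide whether a name is managed: the ignore list wins, and an
// unset include list means that every name is managed.
function isManaged(name, includeSetting, ignoreSetting) {
  const matches = (setting) =>
    parseList(setting).some((p) => wildcardToRegExp(p).test(name));
  if (matches(ignoreSetting)) {
    return false;
  }
  return parseList(includeSetting).length === 0 || matches(includeSetting);
}

console.log(isManaged('sales_db_eu', 'testdb1,testdb2,sales_db*', ''));
console.log(isManaged('sales_db_eu', 'testdb1,testdb2,sales_db*', 'sales_db_eu'));
console.log(isManaged('TestDB1', 'testdb1', ''));
```

The three calls print true, false, and false: the wildcard matches, the ignore list overrides the include list, and matching is case-sensitive.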

Accessing PostgreSQL Audits in GCP

Prerequisites

Ensure the following prerequisites are met:

Configuration

  1. In GCP:

    1. Run the following command on Google Cloud's shell (gcloud), providing values for INSTANCE_NAME and GCP_PROJECT_ID.

      gcloud sql instances patch {INSTANCE_NAME} --database-flags=cloudsql.enable_pgaudit=on,pgaudit.log=all --project {GCP_PROJECT_ID}
      
    2. Run a SQL command using a compatible psql client to create the pgAudit extension.

      CREATE EXTENSION pgaudit;              
    3. Create a service account and private key JSON file, which will be used by PolicySync to pull access audits. See Setting up authentication and edit the following fields:

      • Service account name: Enter any user-defined name. For example, policysync-postgres-gcp-audit-service-account.

      • Select a role: Select the Private Logs Viewer role.

      • Create new key: Create a service account key and download the JSON file to the custom-vars folder.

  2. In Privacera Manager:

    Add the following properties to the vars.policysync.postgres.yml file:

    POSTGRES_AUDIT_SOURCE: "gcp_pgaudit"
    POSTGRES_GCP_AUDIT_SOURCE_INSTANCE_ID: ""
    POSTGRES_OAUTH_PRIVATE_KEY_FILE_NAME: ""
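
As an illustration of how the three properties might be filled in, the sketch below renders them as YAML key/value lines. It assumes that POSTGRES_GCP_AUDIT_SOURCE_INSTANCE_ID takes the <PROJECT_ID>:<DB_INSTANCE_ID> format described for the GCP CloudSQL postgres instance id field and that POSTGRES_OAUTH_PRIVATE_KEY_FILE_NAME takes the name of the downloaded service account key file. The helper names and sample values are hypothetical.

```javascript
// Build the instance ID in the <PROJECT_ID>:<DB_INSTANCE_ID> format.
function gcpInstanceId(projectId, dbInstanceId) {
  if (!projectId || !dbInstanceId) {
    throw new Error('Both the project ID and the instance ID are required');
  }
  return projectId + ':' + dbInstanceId;
}

// Render the audit properties as YAML "KEY: value" lines.
// JSON.stringify produces a double-quoted string, which is valid YAML.
function auditProperties(projectId, dbInstanceId, keyFileName) {
  const props = {
    POSTGRES_AUDIT_SOURCE: 'gcp_pgaudit',
    POSTGRES_GCP_AUDIT_SOURCE_INSTANCE_ID: gcpInstanceId(projectId, dbInstanceId),
    POSTGRES_OAUTH_PRIVATE_KEY_FILE_NAME: keyFileName,
  };
  return Object.keys(props)
    .map((key) => key + ': ' + JSON.stringify(props[key]))
    .join('\n');
}

// Sample values are placeholders, not real resources.
console.log(auditProperties('my-project', 'pg-main', 'policysync-key.json'));
```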
    
Configure AWS RDS PostgreSQL instance for access audits

You can configure your AWS account to allow Privacera to access your RDS PostgreSQL instance audit logs through Amazon CloudWatch Logs. To enable this functionality, you must make the following changes in your account:

  • Update the AWS RDS parameter group for the database

  • Create an AWS SQS queue

  • Specify an AWS Lambda function

  • Create an IAM role for an EC2 instance

Update the AWS RDS parameter group for the database

To expose access audit logs, you must update configuration for the data source.

Procedure

  1. Log in to your AWS account.

  2. To create a role for audits, run the following SQL query as a user with administrative credentials for your data source:

    CREATE ROLE rds_pgaudit;
  3. Create a new parameter group for your database and specify the following values:

    • Parameter group family: Select a family from either the aurora-postgresql or postgres families.

    • Type: Select DB Parameter Group.

    • Group name: Specify a group name for the parameter group.

    • Description: Specify a description for the parameter group.

  4. Edit the parameter group that you created in the previous step and set the following values:

    • pgaudit.log: Specify all, overwriting any existing value.

    • shared_preload_libraries: Specify pg_stat_statements,pgaudit.

    • pgaudit.role: Specify rds_pgaudit.

  5. Associate the parameter group that you created with your database. Modify the configuration for the database instance and make the following changes:

    • DB parameter group: Specify the parameter group you created in this procedure.

    • PostgreSQL log: Ensure this option is set to enable logging to Amazon CloudWatch Logs.

  6. When prompted, choose the option to immediately apply the changes you made in the previous step.

  7. Restart the database instance.
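
Before restarting, it can help to double-check the parameter values against the procedure. The following sketch compares a plain object of parameter names and values with the values listed in step 4. It is an illustration only: findParameterProblems is a hypothetical helper, and how you obtain the current values (console, CLI, or SDK) is left open.

```javascript
// Values that the procedure asks you to set in the parameter group.
const REQUIRED_VALUES = {
  'pgaudit.log': 'all',
  'pgaudit.role': 'rds_pgaudit',
};
const REQUIRED_LIBRARIES = ['pg_stat_statements', 'pgaudit'];

// Return a list of human-readable problems; an empty list means the
// parameters match the values from the procedure.
function findParameterProblems(params) {
  const problems = [];
  Object.keys(REQUIRED_VALUES).forEach((name) => {
    if (params[name] !== REQUIRED_VALUES[name]) {
      problems.push(name + ' should be ' + REQUIRED_VALUES[name]);
    }
  });
  const libraries = (params.shared_preload_libraries || '')
    .split(',')
    .map((item) => item.trim());
  REQUIRED_LIBRARIES.forEach((library) => {
    if (libraries.indexOf(library) === -1) {
      problems.push('shared_preload_libraries should include ' + library);
    }
  });
  return problems;
}

console.log(findParameterProblems({
  'pgaudit.log': 'all',
  'pgaudit.role': 'rds_pgaudit',
  shared_preload_libraries: 'pg_stat_statements,pgaudit',
}));
console.log(findParameterProblems({ 'pgaudit.log': 'ddl' }));
```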

Verification

To verify that your database instance logs are available, complete the following steps:

  1. From the Amazon RDS console, view the logs for your database instance.

  2. From the CloudWatch console, complete the following steps:

    1. Find the /aws/rds/cluster/* log group that corresponds to your database instance.

    2. Click the log group name to confirm that a log stream exists for the database instance, and then click on a log stream name to confirm that log messages are present.

Create an AWS SQS queue

To create an SQS queue used by an AWS Lambda function that you will create later, complete the following steps.

  1. From the AWS console, create a new Amazon SQS queue with the default settings. Use the following format when specifying a value for the Name field:

    privacera-postgres-<RDS_CLUSTER_NAME>-audits

    where:

    • RDS_CLUSTER_NAME: Specifies the name of your RDS cluster.

  2. After the queue is created, save the URL of the queue for later use.
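
The naming format above can be expressed as a small helper. This is a sketch only: auditQueueName is a hypothetical function, and the validation reflects the general Amazon SQS naming rules for standard queues (up to 80 characters; letters, digits, hyphens, and underscores).

```javascript
// Build the queue name in the privacera-postgres-<RDS_CLUSTER_NAME>-audits format
// and reject names that SQS would not accept for a standard queue.
function auditQueueName(rdsClusterName) {
  const name = 'privacera-postgres-' + rdsClusterName + '-audits';
  if (!/^[A-Za-z0-9_-]{1,80}$/.test(name)) {
    throw new Error('Invalid SQS queue name: ' + name);
  }
  return name;
}

console.log(auditQueueName('cluster1')); // privacera-postgres-cluster1-audits
```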

Specify an AWS Lambda function

To create an AWS Lambda function to interact with the SQS queue, complete the following steps. In addition to creating the function, you must create a new IAM policy and associate a new IAM role with the function. You need to know your AWS account ID and AWS region to complete this procedure.

  1. From the IAM console, create a new IAM policy and input the following JSON:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "logs:CreateLogGroup",
                "Resource": "arn:aws:logs:<REGION>:<ACCOUNT_ID>:*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogStream",
                    "logs:PutLogEvents"
                ],
                "Resource": [
                    "arn:aws:logs:<REGION>:<ACCOUNT_ID>:log-group:/aws/lambda/<LAMBDA_FUNCTION_NAME>:*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": "sqs:SendMessage",
                "Resource": "arn:aws:sqs:<REGION>:<ACCOUNT_ID>:<SQS_QUEUE_NAME>"
            }
        ]
    }

    where:

    • REGION: Specify your AWS region.

    • ACCOUNT_ID: Specify your AWS account ID.

    • LAMBDA_FUNCTION_NAME: Specify the name of the AWS Lambda function, which you will create later. For example: privacera-postgres-cluster1-audits

    • SQS_QUEUE_NAME: Specify the name of the AWS SQS Queue.

  2. Specify a name for the IAM policy, such as privacera-postgres-audits-lambda-execution-policy, and then create the policy.

  3. From the IAM console, create a new IAM role and choose Lambda as the use case.

  4. Search for the IAM policy that you just created with a name that might be similar to privacera-postgres-audits-lambda-execution-policy and select it.

  5. Specify a Role name for the IAM role, such as privacera-postgres-audits-lambda-execution-role, and then create the role.

  6. From the AWS Lambda console, create a new function and specify the following fields:

    • Function name: Specify a name for the function, such as privacera-postgres-cluster1-audits.

    • Runtime: Select Node.js 12.x from the list.

    • Permissions: Select Use an existing role and choose the role created earlier in this procedure, such as privacera-postgres-audits-lambda-execution-role.

  7. Add a trigger to the function you created in the previous step and select CloudWatch Logs from the list, and then specify the following values:

    • Log group: Select the log group path for your Amazon RDS database instance, such as /aws/rds/cluster/database-1/postgresql.

    • Filter name: Specify auditTrigger.

  8. In the Lambda source code editor, provide the following JavaScript code in the index.js file, which is open by default in the editor:

    var zlib = require('zlib');
    
    // Import the AWS SDK
    const AWS = require('aws-sdk');
    
    // CloudWatch logs encoding
    var encoding = process.env.ENCODING || 'utf-8';  // default is utf-8
    var awsRegion = process.env.REGION || 'us-east-1';
    var sqsQueueURL = process.env.SQS_QUEUE_URL;
    
    // Split a comma-separated environment variable into lowercase entries.
    // Empty entries are dropped so that an unset variable ignores nothing.
    function parseList(value) {
        return (value || '').split(',')
            .map(function (item) { return item.trim().toLowerCase(); })
            .filter(function (item) { return item.length > 0; });
    }
    
    var ignoreDatabaseArray = parseList(process.env.IGNORE_DATABASE);
    var ignoreUsersArray = parseList(process.env.IGNORE_USERS);
    
    // Configure the region
    AWS.config.update({region: awsRegion});
    
    exports.handler = function (event, context, callback) {
    
        // CloudWatch Logs delivers the payload base64-encoded and gzip-compressed
        var zippedInput = Buffer.from(event.awslogs.data, 'base64');
    
        zlib.gunzip(zippedInput, function (e, buffer) {
            if (e) {
                // Stop processing if the payload cannot be decompressed
                return callback(e);
            }
    
            var awslogsData = JSON.parse(buffer.toString(encoding));
    
            // Create an SQS service object
            const sqs = new AWS.SQS({apiVersion: '2012-11-05'});
    
            console.log(awslogsData);
            if (awslogsData.messageType === 'DATA_MESSAGE') {
    
                // Process each log event individually
                awslogsData.logEvents.forEach(function (log) {
    
                    console.log(log.message);
    
                    // Compare in lowercase, because the ignore lists are lowercased
                    var message = log.message.toLowerCase();
    
                    // Skip messages that reference an ignored database ("@<database>")
                    // or an ignored user ("<user>@")
                    var ignored = ignoreDatabaseArray.some(function (db) {
                        return message.indexOf("@" + db) !== -1;
                    }) || ignoreUsersArray.some(function (user) {
                        return message.indexOf(user + "@") !== -1;
                    });
    
                    if (!ignored) {
    
                        let sqsOrderData = {
                            MessageBody: JSON.stringify(log),
                            MessageDeduplicationId: log.id,
                            MessageGroupId: "Audits",
                            QueueUrl: sqsQueueURL
                        };
    
                        // Send the log event to the SQS queue
                        let sendSqsMessage = sqs.sendMessage(sqsOrderData).promise();
    
                        sendSqsMessage.then((data) => {
                            console.log("Sent to SQS");
                        }).catch((err) => {
                            console.log("Error in Sending to SQS = " + err);
                        });
    
                    }
                });
            }
        });
    };
  9. For the Lambda function, edit the environment variables and create the following environment variables:

    • REGION: Specify your AWS region.

    • SQS_QUEUE_URL: Specify your AWS SQS queue URL.

    • IGNORE_DATABASE: Specify privacera_db.

    • IGNORE_USERS: Specify your database administrative user, such as privacera.
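
To try the decoding and filtering logic outside of AWS, you can build a sample event in the shape the handler reads: event.awslogs.data holds base64-encoded, gzip-compressed JSON. The following sketch uses only the Node.js zlib module. The helper names and the sample log messages are hypothetical; the actual message layout depends on how logging is configured for your database.

```javascript
const zlib = require('zlib');

// Build a sample CloudWatch Logs subscription event from plain messages.
function buildSampleEvent(messages) {
  const payload = {
    messageType: 'DATA_MESSAGE',
    logEvents: messages.map((message, index) => ({
      id: String(index),
      timestamp: 0,
      message: message,
    })),
  };
  const zipped = zlib.gzipSync(Buffer.from(JSON.stringify(payload), 'utf-8'));
  return { awslogs: { data: zipped.toString('base64') } };
}

// Reverse the encoding: base64-decode, gunzip, then parse the JSON.
function decodeEvent(event) {
  const zipped = Buffer.from(event.awslogs.data, 'base64');
  return JSON.parse(zlib.gunzipSync(zipped).toString('utf-8'));
}

// Same substring checks as the handler: "@<database>" and "<user>@".
function shouldForward(message, ignoreDatabases, ignoreUsers) {
  const text = message.toLowerCase();
  return !ignoreDatabases.some((db) => text.indexOf('@' + db) !== -1) &&
    !ignoreUsers.some((user) => text.indexOf(user + '@') !== -1);
}

// Hypothetical messages: one from an analyst, one from the service user.
const sampleEvent = buildSampleEvent([
  'analyst@salesdb:LOG: AUDIT: SESSION,1,1,READ,SELECT',
  'privacera@privacera_db:LOG: AUDIT: SESSION,1,1,READ,SELECT',
]);

decodeEvent(sampleEvent).logEvents.forEach((log) => {
  console.log(shouldForward(log.message, ['privacera_db'], ['privacera']), log.message);
});
```

With IGNORE_DATABASE set to privacera_db and IGNORE_USERS set to privacera, only the first sample message would be forwarded to the queue.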

Create an IAM role for an EC2 instance

To create an IAM role for the AWS EC2 instance where you installed Privacera so that Privacera can read the AWS SQS queue, complete the following steps:

  1. From the IAM console, create a new IAM policy and input the following JSON:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "sqs:DeleteMessage",
                    "sqs:GetQueueUrl",
                    "sqs:ListDeadLetterSourceQueues",
                    "sqs:ReceiveMessage",
                    "sqs:GetQueueAttributes"
                ],
                "Resource": "<SQS_QUEUE_ARN>"
            },
            {
                "Effect": "Allow",
                "Action": "sqs:ListQueues",
                "Resource": "*"
            }
        ]
    }
    

    where:

    • SQS_QUEUE_ARN: Specifies the ARN of the AWS SQS queue that you created earlier.

  2. Specify a name for the IAM policy, such as postgres-audits-sqs-read-policy, and create the policy.

  3. Attach the IAM policy to the IAM role of the AWS EC2 instance where you installed Privacera.
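
If you generate the policy document from a script instead of pasting it by hand, a minimal Python sketch could look like the following. It only builds the JSON shown in step 1 for a given queue ARN. The function name build_sqs_read_policy and the example ARN are hypothetical, and the ARN check is a simple sanity test rather than full validation.

```python
import json

# Actions copied from the policy in step 1.
READ_ACTIONS = [
    "sqs:DeleteMessage",
    "sqs:GetQueueUrl",
    "sqs:ListDeadLetterSourceQueues",
    "sqs:ReceiveMessage",
    "sqs:GetQueueAttributes",
]


def build_sqs_read_policy(queue_arn):
    """Return the IAM policy document from step 1 for one SQS queue ARN."""
    parts = queue_arn.split(":")
    # An SQS ARN has six colon-separated parts: arn:partition:sqs:region:account:name
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sqs":
        raise ValueError("expected an SQS queue ARN, got: " + queue_arn)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": READ_ACTIONS, "Resource": queue_arn},
            {"Effect": "Allow", "Action": "sqs:ListQueues", "Resource": "*"},
        ],
    }


# Placeholder ARN, for illustration only.
policy = build_sqs_read_policy("arn:aws:sqs:us-east-1:123456789012:postgres-audits.fifo")
print(json.dumps(policy, indent=4))
```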

Accessing Cross Account SQS Queue for PostgreSQL Audits

Prerequisites

Ensure the following prerequisites are met:

  • Access to the AWS account with the EC2 instance where Privacera Manager is configured.

  • Access to the AWS account where the SQS queue is configured.

Configuration

  1. Get the ARN of the IAM role attached to the EC2 instance.

    1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

    2. In the navigation pane, choose Instances.

    3. Search for your instance and select it.

    4. On the Security tab, click the link under IAM Role.

    5. Copy the ARN of the IAM Role.

  2. Get the ARN of the SQS queue.

    1. Open the Amazon SQS console at https://console.aws.amazon.com/sqs/.

    2. From the left navigation pane, choose Queues. From the queue list, select the queue that you created.

    3. In the Details section, copy the ARN of the queue.

  3. Add the policy in the AWS SQS account to grant permissions to the AWS EC2 account.

    1. Open the Amazon SQS console at https://console.aws.amazon.com/sqs/.

    2. In the navigation pane, choose Queues.

    3. Choose a queue and choose Edit.

    4. Scroll to the Access policy section.

    5. Add the access policy statements in the input box.

      {
          "Version": "2012-10-17",
          "Id": "PolicyAllowSQS",
          "Statement": [
              {
                  "Sid": "StmtAllowSQS",
                  "Effect": "Allow",
                  "Principal": {
                      "AWS": "${EC2_INSTANCE_ROLE_ARN}"
                  },
                  "Action": [
                      "sqs:DeleteMessage",
                      "sqs:GetQueueUrl",
                      "sqs:ListDeadLetterSourceQueues",
                      "sqs:ReceiveMessage",
                      "sqs:GetQueueAttributes"
                  ],
                  "Resource": "${SQS_QUEUE_ARN}"
              }
          ]
      }
    6. When you finish configuring the access policy, choose Save.

    7. After saving, copy the SQS queue URL in the Details section.

  4. Add the SQS queue URL.

    Run the following commands.

    cd ~/privacera/privacera-manager/
    vi config/custom-vars/vars.policysync.postgres.yml

    Add the URL in the following property.

    POSTGRES_AUDIT_SQS_QUEUE_NAME: "${SQS_QUEUE_URL}"
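
The access policy in step 3 contains two placeholders, ${EC2_INSTANCE_ROLE_ARN} and ${SQS_QUEUE_ARN}. One way to fill them in before pasting the policy into the console is sketched below in Python. It is an illustration only: the function name render_access_policy and the example ARNs are hypothetical. The ${...} syntax happens to match what string.Template in the Python standard library expects.

```python
import json
from string import Template

# The access policy from step 3, with its two placeholders left in place.
ACCESS_POLICY_TEMPLATE = Template("""
{
  "Version": "2012-10-17",
  "Id": "PolicyAllowSQS",
  "Statement": [
    {
      "Sid": "StmtAllowSQS",
      "Effect": "Allow",
      "Principal": {"AWS": "${EC2_INSTANCE_ROLE_ARN}"},
      "Action": [
        "sqs:DeleteMessage",
        "sqs:GetQueueUrl",
        "sqs:ListDeadLetterSourceQueues",
        "sqs:ReceiveMessage",
        "sqs:GetQueueAttributes"
      ],
      "Resource": "${SQS_QUEUE_ARN}"
    }
  ]
}
""")


def render_access_policy(role_arn, queue_arn):
    """Fill in both placeholders and confirm that the result is valid JSON."""
    text = ACCESS_POLICY_TEMPLATE.substitute(
        EC2_INSTANCE_ROLE_ARN=role_arn,
        SQS_QUEUE_ARN=queue_arn,
    )
    return json.loads(text)  # raises ValueError if the document is malformed


# Placeholder ARNs, for illustration only.
policy = render_access_policy(
    "arn:aws:iam::111111111111:role/privacera-ec2-role",
    "arn:aws:sqs:us-east-1:222222222222:postgres-audits.fifo",
)
print(json.dumps(policy, indent=2))
```

Parsing the rendered text with json.loads catches a missing quote or brace before the policy reaches the SQS console.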
Power BI

This topic describes how to connect a Power BI application to PrivaceraCloud.

Connect Application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Power BI.

  3. Enter the application Name and Description, and then click SAVE.

  4. Click the toggle button to enable Access Management for Power BI.

  5. On the BASIC tab, enter values in the required (*) fields and click SAVE.

  6. In the ADVANCED tab, you can add custom properties.

    Caution

    Advanced properties should be modified in consultation with Privacera.

  7. Click the IMPORT PROPERTIES link to browse and import application properties.

Connector properties

Basic fields

Table 17. Basic fields

| Field name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| Power BI authenticated user | string | | Yes | Specifies the authentication username. If you do not specify this value, you must specify a secret for Power BI application client secret. |
| Power BI authenticated user's password | string | | Yes | Specifies the authentication password. If you do not specify this value, you must specify a secret for Power BI application client secret. |
| Power BI application tenant id | string | | Yes | Specifies the tenant ID associated with your Microsoft Azure account. |
| Power BI application client id | string | | Yes | Specifies the principal ID for authentication. |
| Power BI application client secret | string | | Yes | Specifies a client secret for authentication. If you do not specify this value, you must specify both Power BI authenticated user and Power BI authenticated user's password. |
| Workspaces to set access control policies | string | | No | Specifies a comma-separated list of workspace names for which PolicySync manages access control. If unset, access control is managed for all workspaces. If specified, use the following format. You can use wildcards. Names are case-sensitive. An example list of workspaces might resemble the following: demo1,demo2,sales*. If specified, Workspaces to ignore while setting access control policies takes precedence over this setting. |
| Enable policy enforcements and user/group/role management | boolean | true | No | Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true. |
| Enable access audits | boolean | false | Yes | Specifies whether Privacera fetches access audit data from the data source. |
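
The interaction between Workspaces to set access control policies and Workspaces to ignore while setting access control policies can be illustrated with a short Python sketch. It is not PolicySync code. It assumes that wildcards behave like shell-style patterns, which fnmatchcase from the standard library implements with case-sensitive matching, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_names(value):
    """Split a comma-separated setting into individual patterns."""
    return [name.strip() for name in value.split(",") if name.strip()]


def is_managed(name, include="", ignore=""):
    """Decide whether a workspace is managed, given both settings."""
    # The ignore list takes precedence over the include list.
    if any(fnmatchcase(name, pattern) for pattern in parse_names(ignore)):
        return False
    include_patterns = parse_names(include)
    # An unset include list means every workspace is managed.
    if not include_patterns:
        return True
    return any(fnmatchcase(name, pattern) for pattern in include_patterns)


# Example list from the table above; the ignore value is hypothetical.
print(is_managed("sales_emea", include="demo1,demo2,sales*"))
print(is_managed("sales_test", include="demo1,demo2,sales*", ignore="sales_test"))
print(is_managed("Demo1", include="demo1,demo2,sales*"))
```

The same include and ignore logic applies to the user and group lists described in the advanced fields.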



Advanced fields

Table 18. Advanced fields

Field name

Type

Default

Required

Description

Workspaces to ignore while setting access control policies

string

No

Specifies a comma-separated list of workspace names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all workspaces are subject to access control.

This setting supersedes any values specified by Workspaces to set access control policies.

Regex to find special characters in user names

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a user name. PolicySync replaces each matching character with the value specified by the String to replace with the special characters found in user names setting.

If not specified, no find and replace operation is performed.

String to replace with the special characters found in user names

string

_

No

Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.

If not specified, no find and replace operation is performed.

Regex to find special characters in group names

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a group name. PolicySync replaces each matching character with the value specified by the String to replace with the special characters found in group names setting.

If not specified, no find and replace operation is performed.

String to replace with the special characters found in group names

string

_

No

Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.

If not specified, no find and replace operation is performed.

Persist case sensitivity of user names

boolean

false

No

Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.

Persist case sensitivity of group names

boolean

false

No

Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.

Users to set access control policies

string

No

Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

If not specified, PolicySync manages access control for all users.

If specified, Users to be ignored by access control policies takes precedence over this setting.

An example user list might resemble the following: user1,user2,dev_user*.

Groups to set access control policies

string

No

Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of groups might resemble the following: group1,group2,dev_group*.

If specified, Groups be ignored by access control policies takes precedence over this setting.

Users to be ignored by access control policies

string

No

Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

This setting supersedes any values specified by Users to set access control policies.

Groups be ignored by access control policies

string

No

Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

This setting supersedes any values specified by Groups to set access control policies.

Set access control policies only on the users from managed groups

boolean

false

No

Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.
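
The effect of the special-character regex, the replacement string, and the case-sensitivity settings on a name can be sketched in Python. This is an illustration under stated assumptions, not PolicySync's implementation: the pattern below is an assumed unescaped form of the documented default (the table shows it with extra escaping), the order of the two steps is assumed, and the function name normalize_name is hypothetical.

```python
import re

# Assumed unescaped form of the documented default character class.
DEFAULT_SPECIAL_CHARS = r"[~`$&+:;=?@#|'<>.^*()_%\[\]!\-/\\{}]"


def normalize_name(name, regex=DEFAULT_SPECIAL_CHARS, replacement="_", persist_case=False):
    """Apply the find-and-replace step, then the optional lowercase step."""
    if regex:
        # A function is used as the replacement so that it is inserted literally.
        name = re.sub(regex, lambda match: replacement, name)
    if not persist_case:
        # With "Persist case sensitivity" set to false, names become lowercase.
        name = name.lower()
    return name


# Hypothetical user name, for illustration only.
print(normalize_name("John.Doe@example.com"))
print(normalize_name("John.Doe@example.com", persist_case=True))
print(normalize_name("John.Doe@example.com", regex=None))
```

Passing regex=None models the case where no regular expression is specified, so no find and replace operation is performed.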



Custom fields

Table 19. Custom fields

| Canonical name | Type | Default | Description |
| --- | --- | --- | --- |
| sync.interval.sec | integer | 60 | Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources. |
| sync.serviceuser.interval.sec | integer | 420 | Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly. |
| sync.servicepolicy.interval.sec | integer | 540 | Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly. |
| audit.interval.sec | integer | 30 | Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera. |
| user.filter.with.email | boolean | false | Set this property to true if you only want to manage users who have an email address associated with them in the portal. |
| audit.initial.pull.min | integer | 30 | Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Microsoft Power BI. |
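
As a rough illustration of how these custom properties and their defaults combine, the following Python sketch merges user-supplied overrides with the defaults from the table. It assumes a key=value line syntax for custom properties, which this section does not spell out, and the function name parse_custom_properties is hypothetical.

```python
# Defaults copied from the custom fields table above.
DEFAULTS = {
    "sync.interval.sec": 60,
    "sync.serviceuser.interval.sec": 420,
    "sync.servicepolicy.interval.sec": 540,
    "audit.interval.sec": 30,
    "user.filter.with.email": False,
    "audit.initial.pull.min": 30,
}


def parse_custom_properties(text):
    """Parse key=value lines (assumed syntax) on top of the documented defaults."""
    props = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, raw = line.partition("=")
        key, raw = key.strip(), raw.strip()
        if not sep or key not in DEFAULTS:
            raise ValueError("unknown or malformed property: " + line)
        # Check bool first, because bool is a subclass of int in Python.
        if isinstance(DEFAULTS[key], bool):
            props[key] = raw.lower() == "true"
        else:
            props[key] = int(raw)
    return props


overrides = "sync.interval.sec=120\nuser.filter.with.email=true"
print(parse_custom_properties(overrides))
```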



Presto

This topic describes how to connect the Presto application to PrivaceraCloud and how PrivaceraCloud integrates with your Qubole Presto cluster using a plug-in.

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Presto.

  3. Enter the application Name and Description, and then click Save.

    You can see the Privacera Access Management and Data Discovery toggle buttons.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.

Enable Privacera Access Management

You only need to enable Privacera Access Management to start controlling access on Presto.

  1. Click the toggle button to enable Privacera Access Management for your application.

    You will see this message: Save the setting to start controlling access on Presto.

  2. Click Save.

Enable Data Discovery

Click the toggle button to enable Data Discovery for your application.

  1. On the BASIC tab, enter values in the following fields.

    • JDBC URL

    • JDBC Username 

    • JDBC Password

  2. On the ADVANCED tab, you can add custom properties.

  3. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  4. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

    To add resources using this connection as Privacera Discovery targets, see Discovery scan targets.

Connect Presto on Qubole cluster to PrivaceraCloud

PrivaceraCloud uses a plug-in to integrate with your Qubole Presto cluster.

Connecting your Qubole Presto cluster to PrivaceraCloud consists of the following steps:

  • Create a service user on PrivaceraCloud for data user access control call-in from Presto to PrivaceraCloud.

  • Create, or identify and reuse, the unique call-in URLs for access control (authentication) and audit that your Qubole Presto cluster uses to reach PrivaceraCloud.

  • Configure your Qubole Presto cluster to load the necessary Privacera-hosted Apache Ranger plug-in components on boot, and then execute the call-in for access control and audit.

PrivaceraCloud steps
  1. Create a new data access service user for interaction with Qubole.

    1. Open Access Manager: Users/Groups/Roles and click + Add.

    2. Create a new service data access user. Assign it to an Admin role. Record the User Name and Password.

    These are referred to as ADMIN_ROLE_USER and ADMIN_ROLE_PASSWORD in the following steps and will be substituted in configuration properties.

  2. Obtain the Ranger URLs associated with an API key. The Qubole cluster uses these URLs to call back to Privacera.

    1. Open Settings: API Key.

    2. You can use an existing active API key or create a new one. Expiry = Never Expires is recommended. To generate a new API key, see API Key.

    3. Click the i icon to see the API Key Info.

    4. Copy and store the values for each of the Ranger Admin URL and Ranger Audit URL. These will be referenced as RANGER_ADMIN_URL and RANGER_AUDIT_URL in the following steps.

Presto Qubole console steps
  1. Open or create a new Presto cluster.

  2. Proceed to Advanced Configuration.

  3. In the PRESTO SETTINGS > Override Presto Configuration text box, add the following information. Substitute values obtained above for ADMIN_ROLE_USER, ADMIN_ROLE_PASSWORD, RANGER_ADMIN_URL, and RANGER_AUDIT_URL.

    bootstrap.properties:
     mkdir -p /media/ephemeral0/rangerssl/
     hadoop credential create sslTrustStore -value changeit -provider localjceks://file/media/ephemeral0/rangerssl/ranger.jceks
     chmod a+r /media/ephemeral0/rangerssl/ranger.jceks
     wget https://privacera-public1.s3.amazonaws.com/0001-httpcore-4.4.14.jar -P /usr/lib/presto/plugin/ranger
    
     access-control.properties:
     access-control.name=ranger-access-control
     ranger.username=<ADMIN_ROLE_USER>
     ranger.password=<ADMIN_ROLE_PASSWORD>
     ranger.hive.security-config-xml=/usr/lib/presto/etc/ranger-hive-security.xml
     ranger.hive.audit-config-xml=/usr/lib/presto/etc/ranger-hive-audit.xml
    
     ranger-hive-security.xml:
     <configuration>
     <property>
          <name>ranger.plugin.hive.service.name</name>
          <value>privacera_hive</value>
     </property>
     <property>
          <name>ranger.plugin.hive.policy.pollIntervalMs</name>
          <value>5000</value>
     </property>
     <property>
          <name>ranger.service.store.rest.url</name>
          <value><RANGER_ADMIN_URL></value>
     </property>
     <property>
          <name>ranger.plugin.hive.policy.rest.url</name>
          <value><RANGER_ADMIN_URL></value>
     </property>
     <property>
          <name>ranger.service.store.rest.ssl.config.file</name>
          <value>/usr/lib/presto/etc/ranger-ssl.xml</value>
     </property>
     <property>
          <name>ranger.plugin.hive.policy.rest.ssl.config.file</name>
          <value>/usr/lib/presto/etc/ranger-ssl.xml</value>
     </property>
     </configuration>
    
    ranger-ssl.xml:
     <configuration>
     <property>
          <name>xasecure.policymgr.clientssl.truststore</name>
          <value>/etc/pki/ca-trust/extracted/java/cacerts</value>
     </property>
     <property>
          <name>xasecure.policymgr.clientssl.truststore.password</name>
          <value>crypted</value>
     </property>
     <property>
          <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
          <value>jceks://file/media/ephemeral0/rangerssl/ranger.jceks</value>
     </property>
     </configuration>
    
    ranger-hive-audit.xml:
     <configuration>
     <property>
          <name>xasecure.audit.is.enabled</name>
          <value>true</value>
     </property>
     <property>
          <name>xasecure.audit.solr.is.enabled</name>
          <value>true</value>
     </property>
     <property>
          <name>xasecure.audit.solr.async.max.queue.size</name>
          <value>1</value>
     </property>
     <property>
          <name>xasecure.audit.solr.async.max.flush.interval.ms</name>
          <value>1000</value>
     </property>
     <property>
          <name>xasecure.audit.solr.solr_url</name>
          <value><RANGER_AUDIT_URL></value>
     </property>
     </configuration>
  4. Click Update or Update and Push.

  5. Click Start to start the cluster. If the cluster is already running, stop it and start it again.
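
Substituting the placeholder values by hand is error-prone, so a small script can help. The Python sketch below replaces upper-case placeholders such as <ADMIN_ROLE_USER> and <RANGER_ADMIN_URL> in the override text and fails if a value is missing. It is illustrative only: the function name render_overrides and the example values are hypothetical, and the sketch relies on the XML element names in the configuration being lowercase.

```python
import re

# Matches <UPPER_CASE_NAME> placeholders but not lowercase XML tags such as <value>.
PLACEHOLDER = re.compile(r"<([A-Z][A-Z0-9_]*)>")


def render_overrides(template, values):
    """Replace every placeholder with its value; fail if any value is missing."""
    found = {match.group(1) for match in PLACEHOLDER.finditer(template)}
    missing = sorted(found - set(values))
    if missing:
        raise KeyError("no value supplied for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# A short excerpt of the override text; the values are placeholders for illustration.
template = (
    "access-control.properties:\n"
    "ranger.username=<ADMIN_ROLE_USER>\n"
    "<property>\n"
    "     <name>ranger.plugin.hive.policy.rest.url</name>\n"
    "     <value><RANGER_ADMIN_URL></value>\n"
    "</property>\n"
)
values = {
    "ADMIN_ROLE_USER": "svc_presto",
    "RANGER_ADMIN_URL": "https://example.invalid/ranger-admin",
}
print(render_overrides(template, values))
```

Values placed inside the XML files must be XML-escaped if they contain characters such as &.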

Redshift

This topic describes how to connect a Redshift application to PrivaceraCloud.

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Redshift.

  3. Enter the application Name and Description, and then click Save.

    You can see Privacera Access Management and Data Discovery with toggle buttons.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.

Enable Privacera Access Management
  1. Click the toggle button to enable Privacera Access Management for your application.

  2. On the BASIC tab, enter values in the given fields and click Save. For property details and descriptions, see the tables below.

    Note

    The other properties are advanced and should be modified only in consultation with Privacera.

    Basic fields

    Table 20. Basic fields

    Field name

    Type

    Default

    Required

    Description

    Redshift JDBC URL

    string

    Yes

    Specifies the JDBC URL for the Amazon Redshift connector.

    Redshift jdbc username

    string

    Yes

    Specifies the JDBC username to use.

    For PolicySync to push policies to Amazon Redshift, this user must have superuser privileges.

    Redshift jdbc password

    string

    Yes

    Specifies the JDBC password to use.

    Redshift default database

    string

    Yes

    Specifies the name of the JDBC database to use.

    PolicySync also uses the connection to this database to load metadata and create principals such as users and groups.

    Default password for new redshift user

    string

    Yes

    Specifies the password to use when PolicySync creates new users.

    The password must meet the following requirements:

    • It must be between 8 and 64 characters long.

    • It must contain at least one uppercase letter, one lowercase letter, and one number.

    • It can use any ASCII character with the ASCII codes 33–126 except: ', ", ,, /, or @

    Redshift resource owner

    string

    No

    Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user.

    • If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.

    • If a value is specified, the owner of the resource will be changed to the specified value.

    The following resource types are supported:

    • Databases

    • Schemas

    • Tables

    • Views

    Databases to set access control policies

    string

    No

    Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of databases might resemble the following: testdb1,testdb2,sales_db*.

    If specified, Databases to ignore while setting access control policies takes precedence over this setting.

    Enable policy enforcements and user/group/role management

    boolean

    false

    No

    Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is false.

    Enable access audits

    boolean

    false

    No

    Specifies whether Privacera fetches access audit data from the data source.
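
The requirements listed for Default password for new redshift user can be checked before you save the application. The following Python sketch is a minimal illustration: the function name is hypothetical, and the excluded characters are copied from the list in this section, so check the Amazon Redshift documentation for the authoritative rule.

```python
# Excluded characters as listed in this section: ' " , / @
FORBIDDEN = set("'\",/@")


def is_valid_default_password(password):
    """Check a candidate password against the requirements listed above."""
    if not 8 <= len(password) <= 64:
        return False
    # Only ASCII codes 33 to 126 are allowed, minus the excluded characters.
    if any(not 33 <= ord(ch) <= 126 or ch in FORBIDDEN for ch in password):
        return False
    has_upper = any(ch.isupper() for ch in password)
    has_lower = any(ch.islower() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    return has_upper and has_lower and has_digit


# Hypothetical passwords, for illustration only.
for candidate in ["Welcome123", "welcome123", "Welcome@123", "Wel1"]:
    print(candidate, is_valid_default_password(candidate))
```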



    Advanced fields

    Table 21. Advanced fields

    Field name

    Type

    Default

    Required

    Description

    Schemas to set access control policies

    string

    No

    Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    Use the following format when specifying a schema:

    <DATABASE_NAME>.<SCHEMA_NAME>

    If specified, Schemas to ignore while setting access control policies takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all schemas are managed:

    <DATABASE_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all schemas.

    • If set to none, no schemas are managed.

    Tables to set access control policies

    string

    No

    Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    Use the following format when specifying a table:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    If specified, ignore.table.list takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all matched tables are managed:

    <DATABASE_NAME>.<SCHEMA_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all tables.

    • If set to none, no tables are managed.

    Databases to ignore while setting access control policies

    string

    No

    Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.

    For example:

    testdb1,testdb2,sales_db*
    
    

    This setting supersedes any values specified by Databases to set access control policies.

    Schemas to ignore while setting access control policies

    string

    No

    Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all schemas are subject to access control.

    For example:

    testdb1.schema1,testdb2.schema2,sales_db*.sales*
    
    

    This setting supersedes any values specified by Schemas to set access control policies.

    Regex to find special characters in user names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a user name. PolicySync replaces each matching character with the value specified by the String to replace with the special characters found in user names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in user names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in group names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a group name. PolicySync replaces each matching character with the value specified by the String to replace with the special characters found in group names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in group names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in role names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression to apply to a role name. PolicySync replaces each matching character with the value specified by the String to replace with the special characters found in role names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in role names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.

    If not specified, no find and replace operation is performed.

    Persist case sensitivity of user names

    boolean

    false

    No

    Specifies whether Amazon Redshift supports case sensitivity for users. Because case sensitivity in Amazon Redshift is global, enabling this enables case sensitivity for users, groups, roles, and resources.

    Persist case sensitivity of group names

    boolean

    false

    No

    Specifies whether Amazon Redshift supports case sensitivity for groups. Because case sensitivity in Amazon Redshift is global, enabling this enables case sensitivity for users, groups, roles, and resources.

    Persist case sensitivity of role names

    boolean

    false

    No

    Specifies whether Amazon Redshift supports case sensitivity for roles. Because case sensitivity in Amazon Redshift is global, enabling this enables case sensitivity for users, groups, roles, and resources.

    Enable Case Sensitive Identifier for Reosurces

    boolean

    false

    No

    Specifies whether Amazon Redshift preserves case for user, group, role, and resource names. By default, Amazon Redshift converts all user, group, role, and resource names to lowercase. If set to true, PolicySync enables case sensitivity on a per connection basis.

    Enable Case Sensitive Identifier for Reosurces Query

    string

    SET enable_case_sensitive_identifier=true;

    No

    Specifies a query for Amazon Redshift that enables case sensitivity per connection. If you enable Enable Case Sensitive Identifier for Reosurces, then this setting defines the query that PolicySync runs.

    Create users in redshift by policysync

    boolean

    true

    No

    Specifies whether PolicySync creates local users for each user in Privacera.

    Create user roles in redshift by policysync

    boolean

    true

    No

    Specifies whether PolicySync creates local roles for each user in Privacera.

    Manage users from portal

    boolean

    true

    No

    Specifies whether PolicySync maintains user membership in roles in the Amazon Redshift data source.

    Manage groups from portal

    boolean

    true

    No

    Specifies whether PolicySync creates groups from Privacera in the Amazon Redshift data source.

    Manage roles from portal

    boolean

    true

    No

    Specifies whether PolicySync creates roles from Privacera in the Amazon Redshift data source.

    Users to set access control policies

    string

    No

    Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    If not specified, PolicySync manages access control for all users.

    If specified, Users to be ignored by access control policies takes precedence over this setting.

    An example user list might resemble the following: user1,user2,dev_user*.

    Groups to set access control policies

    string

    No

    Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of groups might resemble the following: group1,group2,dev_group*.

    If specified, Groups be ignored by access control policies takes precedence over this setting.

    Roles to set access control policies

    string

    No

    Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of roles might resemble the following: role1,role2,dev_role*.

    If specified, Roles be ignored by access control policies takes precedence over this setting.

    Users to be ignored by access control policies

    string

    No

    Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

    This setting supersedes any values specified by Users to set access control policies.

    Groups be ignored by access control policies

    string

    No

    Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

    This setting supersedes any values specified by Groups to set access control policies.

    Roles be ignored by access control policies

    string

    No

    Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.

    This setting supersedes any values specified by Roles to set access control policies.

    Prefix of redshift roles for portal users

    string

    priv_user_

    No

    Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.

    Prefix of redshift roles for portal groups

    string

    priv_group_

    No

    Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.

    Prefix of redshift roles for portal roles

    string

    priv_role_

    No

    Specifies the prefix that PolicySync uses when creating roles from Privacera in the Amazon Redshift data source.

    For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.

    Use redshift native public group for public group access policies

    boolean

    true

    No

    Specifies whether PolicySync uses the Amazon Redshift native public group for access grants whenever a policy refers to a public group. The default value is true.

    Set access control policies only on the users from managed groups

    boolean

    false

    No

    Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.

    Set access control policies only on the users/groups from managed roles

    boolean

    false

    No

    Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false.

    Enforce masking policies using secure views

    boolean

    true

    No

    Specifies whether to use secure view based masking. The default value is true.

    Enforce row filter policies using secure views

    boolean

    true

    No

    Specifies whether to use secure view based row filtering. The default value is true.

    While Amazon Redshift supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

    Create secure view for all tables/views

    boolean

    true

    No

    Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

    Default masked value for numeric datatype columns

    integer

    0

    No

    Specifies the default masking value for numeric column types.

    Default masked value for text/varchar datatype columns

    string

    <MASKED>

    No

    Specifies the default masking value for text and string column types.

    Secure view name prefix

    string

    No

    Specifies a prefix string for secure view names. By default, view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

    Secure view name postfix

    string

    _secure

    No

    Specifies a postfix string for secure view names. By default, view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

    Secure view schema name prefix

    string

    No

    Specifies a prefix string to apply to a secure view schema name. By default, view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view schema name for a schema named example1 is dev_example1.

    Secure view schema name postfix

    string

    No

    Specifies a postfix string to apply to a secure view schema name. By default, view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view schema name for a schema named example1 is example1_dev.

    Enable dataadmin

    boolean

    true

    No

    Specifies whether to enable the data admin feature. When enabled, you create policies on native tables and views, and the corresponding grants are made on the secure views of those tables and views. These secure views apply row filtering and masking. If you also need to grant a permission on the native table or view itself, select that permission together with data admin in the policy. The permission is then granted on both the native table or view and its secure view.

    Users to exclude when fetching access audits

    string

    REDSHIFT_JDBC_USERNAME

    No

    Specifies a comma-separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".

    Initial delay for access audit

    integer

    30

    No

    Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Amazon Redshift.



    Custom fields

    Table 22. Custom fields

    Canonical name

    Type

    Default

    Description

    load.resources

    string

    load_from_database_columns

    Specifies how PolicySync loads resources from Amazon Redshift. The following values are allowed:

    • load_md: Loads resources from Amazon Redshift top-down. That is, it first loads the databases, then the schemas, followed by the tables and their columns.

    • load_from_database_columns: Loads resources one resource type at a time. That is, it loads all databases first, then all schemas in all databases, followed by all tables in all schemas and their columns. This mode is recommended because it is faster than the load_md mode.

    sync.interval.sec

    integer

    60

    Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

    sync.serviceuser.interval.sec

    integer

    420

    Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

    sync.servicepolicy.interval.sec

    integer

    540

    Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

    audit.interval.sec

    integer

    30

    Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.

    ignore.table.list

    string

    Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    This setting supersedes any values specified by Tables to set access control policies.

    user.name.case.conversion

    string

    lower

    Specifies how user name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of user names is set to true.

    group.name.case.conversion

    string

    lower

    Specifies how group name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of group names is set to true.

    role.name.case.conversion

    string

    lower

    Specifies how role name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of role names is set to true.

    secure.view.name.remove.suffix.list

    string

    Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma-separated list of suffixes.

    secure.view.schema.name.remove.suffix.list

    string

    Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma-separated list of suffixes.

    perform.grant.updates.max.retry.attempts

    integer

    2

    Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.



  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.
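
As an illustration of how the secure view naming settings in the tables above combine, the following Python sketch derives a secure view name from a table name. It is not PolicySync code. The function names are hypothetical, and two behaviors are assumptions: that only the first matching suffix in the list is removed, and that a prefix and a postfix can be applied together. The only behavior taken from the descriptions is that suffix removal happens before any prefix or postfix is applied.

```python
from typing import Sequence


def strip_suffix(name: str, suffixes: Sequence[str]) -> str:
    """Remove the first listed suffix that the name ends with (assumed behavior)."""
    for suffix in suffixes:
        if suffix and name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def secure_name(
    name: str,
    prefix: str = "",
    postfix: str = "",
    remove_suffixes: Sequence[str] = (),
) -> str:
    """Strip a configured suffix first, then apply the prefix and postfix."""
    base = strip_suffix(name, remove_suffixes)
    return f"{prefix}{base}{postfix}"


# Examples taken from the setting descriptions.
print(secure_name("example1", prefix="dev_"))   # dev_example1
print(secure_name("example1", postfix="_dev"))  # example1_dev

# Suffix removal runs before the default _secure postfix is added.
print(secure_name("example_suffix", postfix="_secure", remove_suffixes=["_suffix"]))  # example_secure
```

The same function can be applied to schema names when reasoning about the schema name prefix and postfix settings.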

Enable Data Discovery

Click the toggle button to enable Data Discovery for your application.

  1. On the BASIC tab, enter values in the following fields.

    • JDBC URL

    • JDBC Username 

    • JDBC Password

  2. On the ADVANCED tab, you can add custom properties.

  3. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  4. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

Add Data Source

To add resources using this connection as Discovery targets, see Privacera Discovery scan targets.

Redshift Spectrum

This topic describes how to configure access control for Redshift Spectrum PolicySync using PrivaceraCloud.

Privacera supports access control for Redshift Spectrum only on the following:

  • Create Database

  • Usage Schema

Prerequisites

The following prerequisites must be met to use Redshift Spectrum:

  1. You need an Amazon Redshift cluster and a SQL client connected to the cluster.

  2. The Amazon Redshift cluster and the Amazon S3 bucket must be located in the same AWS Region.

  3. The Redshift application must be connected with PrivaceraCloud.

Getting started

Redshift Spectrum supports the creation of external tables within a Redshift cluster in four simple steps:

Major Security Concern

Redshift does not support access control lists (ACLs) on EXTERNAL TABLES. To give a user access to the data in EXTERNAL TABLES, you must grant USAGE permission on the EXTERNAL SCHEMA.

Limitations

The following are the limitations with Redshift Spectrum:

  • If the USAGE permission is granted to EXTERNAL SCHEMA, the user gains access to all of its tables.

  • Access to any of the external tables cannot be explicitly granted or revoked.

  • The creation of Redshift managed tables (not EXTERNAL TABLES) is not permitted within an "EXTERNAL SCHEMA".

  • The creation of secure views is not permitted within an EXTERNAL SCHEMA.

Privacera has never managed external tables due to the limitations listed above. By default, we manage permissions for external schemas at the schema level.

Row Level Filter and Column Masking based on secure views can be supported on an EXTERNAL SCHEMA, but only with the user's consent. The user also retains direct access to the EXTERNAL TABLE, and if they query the table's data directly, neither the Row Level Filter nor the Column Masking is applied.

Note

We do not recommend this solution. However, if you agree that users will not query the data directly (through external tables), it can be enabled by adding the REDSHIFT_ENABLE_EXTERNAL_SCHEMA_SUPPORT property, which defaults to false.

Proposed Solution

On an EXTERNAL TABLE, we support Row Level Filter and Column Masking to a limited extent.

  • Instead of creating a table, we create a secure view with the _secure postfix added to the schema name (as we cannot create Redshift views inside external schemas).

  • To grant access to the secure view, we must grant USAGE permission on the source schema, because the secure view schema is separate from the EXTERNAL SCHEMA. As a result, permission is also granted on the source (actual) table.

  • Only Select Permission to the EXTERNAL TABLE is supported. DataAdmin permission is ineffective because USAGE permission to EXTERNAL SCHEMA allows direct access to EXTERNAL TABLE.
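
The consequence of the points above can be sketched in Python. This is a rough illustration only: the function name, the schema, table, and grantee names are hypothetical, and the generated statements are standard Amazon Redshift GRANT syntax rather than the exact statements PolicySync issues. The sketch assumes the secure view keeps the table name and lives in a schema named with the _secure postfix.

```python
from typing import List


def external_table_grants(
    external_schema: str,
    table: str,
    grantee: str,
    schema_postfix: str = "_secure",
) -> List[str]:
    """Build illustrative grants for reading an external table through a secure view."""
    # The secure view cannot live inside the external schema, so it is
    # placed in a separate schema derived from the external schema name.
    secure_schema = f"{external_schema}{schema_postfix}"
    return [
        # USAGE on the source schema is required, and it also exposes
        # every external table in that schema directly.
        f"GRANT USAGE ON SCHEMA {external_schema} TO {grantee};",
        f"GRANT USAGE ON SCHEMA {secure_schema} TO {grantee};",
        # Only SELECT is supported on the secure view of an external table.
        f"GRANT SELECT ON {secure_schema}.{table} TO {grantee};",
    ]


for statement in external_table_grants("spectrum", "sales", "analyst1"):
    print(statement)
```

The first statement is the reason for the limitation: a user who holds USAGE on the external schema can bypass the secure view and read the external table without filtering or masking.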

Configuration

Note

Due to limitations, EXTERNAL SCHEMA support for Row Level Filter and Column Masking is not recommended.

Enable external schema

To enable the external schema, perform the following steps:

Note

Do not enable the Enable external schema toggle button until you have read this documentation and consent to the limitations it describes.

  1. Go to Settings > Applications.

  2. Select the Redshift application that is already linked to PrivaceraCloud.

  3. Click the Account Name or the edit button for the account on which you want to enable Redshift Spectrum.

  4. In the Access Management section, click the toggle button.

  5. In the ADVANCED tab, click the Enable external schema toggle button.

  6. In the Confirmation window, click YES, and then click SAVE.

Property Configuration

The values in the following fields must be left blank:

Secure view name prefix
Secure view name postfix 

The value of one of the following fields must be set:

Secure view schema name prefix
Secure view schema name postfix 
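
The two rules above can be checked before saving, as in the following Python sketch. It assumes the field values have been collected into a dictionary keyed by the field labels. That representation and the function name are hypothetical, not a PrivaceraCloud API.

```python
from typing import Dict, List

# Fields that must be blank when external schema support is enabled.
BLANK_FIELDS = ("Secure view name prefix", "Secure view name postfix")
# At least one of these fields must have a value.
SCHEMA_FIELDS = ("Secure view schema name prefix", "Secure view schema name postfix")


def check_external_schema_properties(props: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means both rules are satisfied."""
    problems = []
    for field in BLANK_FIELDS:
        if (props.get(field) or "").strip():
            problems.append(f"'{field}' must be left blank")
    if not any((props.get(field) or "").strip() for field in SCHEMA_FIELDS):
        problems.append(
            "Set 'Secure view schema name prefix' or 'Secure view schema name postfix'"
        )
    return problems


# The view name postfix still holds a value and no schema field is set.
print(check_external_schema_properties({"Secure view name postfix": "_secure"}))

# A valid combination produces no problems.
print(check_external_schema_properties({"Secure view schema name postfix": "_secure"}))
```

Because Secure view name postfix has a default value of _secure, it has to be cleared explicitly to satisfy the first rule.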

Kinesis

This topic describes how to connect a Kinesis application to PrivaceraCloud.

Connecting to an AWS hosted data source requires authentication or a Trust relation with those resources. You will provide this information as one step in the AWS Data resource connection. You will also need to specify your AWS Account Region.

Prerequisites in AWS console

The following prerequisites must be met:

  1. Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy in the AWS Console.

  2. Configure a Trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM Role.
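
For orientation, a trust relationship of this kind is expressed in AWS as a trust policy document attached to the IAM role. The Python sketch below builds such a document, optionally with the external ID condition that AWS supports for cross-account role assumption. The principal ARN and external ID shown are placeholders, and the function name is hypothetical. Use the values given in AWS Access Using IAM Trust Relationship rather than the ones in this sketch.

```python
import json
from typing import Optional


def build_trust_policy(trusted_principal_arn: str, external_id: Optional[str] = None) -> dict:
    """Build an IAM trust policy that lets one principal assume the role."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": trusted_principal_arn},
        "Action": "sts:AssumeRole",
    }
    if external_id:
        # The role can then be assumed only when this external ID is passed.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


# Placeholder values for illustration only.
policy = build_trust_policy(
    "arn:aws:iam::111122223333:root",
    external_id="example-external-id",
)
print(json.dumps(policy, indent=2))
```

The printed JSON has the shape that goes into the Trust relationships tab of the role in the AWS Console.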

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Kinesis.

  3. Enter the application Name and Description, and then click Save.

    You can see Privacera Access Management and Data Discovery with toggle buttons.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.

Enable Privacera Access Management
  1. Click the toggle button to enable Privacera Access Management for your application.

  2. On the BASIC tab, enter values in the following fields.

    • With Use IAM Role disabled:

      1. AWS Access Key: AWS data repository host account Access Key.

      2. AWS Secret Key: AWS data repository host account Secret Key

      3. AWS Region: AWS S3 bucket region.

    • With Use IAM Role enabled:

      1. AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.

      2. AWS IAM Role External Id: For additional security, an external ID can be attached to the IAM role you configured. This ensures that PrivaceraCloud can assume your IAM role only when the configured external ID is passed.

        Note

        The external ID is stored encrypted. It is never reflected back to the UI or otherwise made visible.

      3. AWS Region: AWS S3 bucket region.

  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  5. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

    Note

    You can only use one S3 setup per account for Privacera Access Management.

  6. Recommended: Install the AWS CLI.

    Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud S3 Data Server proxy.

  7. Recommended: Validate connectivity by running an AWS CLI command for S3, such as:

    aws s3 ls

Note

Dataserver also supports logging the requesting user's name in AWS CloudWatch Logs. For more information, see Add UserInfo in S3 Requests sent via Dataserver.

Enable Data Discovery
  1. Click the toggle button to enable Data Discovery for your application.

  2. On the BASIC tab, enter values in the following fields.

    • With Use IAM Role disabled:

      1. AWS Access Key: AWS data repository host account Access Key.

      2. AWS Secret Key: AWS data repository host account Secret Key

      3. AWS Region: AWS S3 bucket region.

    • With Use IAM Role enabled:

      1. AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.

      2. AWS Region: AWS S3 bucket region.

  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  5. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
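
Because the AWS IAM Role field expects a full ARN rather than a bare role name, a quick format check before pasting the value can catch mistakes. The Python sketch below is a rough sanity check only. The pattern is an approximation of the IAM role ARN layout (partition, 12-digit account ID, role path and name), not an official AWS validator, and the function name is hypothetical.

```python
import re

# Approximate shape: arn:<partition>:iam::<12-digit account>:role/<optional path/><name>
ROLE_ARN_PATTERN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+$",
    re.ASCII,
)


def looks_like_role_arn(value: str) -> bool:
    """Return True if the value has the general shape of an IAM role ARN."""
    return bool(ROLE_ARN_PATTERN.match(value.strip()))


print(looks_like_role_arn("arn:aws:iam::123456789012:role/privacera-access"))  # True
print(looks_like_role_arn("privacera-access"))                                 # False
print(looks_like_role_arn("arn:aws:iam::123456789012:user/bob"))               # False
```

A value that passes this check can still be rejected by TEST CONNECTION if the role does not exist or its trust relationship is not configured.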

Go to PrivaceraCloud > Privacera Discovery > Data Source to add resources using this connection as Discovery targets. See Privacera Discovery scan targets for quick start steps.

Snowflake

This topic describes how to connect the Snowflake application to PrivaceraCloud using the AWS and Azure platforms.

Prerequisites

Before connecting the Snowflake application to PrivaceraCloud, you must first manually create the Snowflake warehouse, database, users, and roles required by PolicySync.

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Snowflake.

  3. Select the platform type (AWS or Azure) on which you want to configure the Snowflake application.

  4. Enter the application Name and Description, and then click Save.

    You can see Privacera Access Management and Data Discovery with toggle buttons.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.

Enable Privacera Access Management
  1. Click the toggle button to enable Privacera Access Management for your application.

  2. On the BASIC tab, enter the values in the given fields and click Save. For property details and descriptions, see the tables below.

    Note

    The other properties are advanced and should be modified only in consultation with Privacera.

    Basic fields

    Table 23. Basic fields

    Field name

    Type

    Default

    Required

    Description

    Snowflake JDBC Url

    string

    Yes

    Specifies the JDBC URL for the Snowflake connector.

    Snowflake JDBC Username

    string

    Yes

    Specifies the JDBC username to use.

    Snowflake JDBC Password

    string

    Yes

    Specifies the JDBC password to use.

    Enable Use Key Pair Authentication

    boolean

    false

    Yes

    Specifies whether PolicySync uses key-pair authentication.

    Set this setting to true to enable key-pair authentication.

    Snowflake JDBC private key

    string

    No

    Specifies the contents of the private key file to use with Snowflake. For example:

    -----BEGIN ENCRYPTED PRIVATE KEY----- MIIE6TAbBgkqhkiG9w0BBQMwDgQILYPyCppzOwECAggABIIEyLiGSpeeGSe3xHP1wHLjfCYycUPennlX2bd8yX8xOxGSGfvB+99+PmSlex0FmY9ov1J8H1H9Y3lMWXbL... -----END ENCRYPTED PRIVATE KEY-----

    Snowflake JDBC private key password

    string

    No

    Specifies the password for the private key. If the private key does not have a password, do not specify this setting.

    Snowflake Warehouse To Use

    string

    Yes

    Specifies the JDBC warehouse that PolicySync establishes a connection to, which is used to run SQL queries.

    Snowflake Role To Use

    string

    Yes

    Specifies the role that PolicySync uses when it runs SQL queries.

    Snowflake Resource Owner

    string

    No

    Specifies the role that owns the resources managed by PolicySync. You must ensure that this role exists, because PolicySync does not create it.

    • If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.

    • If a value is specified, the owner of the resource will be changed to the specified value.

    The following resource types are supported:

    • Database

    • Schemas

    • Tables

    • Views

    Warehouses to set access control policies

    string

    No

    Specifies a comma-separated list of warehouse names for which PolicySync manages access control. If unset, access control is managed for all warehouses. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of warehouses might resemble the following:

    testdb1warehouse,testdb2warehouse,sales_dbwarehouse*
    
    

    Databases to set access control policies

    string

    No

    Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of databases might resemble the following: testdb1,testdb2,sales_db*.

    If specified, Databases to be ignored by access policy takes precedence over this setting.

    Default password for new snowflake user

    string

    Yes

    Specifies the password to use when PolicySync creates new users.

    Enable policy enforcements and user/group/role management

    boolean

    true

    No

    Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.

    Database name where masking function for column access control will be created

    string

    No

    Specifies the name of the database where PolicySync creates custom masking functions.

    Enable access audits

    boolean

    true

    Yes

    Specifies whether Privacera fetches access audit data from the data source.

    Enable simple audits

    boolean

    true

    No

    Specifies whether to enable simple auditing. When enabled, PolicySync gathers the following audit information from the database:

    • RequestData (query text)

    • AccessResult (execute status)

    • AccessType (query type)

    • User (username)

    • ResourcePath (database_name.schema_name)

    • EventTime (query time)

    • AclEnforcer (connector name)

    If you enable this setting, do not enable Enable advance audits.

    Enable advance audits

    boolean

    false

    No

    Specifies whether to enable advanced auditing. When enabled, PolicySync gathers the following audit information from the database:

    • AccessResult (execute status)

    • AccessType (query type)

    • User (username)

    • ResourcePath (database_name.schema_name.column_names)

    • EventTime (query time)

    • AclEnforcer (connector name)

    If you enable this setting, do not enable Enable simple audits.
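
The interaction between a "to set access control policies" list and its matching "to be ignored" list, such as Databases to set access control policies and Databases to be ignored by access policy, can be sketched in Python. This is a minimal illustration and not PolicySync code. The function names are hypothetical, and the shell-style wildcard matching provided by Python's fnmatch module is an assumption about how wildcards behave. The sketch follows three rules stated in the descriptions: names are case-sensitive, an unset list means everything is managed, and the ignore list takes precedence.

```python
from fnmatch import fnmatchcase
from typing import List


def parse_list(value: str) -> List[str]:
    """Split a comma-separated setting into trimmed, non-empty patterns."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_managed(name: str, include: str = "", ignore: str = "") -> bool:
    """Decide whether a resource name falls under access control."""
    # The ignore list takes precedence over the include list.
    if any(fnmatchcase(name, pattern) for pattern in parse_list(ignore)):
        return False
    include_patterns = parse_list(include)
    # An unset include list means every resource is managed.
    if not include_patterns:
        return True
    return any(fnmatchcase(name, pattern) for pattern in include_patterns)


print(is_managed("sales_db_eu", include="testdb1,testdb2,sales_db*"))    # True
print(is_managed("sales_db_eu", include="sales_db*", ignore="sales_db_eu"))  # False
print(is_managed("Sales_db", include="sales_db*"))                       # False, case-sensitive
print(is_managed("SNOWFLAKE", ignore="DEMO_DB,SNOWFLAKE,UTIL_DB,SNOWFLAKE_SAMPLE_DATA"))  # False
```

The same reasoning applies to the schema, table, user, group, and role lists, which use qualified names such as <DATABASE_NAME>.<SCHEMA_NAME> where applicable.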



    Advanced fields

    Table 24. Advanced fields

    Field name

    Type

    Default

    Required

    Description

    Schemas to set access control policies

    string

    No

    Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    Use the following format when specifying a schema:

    <DATABASE_NAME>.<SCHEMA_NAME>

    If specified, Schemas to be ignored by access policy takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all schemas are managed:

    <DATABASE_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all schemas.

    • If set to none, no schemas are managed.

    Tables to set access control policies

    string

    No

    Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    Use the following format when specifying a table:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    If specified, ignore.table.list takes precedence over this setting.

    If you specify a wildcard, such as in the following example, all matched tables are managed:

    <DATABASE_NAME>.<SCHEMA_NAME>.*

    The specified value, if any, is interpreted in the following ways:

    • If unset, access control is managed for all tables.

    • If set to none, no tables are managed.

    Stream to set access control policies

    string

    No

    Specifies a comma-separated list of stream names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of streams might resemble the following:

    testdb1.schema1.stream1,testdb2.schema2.stream*
    
    

    If unset, access control is managed for all streams.

    Functions to set access control policies

    string

    No

    Specifies a comma-separated list of function names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of functions might resemble the following:

    testdb1.schema1.fn1,testdb2.schema2.fn*
    
    

    If unset, access control is managed for all functions.

    Procedures to set access control policies

    string

    No

    Specifies a comma-separated list of procedure names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of procedures might resemble the following:

    testdb1.schema1.procedureA,testdb2.schema2.procedure*
    
    

    If unset, access control is managed for all procedures.

    Sequences to set access control policies

    string

    No

    Specifies a comma-separated list of sequence names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of sequences might resemble the following:

    testdb1.schema1.seq1,testdb2.schema2.seq*
    
    

    If unset, access control is managed for all sequences.

    FileFormat to set access control policies

    string

    No

    Specifies a comma-separated list of file format names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of file formats might resemble the following:

    testdb1.schema1.fileFmtA,testdb2.schema2.fileFmt*
    
    

    If unset, access control is managed for all file formats.

    Pipes to set access control policies

    string

    No

    Specifies a comma-separated list of pipe names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of pipes might resemble the following:

    testdb1.schema1.pipeA,testdb2.schema2.pipe*
    
    

    If unset, access control is managed for all pipes.

    ExternalStage to set access control policies

    string

    No

    Specifies a comma-separated list of external stage names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of external stages might resemble the following:

    testdb1.schema1.externalStage1,testdb2.schema2.extStage*
    
    

    If unset, access control is managed for all external stages.

    InternalStage to set access control policies

    string

    No

    Specifies a comma-separated list of internal stage names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    An example list of internal stages might resemble the following:

    testdb1.schema1.internalStage1,testdb2.schema2.intStage*
    
    

    If unset, access control is managed for all internal stages.

    Warehouses to be ignored by access policy

    string

    No

    Specifies a comma-separated list of warehouse names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all warehouses are subject to access control.

    This setting supersedes any values specified by Warehouses to set access control policies.

    Databases to be ignored by access policy

    string

    DEMO_DB,SNOWFLAKE,UTIL_DB,SNOWFLAKE_SAMPLE_DATA

    No

    Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.

    For example:

    testdb1,testdb2,sales_db*
    
    

    This setting supersedes any values specified by Databases to set access control policies.

    Schemas to be ignored by access policy

    string

    *.INFORMATION_SCHEMA

    No

    Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all schemas are subject to access control.

    For example:

    testdb1.schema1,testdb2.schema2,sales_db*.sales*
    
    

    This setting supersedes any values specified by Schemas to set access control policies.

    Create user in snowflake by policysync

    boolean

    true

    No

    Specifies whether PolicySync creates local users for each user in Privacera.

    Create user role in snowflake by policysync

    boolean

    true

    No

    Specifies whether PolicySync creates local roles for each user in Privacera.

    Enable use of email as login for snowflake

    boolean

    false

    No

    Specifies whether PolicySync uses the user email address as the login name when creating a new user in Snowflake.

    Prefix of snowflake roles for portal users

    string

    No

    Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.

    Prefix of snowflake roles for portal groups

    string

    No

    Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.

    Prefix of snowflake roles for portal roles

    string

    No

    Specifies the prefix that PolicySync uses when creating roles from Privacera in the Snowflake data source.

    For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.

    Manage users form portal

    boolean

    No

    Specifies whether PolicySync maintains user membership in roles in the Snowflake data source.

    Manage group form portal

    boolean

    No

    Specifies whether PolicySync creates groups from Privacera in the Snowflake data source.

    Manage role form portal

    boolean

    No

    Specifies whether PolicySync creates roles from Privacera in the Snowflake data source.

    Users to set access control policy

    string

    No

    Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

    If not specified, PolicySync manages access control for all users.

    If specified, Users to be ignored by access control policy takes precedence over this setting.

    An example user list might resemble the following: user1,user2,dev_user*.

    Groups to set access control policy

    string

    No

    Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of groups might resemble the following: group1,group2,dev_group*.

    If specified, Groups to be ignored by access control policy takes precedence over this setting.

    Roles to set access control policy

    string

    No

    Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards. Names are case-sensitive.

    An example list of roles might resemble the following: role1,role2,dev_role*.

    If specified, Roles to be ignored by access control policy takes precedence over this setting.

    Users to be ignored by access control policy

    string

    No

    Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

    This setting supersedes any values specified by Users to set access control policy.

    Groups to be ignored by access control policy

    string

    No

    Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

    This setting supersedes any values specified by Groups to set access control policy.

    Roles to be ignored by access control policy

    string

    No

    Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.

    This setting supersedes any values specified by Roles to set access control policy.

    Regex to find special characters in user names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression that PolicySync applies to user names. Each matching character is replaced with the value specified by the String to replace with the special characters found in user names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in user names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in group names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression that PolicySync applies to group names. Each matching character is replaced with the value specified by the String to replace with the special characters found in group names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in group names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.

    If not specified, no find and replace operation is performed.

    Regex to find special characters in role names

    string

    [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

    No

    Specifies a regular expression that PolicySync applies to role names. Each matching character is replaced with the value specified by the String to replace with the special characters found in role names setting.

    If not specified, no find and replace operation is performed.

    String to replace with the special characters found in role names

    string

    _

    No

    Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.

    If not specified, no find and replace operation is performed.

    Persist case sensitivity of user names

    boolean

    false

    No

    Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.

    Persist case sensitivity of group names

    boolean

    false

    No

    Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.

    Persist case sensitivity of role names

    boolean

    false

    No

    Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.

    Set access control policies only on the users from managed groups

    boolean

    false

    No

    Specifies whether to manage only the users that are members of groups specified by Groups to set access control policy. The default value is false.

    Set access control policies only on the users/groups from managed roles

    boolean

    false

    No

    Specifies whether to manage only users that are members of the roles specified by Roles to set access control policy. The default value is false.

    Enable Column Access Exception

    boolean

    true

    No

    Specifies whether an access denied exception is displayed if a user does not have access to a table column and attempts to access that column.

    If enabled, you must set Enforce Snowflake Native Masking to true.

    Enforce Snowflake Native Masking

    boolean

    true

    No

    Specifies whether PolicySync enables native masking policy creation functionality.

    Enforce Snowflake Native row filter

    boolean

    true

    No

    Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.

    Enforce row filter policies using secure views

    boolean

    false

    No

    Specifies whether to use secure view based row filtering. The default value is false.

    While Snowflake supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

    Enforce masking policies using secure views

    boolean

    false

    No

    Specifies whether to use secure view based masking. The default value is false.

    Secure view schema name prefix

    string

    No

    Specifies a prefix string to apply to a secure view schema name. By default, view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view schema name for a schema named example1 is dev_example1.

    Secure view schema name postfix

    string

    No

    Specifies a postfix string to apply to a secure view schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view schema name for a schema named example1 is example1_dev.

    Secure view name prefix

    string

    No

    Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

    Secure view name postfix

    string

    _SECURE

    No

    Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

    If you want to change the secure view name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

    Create secure view for all tables/views

    boolean

    false

    No

    Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

    Default masked value for numeric datatype columns

    integer

    0

    No

    Specifies the default masking value for numeric column types.

    Default masked value for text/varchar datatype columns

    string

    <MASKED>

    No

    Specifies the default masking value for text and string column types.
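The interaction between the "to set access control policy" lists and the "to be ignored by access control policy" lists described in the table above can be illustrated with a short Python sketch. This is not PolicySync code: the function names are hypothetical, and glob-style matching through `fnmatchcase` is an assumption standing in for whatever wildcard syntax PolicySync actually implements (`fnmatchcase` also treats `?` and `[...]` as special). The sketch only shows the documented rules: matching is case-sensitive, an empty managed list means everything is managed, and the ignore list takes precedence.

```python
from fnmatch import fnmatchcase


def parse_list(value):
    """Split a comma-separated setting into trimmed, non-empty patterns."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_managed(name, managed="", ignored=""):
    """Return True if `name` would be managed under the documented rules."""
    # The ignore list supersedes the managed list.
    if any(fnmatchcase(name, pattern) for pattern in parse_list(ignored)):
        return False
    managed_patterns = parse_list(managed)
    # An unset managed list means every principal is managed.
    if not managed_patterns:
        return True
    return any(fnmatchcase(name, pattern) for pattern in managed_patterns)


managed = "user1,user2,dev_user*"   # example list from the table
ignored = "dev_user_tmp*"           # hypothetical ignore list
for candidate in ["user1", "dev_user7", "dev_user_tmp1", "User1", "guest"]:
    print(candidate, is_managed(candidate, managed, ignored))
```

With these inputs, user1 and dev_user7 are managed, dev_user_tmp1 is excluded by the ignore list, and User1 and guest do not match any managed pattern.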



    Custom fields

    Table 25. Custom fields

    Canonical name

    Type

    Default

    Description

    jdbc.maximum.pool.size

    integer

    15

    Specifies the maximum size for the JDBC connection pool.

    jdbc.min.idle.connection

    integer

    3

    Specifies the minimum size of the JDBC connection pool.

    jdbc.leak.detection.threshold

    string

    900000L

    Specifies the duration in milliseconds that a connection is not part of the connection pool before PolicySync logs a possible connection leak message. If set to 0, leak detection is disabled.

    handle.pipe.ownership

    boolean

    false

    Specifies whether PolicySync changes the ownership of a pipe to the role specified by Snowflake Resource Owner.

    ignore.table.list

    string

    Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:

    <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
    
    

    This setting supersedes any values specified by Tables to set access control policies.

    ignore.stream.list

    string

    Specifies a comma-separated list of stream names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all streams are subject to access control.

    This setting supersedes any values specified by Stream to set access control policies.

    ignore.function.list

    string

    Specifies a comma-separated list of function names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all functions are subject to access control.

    This setting supersedes any values specified by Functions to set access control policies.

    ignore.procedure.list

    string

    Specifies a comma-separated list of procedure names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all procedures are subject to access control.

    This setting supersedes any values specified by Procedures to set access control policies.

    ignore.sequence.list

    string

    Specifies a comma-separated list of sequence names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all sequences are subject to access control.

    This setting supersedes any values specified by Sequences to set access control policies.

    ignore.file_format.list

    string

    Specifies a comma-separated list of file format names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all file formats are subject to access control.

    This setting supersedes any values specified by FileFormat to set access control policies.

    ignore.pipe.list

    string

    Specifies a comma-separated list of pipe names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all pipes are subject to access control.

    This setting supersedes any values specified by Pipes to set access control policies.

    ignore.external_stage.list

    string

    Specifies a comma-separated list of external stage names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all external stages are subject to access control.

    This setting supersedes any values specified by ExternalStage to set access control policies.

    ignore.internal_stage.list

    string

    Specifies a comma-separated list of internal stage names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all internal stages are subject to access control.

    This setting supersedes any values specified by InternalStage to set access control policies.

    user.name.case.conversion

    string

    lower

    Specifies how user name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of user names is set to true.

    group.name.case.conversion

    string

    lower

    Specifies how group name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of group names is set to true.

    role.name.case.conversion

    string

    lower

    Specifies how role name conversions are performed. The following options are valid:

    • lower: Convert to lowercase

    • upper: Convert to uppercase

    • none: Preserve case

    This setting applies only if Persist case sensitivity of role names is set to true.

    user.filter.with.email

    boolean

    false

    Set this property to true if you only want to manage users who have an email address associated with them in the portal.

    User.role.use.upper.case

    boolean

    false

    Specifies whether PolicySync converts a user role name to uppercase when performing operations.

    Group.role.use.upper.case

    boolean

    false

    Specifies whether PolicySync converts a group name to uppercase when performing operations.

    Role.role.use.upper.case

    boolean

    false

    Specifies whether PolicySync converts a role name to uppercase when performing operations.

    perform.grant.updates.batch

    string

    Specifies whether PolicySync applies grants and revokes in batches. If enabled, this behavior improves overall performance of applying permission changes.

    perform.grant.updates.max.retry.attempts

    integer

    2

    Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.

    enable.privileges.batching

    boolean

    false

    Specifies whether PolicySync applies privileges described in Access Manager policies.

    masking.policy.db.name

    string

    Specifies the name of the database where PolicySync creates custom masking policies.

    masking.policy.schema.name

    string

    PUBLIC

    Specifies the name of the schema where PolicySync creates all native masking policies. If not specified, the resource schema is used as the masking policy schema.

    masking.policy.name.template

    string

    {database}{separator}{schema}{separator}{table}

    Specifies a naming template that PolicySync uses when creating native masking policies. For example, given the following values:

    • {database}: customer_db

    • {schema}: customer_schema

    • {table}: customer_data

    • {separator}: _priv_

    With the default naming template, the following name is used when creating a native masking policy. The {column} field is replaced by the column name.

    customer_db_priv_customer_schema_priv_customer_data_{column}
    
    

    row.filter.policy.db.name

    string

    Specifies the name of the database where PolicySync creates native row-filter policies. If not specified, the resource database is considered the same as the row-filter policy database.

    row.filter.policy.schema.name

    string

    PUBLIC

    Specifies the name of the schema where PolicySync creates all native row-filter policies. If not specified, the resource schema is considered the same as the row-filter policy schema.

    row.filter.policy.name.template

    string

    {database}{separator}{schema}{separator}{table}

    Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table named data in a schema named schema that resides in a database named db, the row filter policy name might resemble the following:

    db_priv_schema_priv_data_<ROW_FILTER_ITEM_NUMBER>
    
    

    secure.view.schema.name.remove.suffix.list

    string

    Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma-separated list of suffixes.

    secure.view.name.remove.suffix.list

    string

    Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma-separated list of suffixes.

    secure.view.database.name.prefix

    string

    Specifies a prefix string for the database name of secure views. By default, view-based row filter and masking-related secure views have the same database name as the table database name.

    For example, if the prefix is priv_, then the secure view database name for a database named example1 is priv_example1.

    secure.view.database.name.postfix

    string

    Specifies a postfix string for the database name of secure views. By default, view-based row filter and masking-related secure views have the same database name as the table database name.

    For example, if the postfix is _sec, then the secure view database name for a database named example1 is example1_sec.

    secure.view.database.name.remove.suffix.list

    string

    Specifies a suffix to remove from a database name. For example, if the database is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

    You can specify a single suffix or a comma-separated list of suffixes.

    policy.name.separator

    string

    _PRIV_

    Specifies a string to use as part of the name of native row filter and masking policies.

    row.filter.alias.token

    string

    obj

    Specifies an identifier that PolicySync uses to identify columns from the main table and parse each correctly.

    masked.double.value

    integer

    0

    Specifies the default masking value for DOUBLE column types.

    masked.date.value

    string

    Specifies the default masking value for date column types.

    peg.functions.db.name

    string

    Specifies the name of the database where the PEG encryption functions reside.

    peg.functions.schema.name

    string

    public

    Specifies the schema name where the PEG encryption functions reside.

    load.roles

    string

    load_md

    Specifies the method that PolicySync uses to load roles from Snowflake. The following methods are supported:

    • load_md: Use metadata queries

    load.users

    string

    load_md

    Specifies how PolicySync loads users from Snowflake. The following values are valid:

    • load

    • load_db

    load.resources

    string

    load_md_from_account_columns

    Specifies how PolicySync loads resources from Snowflake. The following values are allowed:

    • load_md: Load the resources using metadata queries.

    • load_md_from_account_columns: Load resources by directly running SHOW QUERIES on the account. This mode is preferred when you want to manage an entire Snowflake account.

    • load_md_from_database_columns: Load the resources by directly running SHOW QUERIES only on managed databases. This mode is preferred when you want to manage only a few databases.

    load.policies

    string

    Specifies the method that PolicySync uses to load existing grants from Snowflake. The following methods are supported:

    • load_md: Use metadata queries

    load.audits

    string

    Specifies the method that PolicySync uses to load access audit information.

    The following values are valid:

    • load: Use SQL queries

    audit.enable.resource.filter

    boolean

    Specifies whether PolicySync filters access audit information by managed resources, such as databases, schemas, and so forth.

    audit.initial.pull.min

    string

    30

    Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Snowflake.

    custom.audit.db.name

    string

    PRIVACERA_ACCESS_LOGS_DB

    Specifies the database that PolicySync retrieves access audits from. This setting applies only if you set Enable advance audits to true.

    sync.interval.sec

    integer

    60

    Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

    sync.serviceuser.interval.sec

    integer

    420

    Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

    sync.servicepolicy.interval.sec

    integer

    60

    Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

    audit.interval.sec

    integer

    30

    Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.

    jdbc.application

    string

    Specifies the name of a partner application to connect to through JDBC. This setting is for Snowflake partner use only.
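The name templates in the table above (masking.policy.name.template and row.filter.policy.name.template) can be read as simple placeholder substitution. The Python sketch below expands the default template with the example values given for the masking policy. It uses Python's `str.format` as a stand-in for PolicySync's own template engine, which is an assumption, and the function name is hypothetical. How the trailing column name or row filter item number is appended is not specified by the template, so the sketch leaves it out.

```python
DEFAULT_TEMPLATE = "{database}{separator}{schema}{separator}{table}"


def expand_policy_name(template, database, schema, table, separator):
    """Substitute the documented placeholders into a policy name template."""
    values = {
        "database": database,
        "schema": schema,
        "table": table,
        "separator": separator,
    }
    return template.format(**values)


name = expand_policy_name(
    DEFAULT_TEMPLATE,
    database="customer_db",
    schema="customer_schema",
    table="customer_data",
    separator="_priv_",
)
print(name)  # customer_db_priv_customer_schema_priv_customer_data
```

The result matches the prefix of the example policy name shown in the table, before the column name is appended.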



  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.
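The secure view naming settings described in the tables above (name prefix, name postfix, and the remove-suffix lists) combine in a fixed order: suffix removal happens first, then the prefix and postfix are applied. The following Python sketch illustrates that order. The function names are hypothetical, and removing at most one matching suffix is an assumption; the documentation does not say how multiple matches are handled.

```python
def parse_suffixes(value):
    """Split a comma-separated suffix list into trimmed, non-empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def secure_name(name, prefix="", postfix="", remove_suffixes=""):
    """Derive a secure view (or schema, or database) name from a source name."""
    # Step 1: strip a configured suffix, before any prefix or postfix is applied.
    for suffix in parse_suffixes(remove_suffixes):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break  # assumption: only the first matching suffix is removed
    # Step 2: apply the custom prefix and postfix.
    return f"{prefix}{name}{postfix}"


print(secure_name("example1", prefix="dev_"))    # dev_example1
print(secure_name("example1", postfix="_dev"))   # example1_dev
print(secure_name("example_suffix", postfix="_SECURE", remove_suffixes="_suffix"))  # example_SECURE
```

The first two calls reproduce the prefix and postfix examples from the table; the third combines suffix removal with the default _SECURE postfix.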

Object permission mapping

For more information about object permission mapping, see Snowflake Documentation.

Supported permissions and their descriptions, by object:

Object: Global

  • CreateWarehouse: Enables creating a new virtual warehouse.

  • CreateDatabase: Enables creating a new database in the system.

Object: Warehouse

  • UseWarehouse: Enables using a virtual warehouse and, as a result, executing queries on the warehouse.

  • Operate: Enables changing the state of a warehouse (stop, start, suspend, resume).

  • Monitor: Enables viewing current and past queries executed on a warehouse as well as usage statistics on that warehouse.

  • Modify: Enables altering any properties of a warehouse, including changing its size.

Object: Database

  • UseDB: Enables using a database, including returning the database details in the SHOW DATABASES command output.

  • CreateSchema: Enables creating a new schema in a database, including cloning a schema.

Object: Schema

  • UseSchema: Enables using a schema, including returning the schema details in the SHOW SCHEMAS command output.

  • CreateTable: Enables creating a new table in a schema, including cloning a table.

  • CreateProcedure: Enables creating a new stored procedure in a schema.

  • CreateFunction: Enables creating a new UDF or external function in a schema.

  • CreateStream: Enables creating a new stream in a schema, including cloning a stream.

  • CreateSequence: Enables creating a new sequence in a schema, including cloning a sequence.

  • CreateFileFormat: Enables creating a new file format in a schema, including cloning a file format.

  • CreateStage: Enables creating a new stage in a schema, including cloning a stage.

  • CreatePipe: Enables creating a new pipe in a schema.

  • CreateExternalTable: Enables creating a new external table in a schema.

Object: Table

  • Select: Enables executing a SELECT statement on a table.

  • Insert: Enables executing an INSERT command on a table.

  • Update: Enables executing an UPDATE command on a table.

  • Delete: Enables executing a DELETE command on a table.

  • Truncate: Enables executing a TRUNCATE TABLE command on a table.

  • References: Enables referencing a table as the unique/primary key table for a foreign key constraint.

Object: View

  • Select: Enables executing a SELECT statement on a view.

Object: Procedure

  • Usage: Enables calling a stored procedure.

Object: Function

  • Usage: Enables calling a function.

Object: Stream

  • Select: Enables executing a SELECT statement on a stream.

Object: File_format

  • Usage: Enables using a file format in a SQL statement.

Object: Sequence

  • Usage: Enables using a sequence in a SQL statement.

Object: Internal_stage

  • Read: Enables performing any operations that require reading from an internal stage (GET, LIST, COPY INTO <table>).

  • Write: Enables performing any operations that require writing to an internal stage (PUT, REMOVE, COPY INTO <location>).

Object: External_stage

  • Usage: Enables using an external stage object in a SQL statement.

Object: Pipe

  • Operate: Enables viewing details for the pipe (using DESCRIBE PIPE or SHOW PIPES), pausing or resuming the pipe, and refreshing the pipe.

  • Monitor: Enables viewing details for the pipe (using DESCRIBE PIPE or SHOW PIPES).

Enable Data Discovery

Click the toggle button to enable Data Discovery for your application.

  1. On the BASIC tab, enter values in the following fields.

    • JDBC URL

    • JDBC Username 

    • JDBC Password

  2. On the ADVANCED tab, you can add custom properties.

  3. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  4. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

Add Data Source

To add resources using this connection as Privacera Discovery targets, see Discovery Scan Targets.

Starburst Enterprise with PrivaceraCloud

 

PrivaceraCloud can provide system-wide access control across all data exposed in Starburst Enterprise.

Both privacera_hive and privacera_starburstenterprise resource policies can be used to integrate with Starburst-managed sources as well as third-party sources (such as Databricks) to maintain policy consistency.

Note

A common implementation pattern for data sources not directly supported in Privacera is to use Starburst Enterprise as a point of access policy enforcement. Create a layer of views in Starburst Enterprise on top of the unsupported source, apply Privacera access control policies to those views, and then limit most access to the source outside of Starburst Enterprise.

Starburst Enterprise is often deployed using a pre-built Docker image provided by Starburst. Using a Docker image for testing and single-node deployment can be significantly faster than working with either RPM or tarball deployments. The instructions here describe the container-based deployment, but other environments are similar. The following information explains how to configure Starburst Enterprise with port 8443 for TLS/HTTPS so that username/password authentication is possible.

Note

PrivaceraCloud is a managed service, so there is currently no option for connecting to a secured Starburst Enterprise instance that uses self-signed SSL certificates. This is because self-signed certificates are not chained to a publicly trusted root certificate authority ("ca-root").

Prerequisites

The following items need to be enabled or shared prior to deploying a Starburst Docker image:

  • A licensed version of Starburst.

  • Docker-ce 18+ must be installed.

  • JDK 11 to generate the Java keystore.

  • JDBC URL to connect to the Starburst Enterprise instance, including catalog and schema. Unless you specify a catalog name, the JDBC connection is validated only to the host level.

  • CA-signed SSL certificate for production deployment.

  • Your PrivaceraCloud API Key.

Configure Privacera plug-in with Starburst Enterprise

Note

The Docker image already includes the Privacera plug-in needed for policy enforcement. Your tasks will be to create or update a number of configuration files in the container.

Summary of steps:

  • Generate access-control file(s) for Starburst (required) and for Hive catalogs (optional).

  • Generate a Ranger Audit XML file.

  • Generate a Ranger SSL XML file.

  • Generate a PrivaceraCloud JCEKS file.

Generate access-control files

To enable Privacera for authorization, you need to update the etc/config.properties file with one of the following entries:

    # privacera auth for hive and system access control
    access-control.config-files=/etc/starburst/access-control-privacera.properties,/etc/starburst/access-control-priv-hive.properties

Or

    # privacera auth for only system access control
    access-control.config-files=/etc/starburst/access-control-privacera.properties

Generate a Ranger Audit XML file

The example below depends on your individual PrivaceraCloud API Key, which you must insert in three places below.

etc/ranger-hive-audit.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>ranger.plugin.hive.service.name</name>
        <value>privacera_hive</value>
    </property>
    <property>
        <name>ranger.plugin.hive.policy.pollIntervalMs</name>
        <value>5000</value>
    </property>
    <property>
        <name>ranger.service.store.rest.url</name>
        <value>
            https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY>
        </value>
    </property>
    <property>
        <name>ranger.plugin.hive.policy.rest.url</name>
        <value>
            https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY>
        </value>
    </property>
    <property>
        <name>ranger.plugin.hive.policy.source.impl</name>
        <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
        <description>
            Class to retrieve policies from the source
        </description>
    </property>
    <property>
        <name>ranger.plugin.hive.policy.rest.ssl.config.file</name>
        <value>/etc/starburst/ranger-policymgr-ssl.xml</value>
        <description>
            Path to the file containing SSL details to contact Ranger Admin
        </description>
    </property>
    <property>
        <name>ranger.service.store.rest.ssl.config.file</name>
        <value>/etc/starburst/ranger-policymgr-ssl.xml</value>
    </property>
    <property>
        <name>ranger.plugin.hive.policy.cache.dir</name>
        <value>/etc/starburst/tmp/ranger</value>
        <description>
            Directory where Ranger policies are cached after successful retrieval from the source
        </description>
    </property>
    <property>
        <name>ranger.plugin.starburst-enterprise-presto.policy.cache.dir</name>
        <value>/etc/starburst/tmp/ranger</value>
        <description>
            Directory where Ranger policies are cached after successful retrieval from the source
        </description>
    </property>
    <property>
        <name>xasecure.audit.destination.solr</name>
        <value>true</value>
    </property>
    <property>
        <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
        <value>/etc/starburst/tmp/solr</value>
    </property>
    <property>
        <name>xasecure.audit.destination.solr.urls</name>
        <value>
            https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY>/solr/ranger_audits
        </value>
    </property>
    <property>
        <name>xasecure.audit.is.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>xasecure.audit.solr.is.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>xasecure.audit.solr.async.max.queue.size</name>
        <value>1</value>
    </property>
    <property>
        <name>xasecure.audit.solr.async.max.flush.interval.ms</name>
        <value>1000</value>
    </property>
</configuration>
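
Because the API URL and API key must be inserted in three places, it can help to fill the placeholders with a script rather than by hand. The following Python sketch is one possible approach, not part of the Privacera or Starburst tooling: the function names, the shortened template, and the sample host and key are all hypothetical. It replaces the two placeholders, confirms each appears exactly three times, and checks that the result is well-formed XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Shortened stand-in for etc/ranger-hive-audit.xml: only the three
# properties that carry the API URL and key are included here.
TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>ranger.service.store.rest.url</name>
        <value>https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY></value>
    </property>
    <property>
        <name>ranger.plugin.hive.policy.rest.url</name>
        <value>https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY></value>
    </property>
    <property>
        <name>xasecure.audit.destination.solr.urls</name>
        <value>https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY>/solr/ranger_audits</value>
    </property>
</configuration>
"""


def fill_placeholders(template, api_url, api_key, expected=3):
    """Replace both placeholders and verify the result parses as XML."""
    for placeholder in ("<YOUR_PRIVACERACLOUD_API_URL>", "<API_KEY>"):
        found = template.count(placeholder)
        if found != expected:
            raise ValueError(f"{placeholder}: expected {expected}, found {found}")
    filled = template.replace("<YOUR_PRIVACERACLOUD_API_URL>", escape(api_url))
    filled = filled.replace("<API_KEY>", escape(api_key))
    ET.fromstring(filled.encode("utf-8"))  # raises ParseError if malformed
    return filled


def read_properties(xml_text):
    """Return the name/value pairs of a Hadoop-style configuration file."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        prop.findtext("name"): prop.findtext("value").strip()
        for prop in root.findall("property")
    }


# Hypothetical values for illustration only.
filled = fill_placeholders(TEMPLATE, "api.example.invalid", "EXAMPLE_KEY")
for name, value in read_properties(filled).items():
    print(name, "=", value)
```

In practice, the template text would be read from your own copy of the file, and the filled output written to the path that the container mounts.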

To install this file into the Docker container, you can add an option to your container creation script:

-v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-hive-audit.xml:$STARBURST_TGT/ranger-hive-audit.xml

Generate a Ranger SSL XML file

The Ranger SSL XML file is needed when using PrivaceraCloud because PrivaceraCloud uses API keys, and the location of the JDK inside the Starburst containers might differ from other installations.

After the change from starburst-presto to starburst-trino releases, the JDK installation was updated by the Starburst engineering team.

Note

The <value> tags that follow should be verified periodically or based on best practices from Starburst engineering or partner teams.

etc/ranger-policymgr-ssl.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>xasecure.policymgr.clientssl.truststore</name>
        <value>/etc/pki/java/cacerts</value>
    </property>
    <property>
        <name>xasecure.policymgr.clientssl.truststore.password</name>
        <value>crypted</value>
    </property>
    <property>
        <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
        <value>jceks://file/etc/starburst/privaceracloud.jceks</value>
    </property>
</configuration>

To install this file into the Docker container, you can add an option to your container creation script:

-v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-policymgr-ssl.xml:$STARBURST_TGT/ranger-policymgr-ssl.xml

Generate a PrivaceraCloud JCEKS file

Edit etc/privaceracloud.jceks.

This file stores the encrypted password used to read the Java cacerts truststore inside the Starburst containers.

If you are generating a JCEKS with the default Java truststore password (“changeit”), you can use an existing Hive Metastore environment or a Hadoop distribution that is running Java 8 or newer.

Example CLI:

hadoop credential create sslTrustStore -value changeit -provider localjceks://file/var/tmp/privaceracloud.jceks

To install this file into the Docker container, you can add an option to your container creation script:

-v $DOCKER_HOME/$STARBURST_VERSION/etc/privaceracloud.jceks:$STARBURST_TGT/privaceracloud.jceks
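
The three -v options shown in this section can be collected in one place. The following is a minimal sketch, not part of the Privacera or Starburst tooling. The default values are assumptions for illustration only: /etc/starburst for $STARBURST_TGT is inferred from the JCEKS path used in ranger-policymgr-ssl.xml, and the image name is a placeholder. The JCEKS file created with the example CLI is written to /var/tmp, so it would need to be copied into the etc folder first.

```shell
# Assumed defaults; replace them with the values from your container creation script.
DOCKER_HOME="${DOCKER_HOME:-$HOME/docker}"
STARBURST_VERSION="${STARBURST_VERSION:-example-version}"
STARBURST_TGT="${STARBURST_TGT:-/etc/starburst}"

# Build one -v option per file described in this section.
MOUNT_OPTS=""
for f in ranger-hive-audit.xml ranger-policymgr-ssl.xml privaceracloud.jceks; do
  src="$DOCKER_HOME/$STARBURST_VERSION/etc/$f"
  # Warn early if a file has not been created yet.
  [ -f "$src" ] || echo "warning: $src does not exist yet" >&2
  MOUNT_OPTS="$MOUNT_OPTS -v $src:$STARBURST_TGT/$f"
done

# Print the resulting command instead of running it.
echo "docker run$MOUNT_OPTS <your-starburst-image>"
```

The sketch only prints the command, so you can review the mounts before adding them to your own script. It assumes the paths contain no spaces.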

Connect Starburst Enterprise application

Use the following steps to connect the Starburst Enterprise application to PrivaceraCloud for Privacera Access Management.

  1. Log in to PrivaceraCloud.

  2. Go to Settings > Applications.

  3. On the Applications screen, select Starburst Enterprise.

  4. Enter the application Name and Description, and then click Save.

  5. Click the toggle button to enable Privacera Access Management for Starburst Enterprise.

    You will see this message: Save the setting to start controlling access on Starburst Enterprise.

  6. Click Save.

Starburst Enterprise Presto

Starburst Enterprise platform (SEP) is a commercial distribution of PrestoSQL. It includes additional security features, more connectors, and a cost-based query optimizer not available in the open source version.

As with open source PrestoSQL, SEP is designed to support an external Apache Ranger. This can be configured in the following independent ways:

  1. System-level: Configure SEP so that resource policies defined in PrivaceraCloud under the privacera_starburstenterprisepresto resource service control access to Starburst resources.

  2. System-Plus-Hive: Configure SEP so that resource policies defined in PrivaceraCloud under both the privacera_starburstenterprisepresto and privacera_hive resource services control access to Starburst resources.

    This configuration requires two additional configuration files.

Create a SEP service user

Create a service-user identity that will be used to authenticate to your PrivaceraCloud account from SEP.

  1. Go to Access Manager > Users/Groups/Roles, and then create a user. Record the user name. This will be referred to as "${RANGER_API_USERNAME}" in the SEP configuration steps.

  2. Set the Role to Admin and record the password. This will be referred to as "${RANGER_API_PSWD}" in the SEP configuration steps.

Get the account-specific API URL
  1. Go to Settings > API Key, and then click GENERATE API KEY.

  2. In the Generate Api Key dialog, set the purpose to REST API Access or similar, and then select the Never Expires check box.

  3. Click the GENERATE API KEY button.

  4. Click the Copy Url button, and then click Close. Paste and store the URL value. This will be referred to as the variable "${RANGER_URL}" in the steps that follow.

The API Key page will display the added Api Key.

The Ranger Admin URL (${RANGER_URL}) will look similar to:

https://api.privaceracloud.com/api/13afxxxxxx6b981fxxxxxx2dc7cdd7xxxxxxa921636xxxxxx2d189d425b5f01

A full Ranger API service URI has the form:

<RangerAdminURL>/service/<Ranger API Resource Path>
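
As an illustration of this form, the following sketch composes a service URI from a Ranger Admin URL and a resource path. The helper name ranger_uri is hypothetical. The resource path public/v2/api/service is the Apache Ranger public API path for listing services, and its availability through your PrivaceraCloud account is an assumption. The curl call is left commented out because the authentication method it uses (basic authentication with the service user) is also an assumption.

```shell
# Hypothetical helper: join a Ranger Admin URL and an API resource path.
ranger_uri() {
  # $1 = Ranger Admin URL, $2 = Ranger API resource path
  base=${1%/}   # drop one trailing slash, if present
  path=${2#/}   # drop one leading slash, if present
  printf '%s/service/%s\n' "$base" "$path"
}

# Placeholder default; use the URL copied from the API Key page.
RANGER_URL="${RANGER_URL:-https://api.privaceracloud.com/api/YOUR_API_KEY}"
SERVICE_URI=$(ranger_uri "$RANGER_URL" "public/v2/api/service")
echo "$SERVICE_URI"

# Assumed usage, not confirmed by this guide:
# curl -s -u "${RANGER_API_USERNAME}:${RANGER_API_PSWD}" "$SERVICE_URI"
```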

Connect application

Use the following steps to connect the Starburst Enterprise Presto application to PrivaceraCloud for Privacera Access Management.

  1. Go to Settings > Applications.

  2. On the Applications screen, select Starburst Presto.

  3. Enter the application Name and Description, and then click Save.

  4. Click the toggle button to enable Privacera Access Management for Starburst Enterprise Presto.

    You will see this message: Save the setting to start controlling access on Starburst Enterprise Presto.

  5. Click Save.

The starburst-enterprise-presto service will be available in the Access Manager > Resource Policies section.

Configure Starburst Enterprise (SEP) to use your Account PrivaceraCloud Ranger
  1. SSH to the Hadoop cluster.

  2. Use the following commands to download the Starburst Presto 350-e.3 server archive with wget and extract it.

    mkdir downloads
    cd downloads
    wget https://s3.us-east-2.amazonaws.com/software.starburstdata.net/350e/350-e.3/starburst-presto-server-350-e.3.tar.gz -O presto-server.tar.gz
    tar zxvf presto-server.tar.gz
    mv presto-server-350-e.3 presto-server
    cd presto-server
  3. Create an etc folder to hold the configuration files that you create and edit in the following steps.

    mkdir etc
    cd etc/
  4. Create a JCEKS credential file that stores the truststore password used for SSL communication with Apache Ranger. The chmod command makes the ranger.jceks file readable by all users.

    hadoop credential create sslTrustStore -value changeit -provider localjceks://file/home/hadoop/downloads/presto-server/etc/ranger.jceks
    chmod a+r /home/hadoop/downloads/presto-server/etc/ranger.jceks
                                     
  5. Create a catalog directory to hold the hive.properties file, so that you can use Hive as a catalog for queries.

    mkdir catalog
  6. Change the default Java runtime on your cluster. By default it is set to Java 8; change it to Java 11.

    sudo update-alternatives --config java   # Select the entry for Java 11
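
After switching, you may want to confirm which Java version is now the default. The following is a minimal sketch, assuming that java -version prints a line containing version "..." (it writes to standard error). The helper name java_major is hypothetical.

```shell
# Hypothetical helper: print the major version from `java -version` output.
java_major() {
  # $1 = text printed by `java -version`
  v=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$v" in
    1.*) v=${v#1.} ;;   # legacy numbering: 1.8.0_292 means Java 8
  esac
  # Keep only the digits before the first '.', '_' or '-'.
  echo "${v%%[._-]*}"
}

if command -v java >/dev/null 2>&1; then
  major=$(java_major "$(java -version 2>&1)")
  if [ "$major" = "11" ]; then
    echo "Default Java is 11"
  else
    echo "Default Java is ${major:-unknown}; select Java 11 with update-alternatives"
  fi
else
  echo "java not found on PATH"
fi
```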

Now, in the folder etc, you can start configuring properties.

All the following files must be configured:

| File | Standard location | Use |
|---|---|---|
| hive.properties | etc/catalog | Global Hive properties |
| config.properties | etc | Points to plugin configuration files |
| access-control-privacera.properties | etc | Values for Privacera access control |
| ranger-policymgr-ssl.xml | etc | Values for Ranger Policy Manager |
| ranger-hive-audit.xml | etc | Values for Ranger Hive and Audit |
| access-control-priv-hive.properties | etc | Values for Hive Policies (used only for "System-Plus-Hive" configuration) |

  1. Create the hive.properties file.

    a. Use the following command to create hive.properties file in the ${PRESTO_CONFIG_PATH}/etc/catalog/ folder:

    vi hive.properties

    b. Add the following content and save this file.

    hive.metastore=glue
    hive.security=allow-all
  2. Create an access-control-privacera.properties file.

    a. Use the following command to create access-control-privacera.properties in the ${PRESTO_CONFIG_PATH}/etc/ folder:

    vi access-control-privacera.properties

    b. Add the following content, substituting your values for ${RANGER_URL}, ${RANGER_API_USERNAME}, and ${RANGER_API_PSWD} wherever they appear below. If the URL you copied already begins with https://, do not repeat the scheme.

    Also substitute values for ${PRESTO_CONFIG_PATH} and ${PRESTO_TEMP_DIRECTORY} that are correct for your environment.

    access-control.name=privacera-starburst
    ranger.policy-rest-url=https://${RANGER_URL}
    ranger.service-name=privacera_starburstenterprisepresto
    ranger.presto-plugin-username=${RANGER_API_USERNAME}
    ranger.presto-plugin-password=${RANGER_API_PSWD}
    ranger.policy-refresh-interval=3s
    # Example: ranger.config-resources=/usr/presto-server-341-e/etc/ranger-hive-audit.xml
    ranger.config-resources=${PRESTO_CONFIG_PATH}/etc/ranger-hive-audit.xml
    # Example: ranger.policy-cache-dir=/tmp/ranger
    ranger.policy-cache-dir=${PRESTO_TEMP_DIRECTORY}
    ranger.plugin-policy-ssl-config-file=${PRESTO_CONFIG_PATH}/etc/ranger-policymgr-ssl.xml

    c. Save this file.

  3. Create a ranger-policymgr-ssl.xml file.

    a. Use the following command to create ranger-policymgr-ssl.xml file in the ${PRESTO_CONFIG_PATH}/etc/ folder.

    vi ranger-policymgr-ssl.xml

    b. Add the following XML tags:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
        <property>
            <name>xasecure.policymgr.clientssl.truststore</name>
            <value>${JAVA_PATH}/lib/security/cacerts</value>
        </property>
        <property>
            <name>xasecure.policymgr.clientssl.truststore.password</name>
            <value>crypted</value>
        </property>
        <property>
            <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
            <value>jceks://file/home/hadoop/downloads/presto-server/etc/ranger.jceks</value>
        </property>
    </configuration>
    ```
  4. Create a ranger-hive-audit.xml file.

    a. Use the following command to create ranger-hive-audit.xml file in the ${PRESTO_CONFIG_PATH}/etc/ folder.

    vi ranger-hive-audit.xml

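    b. Add the following XML tags and substitute ${RANGER_URL} and ${RANGER_AUDIT_URL} where used. The audit URL follows the form shown in the Starburst Enterprise example earlier, ending in /solr/ranger_audits.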

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
        <configuration>
            <property>
                <name>ranger.plugin.hive.service.name</name>
                <value>privacera_hive</value>
            </property>
            <property>
                <name>ranger.plugin.hive.policy.pollIntervalMs</name>
                <value>5000</value>
            </property>
            <property>
                <name>ranger.service.store.rest.url</name>
                <value>
                    https://${RANGER_URL}
                </value>
            </property>
            <property>
                <name>ranger.plugin.hive.policy.rest.url</name>
                <value>
                    https://${RANGER_URL}
                </value>
            </property>
            <property>
                <name>ranger.plugin.hive.policy.source.impl</name>
                <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
                <description>
                    Class to retrieve policies from the source
                </description>
            </property>
            <property>
                <name>ranger.plugin.hive.policy.rest.ssl.config.file</name>
                <value>/home/hadoop/downloads/presto-server/etc/ranger-policymgr-ssl.xml</value>
                <description>
                    Path to the file containing SSL details to contact Ranger Admin
                </description>
            </property>
            <property>
                <name>ranger.service.store.rest.ssl.config.file</name>
                <value>/home/hadoop/downloads/presto-server/etc/ranger-policymgr-ssl.xml</value>
            </property>
            <property>
                <name>ranger.plugin.hive.policy.cache.dir</name>
                <value>/tmp/ranger</value>
                <description>
                    Directory where Ranger policies are cached after successful retrieval from the source
                </description>
            </property>
            <property>
                <name>ranger.plugin.starburst-enterprise-presto.policy.cache.dir</name>
                <value>/tmp/ranger</value>
                <description>
                    Directory where Ranger policies are cached after successful retrieval from the source
                </description>
            </property>
            <property>
                <name>xasecure.audit.destination.solr</name>
                <value>true</value>
            </property>
            <property>
                <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
                <value>${PRESTO_TEMP_DIRECTORY}</value>
            </property>
            <property>
                <name>xasecure.audit.destination.solr.urls</name>
                <value>
                    https://${RANGER_AUDIT_URL}
                </value>
            </property>
            <property>
                <name>xasecure.audit.is.enabled</name>
                <value>true</value>
            </property>
            <property>
                <name>xasecure.audit.solr.is.enabled</name>
                <value>true</value>
            </property>
            <property>
                <name>xasecure.audit.solr.async.max.queue.size</name>
                <value>1</value>
            </property>
            <property>
                <name>xasecure.audit.solr.async.max.flush.interval.ms</name>
                <value>1000</value>
            </property>
        </configuration>
    ```
  5. Create an access-control-priv-hive.properties file.

    a. Use the following command to create access-control-priv-hive.properties in the ${PRESTO_CONFIG_PATH}/etc/ folder:

    vi access-control-priv-hive.properties

    b. Add the following content, substituting your values for ${RANGER_URL}, ${RANGER_API_USERNAME}, ${RANGER_API_PSWD}, ${PRESTO_CONFIG_PATH}, and ${PRESTO_TEMP_DIRECTORY} wherever they appear below.

    access-control.name=privacera
    ranger.policy-rest-url=https://${RANGER_URL}
    ranger.service-name=privacera_hive
    privacera.catalogs=hive
    ranger.presto-plugin-username=${RANGER_API_USERNAME}
    ranger.presto-plugin-password=${RANGER_API_PSWD}
    ranger.policy-refresh-interval=3s
    # Example: ranger.config-resources=/usr/presto-server-341-e/etc/ranger-hive-audit.xml
    ranger.config-resources=${PRESTO_CONFIG_PATH}/etc/ranger-hive-audit.xml
    # Example: ranger.policy-cache-dir=/tmp/ranger
    ranger.policy-cache-dir=${PRESTO_TEMP_DIRECTORY}
    # Fallback allow-all allows privacera_starburst catalog-level permissions as fallback
    privacera.fallback-access-control=allow-all
    ranger.plugin-policy-ssl-config-file=${PRESTO_CONFIG_PATH}/etc/ranger-policymgr-ssl.xml
    ranger.enable-row-filtering=true

    If you are configuring for System-Level only, do not create this file. The System-Level configuration is already complete in the access-control-privacera.properties file.

  6. Create a config.properties file.

    a. Use the following command to create config.properties file in the ${PRESTO_CONFIG_PATH}/etc/ folder:

    vi config.properties

    If configuring for System-Level, add the following to this file:

    access-control.config-files=etc/access-control-privacera.properties

    If configuring for System-Plus-Hive, add the following to this file (note that this is a single line):

    access-control.config-files=etc/access-control-privacera.properties,etc/access-control-priv-hive.properties
  7. Restart Starburst.
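
Before restarting, it can help to confirm that every file in the table above exists and that no ${...} placeholder was left unsubstituted. The following is a minimal sketch, not a Privacera tool. The function name check_sep_config and the mode names system and system-plus-hive are hypothetical labels for the two configurations described in this topic.

```shell
# Hypothetical helper: check the SEP etc folder for missing files
# and for placeholders such as ${RANGER_URL} that were not replaced.
check_sep_config() {
  etc_dir=$1             # path to the etc folder
  mode=${2:-system}      # "system" or "system-plus-hive"
  files="catalog/hive.properties config.properties access-control-privacera.properties ranger-policymgr-ssl.xml ranger-hive-audit.xml"
  if [ "$mode" = "system-plus-hive" ]; then
    files="$files access-control-priv-hive.properties"
  fi
  status=0
  for f in $files; do
    if [ ! -f "$etc_dir/$f" ]; then
      echo "missing: $etc_dir/$f"
      status=1
    elif grep -qF '${' "$etc_dir/$f"; then
      echo "unsubstituted placeholder in: $etc_dir/$f"
      status=1
    fi
  done
  return $status
}

# Example call; PRESTO_CONFIG_PATH defaults to the current directory here.
check_sep_config "${PRESTO_CONFIG_PATH:-.}/etc" "${SEP_MODE:-system}" \
  || echo "Fix the items listed above, then restart Starburst."
```

The check is deliberately simple: it only looks for the literal characters ${ and does not validate the XML or the property values.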

Trino

This topic describes how to connect the Trino application, obtain account-specific scripts from your PrivaceraCloud account, and configure the Trino plug-in.

Connect application
  1. Go to Settings > Applications.

  2. On the Applications screen, select Trino.

  3. Enter the application Name and Description, and then click Save.

    Privacera Access Management and Data Discovery appear, each with a toggle button.

    Note

    If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.

Enable Privacera Access Management

You only need to enable Privacera Access Management to start controlling access on Trino.

  1. Click the toggle button to enable Privacera Access Management for your application.

    You will see this message: Save the setting to start controlling access on Trino.

  2. Click Save.

Enable Data Discovery
  1. Click the toggle button to enable Data Discovery for your application.

  2. On the BASIC tab, enter values in the following fields.

    • JDBC URL - jdbc:trino://<host>:<port>/<catalog>

      The following three databases can be added as catalogs on the Trino server:

      • MySQL

      • Oracle

      • PostgreSQL

    • JDBC Username 

    • JDBC Password

  3. On the ADVANCED tab, you can add custom properties.

  4. Using the IMPORT PROPERTIES button, you can browse and import application properties.

  5. Click the TEST CONNECTION button to check if the connection is successful, and then click Save.

    To add resources using this connection as Discovery targets, see Privacera Discovery scan targets.
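
The JDBC URL in the BASIC tab follows a fixed pattern, which the sketch below fills in. The helper name trino_jdbc_url, the host name, and the catalog name are hypothetical, and the port shown is only an example; use the values for your own Trino server.

```shell
# Hypothetical helper: build a Trino JDBC URL from host, port, and catalog.
trino_jdbc_url() {
  # $1 = host, $2 = port, $3 = catalog
  printf 'jdbc:trino://%s:%s/%s\n' "$1" "$2" "$3"
}

# Example values only.
trino_jdbc_url "trino.example.internal" 8080 "postgresql"
```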

Deploy Privacera plug-in in Trino
Obtain installation script

Obtain the <privacera-plugin-script-download-url> that is unique to your account. You run this script, along with other commands, in the command shell on your Trino server to complete the PrivaceraCloud installation.

Steps:

  1. Go to Settings > API Key.

  2. Use an existing Active API Key or generate a new one.

  3. Click the info icon (i). The Api Key Info page appears.

  4. In the Plugins Setup Script section, click the COPY URL button. Save this value on your Trino server. It is needed as the <privacera-plugin-script-download-url> in the next step.

Configure plug-in
  1. In the command shell on your Trino server, run the following command to set the plug-in type:

    export PLUGIN_TYPE="trino"
  2. Set the Trino home folder, and then download the plug-in script.

    export TRINO_HOME_FOLDER="/opt/privacera/trino-server"
    # Save the download as privacera_plugin.sh
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
  3. Change to the directory where you saved privacera_plugin.sh, make the script executable, and then run it.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh

    This completes the installation.
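
The steps above can be combined into one script with two basic checks before anything is downloaded. This is a minimal sketch under stated assumptions: the variable PLUGIN_URL and the function preflight are hypothetical names, and PLUGIN_URL must hold your <privacera-plugin-script-download-url>.

```shell
# Hypothetical variable: set it to your <privacera-plugin-script-download-url>.
PLUGIN_URL="${PLUGIN_URL:-}"

export PLUGIN_TYPE="trino"
export TRINO_HOME_FOLDER="${TRINO_HOME_FOLDER:-/opt/privacera/trino-server}"

# Hypothetical helper: fail early if the URL or the Trino home folder is missing.
preflight() {
  # $1 = plug-in script URL, $2 = Trino home folder
  if [ -z "$1" ]; then
    echo "PLUGIN_URL is not set"
    return 1
  fi
  if [ ! -d "$2" ]; then
    echo "Trino home folder not found: $2"
    return 1
  fi
  return 0
}

# Download and run the script only when both checks pass.
if preflight "$PLUGIN_URL" "$TRINO_HOME_FOLDER"; then
  wget "$PLUGIN_URL" -O privacera_plugin.sh
  chmod +x privacera_plugin.sh
  ./privacera_plugin.sh
fi
```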

Validate Installation

In PrivaceraCloud, open Access Manager > Audit, and click the PLUGIN tab. Look for audit items reporting the Plugin Id for Trino and the status "Policies synced to plugin". This indicates that your Trino resource is connected.