- PrivaceraCloud Release 4.5
- PrivaceraCloud User Guide
- PrivaceraCloud
- What is PrivaceraCloud?
- Getting Started with Privacera Cloud
- User Interface
- Dashboard
- Access Manager
- Discovery
- Usage statistics
- Encryption and Masking
- Privacera Encryption core ideas and terminology
- Encryption Schemes
- System Encryption Schemes Enabled by Default
- View Encryption Schemes
- Formats, Algorithms, and Scopes
- Record the Names of Schemes in Use and Do Not Delete Them
- Encryption Schemes
- Presentation Schemes
- Masking schemes
- Create scheme policies on PrivaceraCloud
- Encryption formats, algorithms, and scopes
- Deprecated encryption formats, algorithms, and scopes
- PEG REST API on PrivaceraCloud
- PEG API Endpoint
- Request Summary for PrivaceraCloud
- Prerequisites
- Anatomy of a PEG API endpoint on PrivaceraCloud
- About constructing the datalist for /protect
- About deconstructing the response from /unprotect
- Example of data transformation with /unprotect and presentation scheme
- Example PEG REST API endpoints for PrivaceraCloud
- Audit details for PEG REST API accesses
- Make calls on behalf of another user on PrivaceraCloud
- Privacera Encryption UDF for masking in Databricks
- Privacera Encryption UDFs for Trino
- Syntax of Privacera Encryption UDFs for Trino
- Prerequisites for installing Privacera Crypto plug-in for Trino
- Variable values to obtain from Privacera
- Determine required paths to crypto jar and crypto.properties
- Download Privacera Crypto Jar
- Set variables in Trino etc/crypto.properties
- Restart Trino to register the Privacera Crypto UDFs for Trino
- Example queries to verify Privacera-supplied UDFs
- Azure AD setup
- Launch Pad
- Settings
- General functions in PrivaceraCloud settings
- Applications
- About applications
- Azure Data Lake Storage Gen 2 (ADLS)
- Athena
- Privacera Discovery with Cassandra
- Databricks
- Databricks SQL
- Dremio
- DynamoDB
- Elastic MapReduce from Amazon
- EMRFS S3
- Files
- File Explorer for Google Cloud Storage
- Glue
- Google BigQuery
- Kinesis
- Lambda
- Microsoft SQL Server
- MySQL for Discovery
- Open Source Spark
- Oracle for Discovery
- PostgreSQL
- Power BI
- Presto
- Redshift
- Redshift Spectrum
- Snowflake
- Starburst Enterprise with PrivaceraCloud
- Starburst Enterprise Presto
- Trino
- Datasource
- User Management
- API Key
- About Account
- Statistics
- Help
- Apache Ranger API
- Reference
- Okta Setup for SAML-SSO
- Azure AD setup
- SCIM Server User-Provisioning
- AWS Access with IAM
- Access AWS S3 buckets from multiple AWS accounts
- Add UserInfo in S3 Requests sent via Dataserver
- EMR Native Ranger Integration with PrivaceraCloud
- Spark Properties
- Operational Status
- How-to
- Create CloudFormation Stack
- Enable Real-time Scanning of S3 Buckets
- Enable Discovery Realtime Scanning Using IAM Role
- How to configure multiple JSON Web Tokens (JWTs) for EMR
- Enable offline scanning on Azure Data Lake Storage Gen 2 (ADLS)
- Enable Real-time Scanning on Azure Data Lake Storage Gen 2 (ADLS)
- How to Get Support
- Coordinated Vulnerability Disclosure (CVD) Program of Privacera
- Shared Security Model
- PrivaceraCloud
- PrivaceraCloud Previews
- Privacera documentation changelog
Applications
About applications
This section describes how to connect, edit, and delete applications.
Terminology
A datasource is a collection of data stored in a third-party application such as Microsoft SQL, AWS S3, or Databricks. PrivaceraCloud integrates with your datasource to control access or scan for sensitive data.
Datasources are organized into applications.
An application is a configuration for a data resource or authentication resource to be linked to your PrivaceraCloud account.
You provide target- and type-specific properties for the resource, such as its location (URL) and the authentication credentials for that resource.
For some applications, you can add custom properties using a key/value pair syntax.
The properties can be exported to a JSON properties file. This file can then be reimported at a later time or can be used as a template for other applications.
An authentication resource can be a connection to a directory service for data access users or for portal users.
Note
You can only use one dataserver setup per account for Privacera Access Management.
Connect an application
Go to Settings > Applications.
In the Applications section, select the application you wish to connect.
Enter the application Name and Description.
Click SAVE to save the changes or CANCEL to discard them.
View connection status
To view the status of the connection to an application:
Go to Settings > Applications.
Click the name of the previously connected application.
Look under the Access Management column:
Red: There is a problem with the connection.
Green: The connection completed successfully.
Edit application name and description
Go to Settings > Applications.
Select the application you wish to edit.
Under the Action column, click the pen icon.
Change the Name or Description.
Click SAVE to save the changes or CANCEL to discard them.
Delete application
Go to Settings > Applications.
Select the application you wish to delete.
Under the Action column, click the trash can icon.
Carefully read the warning in the popup.
To verify that you want to delete the application, type delete in the text box.
Click DELETE to delete the application or CANCEL to discard the changes.
Azure Data Lake Storage Gen 2 (ADLS)
This topic describes how to connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud.
Prerequisites
Before connecting the Azure Data Lake Storage Gen 2 (ADLS) application, make sure you have the following information available:
Azure Data Lake Storage Gen 2 (ADLS) Storage Account ID
Azure Data Lake Storage Gen 2 (ADLS) Account Storage Key
Note
You can only use one Azure Data Lake Storage Gen 2 (ADLS) setup per PrivaceraCloud account for Privacera Access Management.
Connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud
To connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud:
Go to Settings > Applications.
In the Applications screen, select Azure Data Lake Storage Gen 2 (ADLS).
Enter the application Name and Description, and then click Save.
Click the toggle to enable Access Management for Azure Data Lake Storage Gen 2 (ADLS).
On the BASIC tab, enter the values in the following fields:
Azure Data Lake Storage Gen 2 (ADLS) Storage Account
Azure Data Lake Storage Gen 2 (ADLS) Storage Key
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
After the service is established, you can configure your local Azure CLI to redirect requests to the PrivaceraCloud Azure ADLS Data Server proxy. For more information, see Scripts for AWS CLI or Azure CLI for managing connected applications.
Athena
This topic describes how to connect Athena to PrivaceraCloud.
Prerequisites in AWS console
Before connecting Athena to PrivaceraCloud for Privacera Access Management, make sure that you use only one Privacera dataserver setup per account.
In your AWS console:
Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy.
Configure a trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM Role.
Save the ARN, which you need to set in PrivaceraCloud in the following steps.
To verify the Athena connection, Privacera recommends that you install the AWS CLI. Install and configure the AWS CLI on your system so that it uses the PrivaceraCloud S3 Data Server proxy.
Connect Athena with IAM role and trust relationship
Go to Settings > Applications.
Select Athena.
Enter the application Name and Description.
Click Save.
Click the toggle to enable Access Management for the application.
On the BASIC tab, enter values in the following fields.
With Use IAM Role disabled:
AWS Access Key: AWS data repository host account Access Key
AWS Account Secret Key: AWS data repository host account Secret Key
AWS_ATHENA_RESULT_STORAGE_URL: Query results storage bucket URL
Click Save.
With Use IAM Role enabled, enter values for the following fields:
AWS IAM Role
AWS IAM Role External Id
AWS_ATHENA_RESULT_STORAGE_URL: Query results storage bucket URL
Click Save.
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Recommended: Validate connectivity by running the AWS CLI for Athena queries such as the following:
aws athena start-query-execution --query-string "SHOW DATABASES"
Privacera Discovery with Cassandra
This topic describes how to connect Cassandra to the PrivaceraCloud Discovery service.
Prerequisites
Before connecting the Cassandra application, make sure you have the following information available:
JDBC URL
JDBC Username
JDBC Password
Connect application
Go to Settings > Applications.
Select Cassandra.
Enter the application Name and Description.
Click Save.
Click the toggle button to enable Data Discovery for Cassandra.
In the BASIC tab, enter the values in the following fields:
JDBC URL
JDBC Username
JDBC Password
On the ADVANCED tab, you can add custom properties.
Click IMPORT PROPERTIES to browse and import application properties.
Click TEST CONNECTION to check if the connection is successful.
Click Save.
Define scan targets
To define Privacera Discovery scan targets for this application, see Privacera Discovery scan targets.
Databricks
This topic describes how to connect the Databricks application to PrivaceraCloud on the AWS and Azure platforms. Privacera provides the Spark Fine-Grained Access Control [FGAC] and Spark Object-Level Access Control [OLAC] plug-ins for access control in Databricks clusters. The two plug-ins are mutually exclusive and cannot be enabled on the same cluster.
Go to Settings > Applications.
In the Applications screen, select Databricks.
Select the platform type (AWS or Azure) on which you want to configure the Databricks application.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable Access Management for Databricks.
Databricks Spark Fine-Grained Access Control plug-in [FGAC]
PrivaceraCloud integrates with Databricks using the plug-in integration method with an account-specific, cluster-scoped initialization script. Privacera's Spark plug-in is installed on the Databricks cluster, enabling Fine-Grained Access Control. The script is added to your cluster as an init script that runs at cluster startup. When the cluster restarts, it runs the init script and connects to PrivaceraCloud.
Note
Accounts upgrading from PrivaceraCloud 2.0 to PrivaceraCloud 2.1 and intending to use Privacera Encryption with Databricks must re-install the init script to Databricks.
Ensure that the following prerequisites are met:
You must have an existing Databricks account and login credentials with sufficient privileges to manage your Databricks cluster.
PrivaceraCloud portal admin user access.
This setup is recommended for SQL, Python, and R language notebooks.
It provides FGAC on databases with row filtering and column masking features.
It uses privacera_hive, privacera_s3, privacera_adls, privacera_files services for resource-based access control, and privacera_tag service for tag-based access control.
It uses the plugin implementation from Privacera.
Log in to the PrivaceraCloud portal as an admin user (role ROLE_ACCOUNT_ADMIN).
Generate the new API key and Init Script. For more information, see API Key.
In the Databricks Init Script section, click DOWNLOAD SCRIPT.
By default, this script is named privacera_databricks.sh. Save it to a local filesystem or shared storage.
Log in to your Databricks account using credentials with sufficient account management privileges.
Copy the Init script to your Databricks cluster. This can be done via the UI or using the Databricks CLI.
Using the Databricks UI:
On the left navigation, click the Data icon.
Click the Add Data button from the upper right corner.
In the Create New Table dialog, select Upload File, and then click browse.
Select privacera_databricks.sh, and then click Open to upload it.
Once the file is uploaded, the dialog displays the uploaded file path. This file path is required in a later step.
The file is uploaded to the /FileStore/tables/privacera_databricks.sh path, or similar.
Using the Databricks CLI, copy the script to a location in DBFS:
databricks fs cp ~/<sourcepath_privacera_databricks.sh> dbfs:/<destination_path>
For example:
databricks fs cp ~/Downloads/privacera_databricks.sh dbfs:/FileStore/tables/
You can add PrivaceraCloud to an existing cluster, or create a new cluster and attach PrivaceraCloud to that cluster.
a. In the Databricks navigation panel select Clusters.
b. Choose a cluster name from the list provided and click Edit to open the configuration dialog page.
c. Open Advanced Options and select the Init Scripts tab.
d. Enter the DBFS init script path name you copied earlier.
e. Click Add.
f. From Advanced Options, select the Spark tab. Add the following Spark configuration content to the Spark Config edit window. For more information on the properties, see Spark Configuration Table Properties.
New Properties:
spark.databricks.isv.product privacera
spark.databricks.cluster.profile serverless
spark.databricks.delta.formatCheck.enabled false
spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
spark.databricks.repl.allowedLanguages sql,python,r
Old Properties:
spark.databricks.isv.product privacera
spark.databricks.cluster.profile serverless
spark.databricks.delta.formatCheck.enabled false
spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
spark.databricks.repl.allowedLanguages sql,python,r
Note
From PrivaceraCloud release 4.1.0.1 onward, it is recommended to replace the Old Properties with the New Properties. However, the Old Properties will also continue to work.
For Databricks versions <= 8.2, only the Old Properties should be used, since those versions are in extended support.
If you are upgrading the Databricks Runtime from an existing version (6.4-8.2) to version 8.3 or higher, contact your Privacera technical sales representative for assistance.
Restart the Databricks cluster.
Notice
To enable View Level Access Control, View Level Column Masking, and View Level Row Filtering, refer to ???. By default, these features are disabled.
Confirm connectivity by executing a simple data access sequence and then examining the PrivaceraCloud audit stream.
You will see corresponding events in the Access Manager > Audits.
Example data access sequence:
Create or open an existing Notebook. Associate the Notebook with the Databricks cluster you secured in the steps above.
Run an SQL show tables command in the Notebook:
show tables;
On PrivaceraCloud, go to Access Manager > Audits to view the monitored data access.
Create a Deny policy, run this same SQL access sequence a second time, and confirm corresponding Denied events.
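For instance, a minimal verification sequence in a notebook cell might look like the following sketch. The database and table names are placeholders; substitute objects that exist in your workspace and that your Deny policy targets.
-- List tables visible to the current user; this generates audit events in PrivaceraCloud
show tables;
-- Query a table covered by your policies; after the Deny policy is added for this user,
-- re-running this statement should produce corresponding Denied audit events
select * from <DB_NAME>.<TABLE_NAME> limit 10;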
Databricks Spark Object-Level Access Control plug-in [OLAC]
This section outlines the steps needed to setup Object-Level Access Control (OLAC) in Databricks clusters. This setup is recommended for Scala language notebooks.
It provides OLAC on S3 locations accessed via Spark.
It uses privacera_s3 service for resource-based access control and privacera_tag service for tag-based access control.
It uses the signed-authorization implementation from Privacera.
Note
If you are using SQL, Python, and R language notebooks, the recommendation is to use FGAC. See the Databricks Spark Fine-Grained Access Control plug-in [FGAC] section above.
OLAC and FGAC methods are mutually exclusive and cannot be enabled on the same cluster.
The OLAC plug-in was introduced to provide an alternative solution for Scala language clusters, since using the Scala language on Databricks Spark has some security concerns.
Ensure that the following prerequisites are met:
You must have an existing Databricks account and login credentials with sufficient privileges to manage your Databricks cluster.
PrivaceraCloud portal admin user access.
Note
For working with Delta format files, configure the AWS S3 application using IAM role permissions.
Create a new AWS S3 Databricks connection. For more information, see Create S3 application.
After creating the S3 application:
In the BASIC tab, provide Access Key, Secret Key, or an IAM Role. For more information, see Create S3 application.
In the ADVANCED tab, add the following property:
dataserver.databricks.allowed.urls=<DATABRICKS_URL_LIST>
where <DATABRICKS_URL_LIST> is a comma-separated list of the target Databricks cluster URLs. For example:
dataserver.databricks.allowed.urls=https://dbc-yyyyyyyy-xxxx.cloud.databricks.com/
Click Save.
If you are updating an S3 application:
Go to Settings > Applications > S3, and click the pen icon to edit properties.
Click the toggle button of a service you wish to enable.
In the ADVANCED tab, add the following property:
dataserver.databricks.allowed.urls=<DATABRICKS_URL_LIST>
where <DATABRICKS_URL_LIST> is a comma-separated list of the target Databricks cluster URLs. For example:
dataserver.databricks.allowed.urls=https://dbc-yyyyyyyy-xxxx.cloud.databricks.com/
Save your configuration.
Download the Databricks init script.
Log in to the PrivaceraCloud portal.
Generate the new API key and Init Script. For more information, see API Key.
On the Databricks Init Script section, click the DOWNLOAD SCRIPT button.
By default, this script is named privacera_databricks.sh. Save it to a local filesystem or shared storage.
Upload the Databricks init script to your Databricks clusters.
Log in to your Databricks cluster using administrator privileges.
On the left navigation, click the Data icon.
Click Add Data from the upper right corner.
From the Create New Table dialog box, select Upload File, then select and open privacera_databricks.sh.
Copy the full storage path onto your clipboard.
Add the Databricks init script to your target Databricks clusters:
In the Databricks navigation panel select Clusters.
Choose a cluster name from the list provided and click Edit to open the configuration dialog page.
Open Advanced Options and select the Init Scripts tab.
Enter the DBFS init script path name you copied earlier.
Click Add.
From Advanced Options, select the Spark tab. Add the following Spark configuration content to the Spark Config edit window. For more information on the properties, see Spark Configuration Table Properties.
New Properties
spark.databricks.isv.product privacera
spark.databricks.repl.allowedLanguages sql,python,r,scala
spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
spark.executor.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
spark.databricks.delta.formatCheck.enabled false
Add the following property in the Environment Variables text box:
PRIVACERA_PLUGIN_TYPE=OLAC
Old Properties
spark.databricks.isv.product privacera
spark.databricks.repl.allowedLanguages sql,python,r,scala
spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
spark.hadoop.fs.s3.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
spark.hadoop.fs.s3n.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
spark.hadoop.fs.s3a.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
spark.executor.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
spark.hadoop.signed.url.enable true
Save and close.
Restart the Databricks cluster.
Note
From PrivaceraCloud release 4.1.0.1 onward, it is recommended to replace the Old Properties with the New Properties. However, the Old Properties will also continue to work.
For Databricks versions <= 8.2, only the Old Properties should be used, since those versions are in extended support.
If you are upgrading the Databricks Runtime from an existing version (6.4-8.2) to version 8.3 or higher, contact your Privacera technical sales representative for assistance.
Your S3 Databricks cluster data resource is now available for Access Manager Policy Management, under Access Manager > Resource Policies, Service "privacera_s3".
Databricks cluster deployment matrix with Privacera plugin:
Job/Workflow use-case for automated cluster:
Run-Now will create the new cluster based on the definition mentioned in the job description.
Job Type | Languages | FGAC/DBX version | OLAC/DBX Version |
---|---|---|---|
Notebook | Python/R/SQL | Supported [7.3, 9.1 , 10.4] | |
JAR | Java/Scala | Not supported | Supported[7.3, 9.1 , 10.4] |
spark-submit | Java/Scala/Python | Not supported | Supported[7.3, 9.1 , 10.4] |
Python | Python | Supported [7.3, 9.1 , 10.4] | |
Python wheel | Python | Supported [9.1 , 10.4] | |
Delta Live Tables pipeline | | Not supported | Not supported |
Job on existing cluster:
Run-Now will use the existing cluster which is mentioned in the job description.
Job Type | Languages | FGAC/DBX version | OLAC |
---|---|---|---|
Notebook | Python/R/SQL | supported [7.3, 9.1 , 10.4] | Not supported |
JAR | Java/Scala | Not supported | Not supported |
spark-submit | Java/Scala/Python | Not supported | Not supported |
Python | Python | Not supported | Not supported |
Python wheel | Python | supported [9.1 , 10.4] | Not supported |
Delta Live Tables pipeline | | Not supported | Not supported |
Interactive use-case
An interactive use-case is running a SQL/Python notebook on an interactive cluster.
Cluster Type | Languages | FGAC | OLAC |
---|---|---|---|
Standard clusters | Scala/Python/R/SQL | Not supported | Supported [7.3,9.1,10.4] |
High Concurrency clusters | Python/R/SQL | Supported [7.3,9.1,10.4] | Supported [7.3,9.1,10.4] |
Single Node | Scala/Python/R/SQL | Not supported | Supported [7.3,9.1,10.4] |
Access AWS S3 using Boto3 from Databricks
This section describes how to use the AWS SDK (Boto3) for PrivaceraCloud to access AWS S3 file data through a Privacera DataServer proxy.
The following commands must be run in a notebook for Databricks:
Install the AWS Boto3 libraries
pip install boto3
Import the required libraries
import boto3
Access the AWS S3 files
def check_s3_file_exists(bucket, key, access_key, secret_key, endpoint_url, dataserver_cert, region_name):
    exec_status = False
    access_key = access_key
    secret_key = secret_key
    endpoint_url = endpoint_url
    try:
        s3 = boto3.resource(service_name='s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key,
                            endpoint_url=endpoint_url, region_name=region_name)
        print(s3.Object(bucket_name=bucket, key=key).get()['Body'].read().decode('utf-8'))
        exec_status = True
    except Exception as e:
        print("Got error: {}".format(e))
    finally:
        return exec_status

def read_s3_file(bucket, key, access_key, secret_key, endpoint_url, dataserver_cert, region_name):
    exec_status = False
    access_key = access_key
    secret_key = secret_key
    endpoint_url = endpoint_url
    try:
        s3 = boto3.client(service_name='s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key,
                          endpoint_url=endpoint_url, region_name=region_name)
        obj = s3.get_object(Bucket=bucket, Key=key)
        print(obj['Body'].read().decode('utf-8'))
        exec_status = True
    except Exception as e:
        print("Got error: {}".format(e))
    finally:
        return exec_status

readFilePath = "file data/data/format=txt/sample/sample_small.txt"
bucket = "infraqa-test"
# saas
access_key = "${privacera_access_key}"
secret_key = "${privacera_secret_key}"
endpoint_url = "https://ds.privaceracloud.com"
dataserver_cert = ""
region_name = "us-east-1"

print(f"got file===== {readFilePath} ============= bucket= {bucket}")
status = check_s3_file_exists(bucket, readFilePath, access_key, secret_key, endpoint_url, dataserver_cert, region_name)
Access Azure file using Azure SDK from Databricks
This section describes how to use the Azure SDK for PrivaceraCloud to access Azure DataStorage/Datalake file data through a Privacera DataServer proxy.
The following commands must be run in a notebook for Databricks:
Install the Azure SDK libraries
pip install azure-storage-file-datalake
Import the required libraries
import os, uuid, sys
from azure.storage.filedatalake import DataLakeServiceClient
from azure.core._match_conditions import MatchConditions
from azure.storage.filedatalake._models import ContentSettings
Initialize the account storage through connection string method
def initialize_storage_account_connect_str(my_connection_string):
    try:
        global service_client
        print(my_connection_string)
        service_client = DataLakeServiceClient.from_connection_string(conn_str=my_connection_string,
                                                                      headers={'x-ms-version': '2020-02-10'})
    except Exception as e:
        print(e)
Prepare the connection string
def prepare_connect_str():
    try:
        connect_str = "DefaultEndpointsProtocol=https;AccountName=${privacera_access_key}-{storage_account_name};AccountKey=${base64_encoded_value_of(privacera_access_key|privacera_secret_key)};BlobEndpoint=https://ds.privaceracloud.com;"
        # a sample value is shown below
        # connect_str = "DefaultEndpointsProtocol=https;AccountName=MMTTU5Njg4Njk0MDAwA6amFpLnBhdGVsOjE6MTY1MTU5Njg4Njk0MDAw==-pqadatastorage;AccountKey=TVRVNUTU5Njg4Njk0MDAwTURBd01UQTZhbUZwTG5CaGRHVnNPakU2TVRZMU1URTJOVGcyTnpVMTU5Njg4Njk0MDAwVZwLzNFbXBCVEZOQWpkRUNxNmpYcjTU5Njg4Njk0MDAwR3Q4N29UNFFmZWpMOTlBN1M4RkIrSjdzSE5IMFZic0phUUcyVHTU5Njg4Njk0MDAwUxnPT0=;BlobEndpoint=https://ds.privaceracloud.com;"
        return connect_str
    except Exception as e:
        print(e)
Define a sample access method to get Azure file and directories
def list_directory_contents(connect_str):
    try:
        initialize_storage_account_connect_str(connect_str)
        file_system_client = service_client.get_file_system_client(file_system="{storage_container_name}")
        # sample value as shown below
        # file_system_client = service_client.get_file_system_client(file_system="infraqa-test")
        paths = file_system_client.get_paths(path="{directory_path}")
        # sample value as shown below
        # paths = file_system_client.get_paths(path="file data/data/format=csv/sample/")
        for path in paths:
            print(path.name + '\n')
    except Exception as e:
        print(e)
To verify that the proxy is functioning, call the access methods
connect_str = prepare_connect_str()
list_directory_contents(connect_str)
Databricks SQL
Databricks SQL Overview and Configuration
One purpose of PolicySync for Databricks SQL is to limit users' access to your entire Databricks data source or portions of it, such as Delta external tables, views, entire tables, or only certain columns or rows.
Planning and general process
The general process for connecting with JDBC to a Databricks SQL data source, creating policies, and limiting user access is as follows. Plan to have the necessary information before you begin the specific steps described here.
Add the privacera_tag service.
Create an endpoint in Databricks SQL for PrivaceraCloud to connect to, with JDBC username, password, and URL.
Add Databricks SQL as a service in PrivaceraCloud.
Define a data source for the Databricks SQL endpoint in PrivaceraCloud using the values from the first step and other required fields.
Define the Databricks SQL service.
Determine the users, groups, or roles who need access from PrivaceraCloud to your Databricks SQL.
Ensure that all users in PrivaceraCloud who will access Databricks SQL have an email address in their PrivaceraCloud account.
Define those users with appropriate permissions in Databricks.
Create a resource policy to assign users, groups, or roles the necessary permissions to access the Databricks SQL data source at the appropriate depth.
Decide the depth of the data access you will give to users: views, source tables, columns, or rows. See Allowable Privileges.
Prerequisites
Make sure the Privacera Tag Service and Databricks SQL Endpoint configuration are updated before you configure Databricks SQL PolicySync.
In PrivaceraCloud, the administrator must add the privacera_tag service to enable PolicySync with Databricks SQL.
See the steps in Adding the privacera_tag Service.
In Databricks SQL, an administrator must create a Databricks SQL endpoint for connecting from PrivaceraCloud. This process is described in Create an Endpoint in Databricks SQL.
Make note of the following values for entering into the fields in PrivaceraCloud as detailed in Connect Application and Databricks SQL PolicySync Fields:
The email address of the user defined in the endpoint. This is the value of the JDBC username (Service jdbc username) in PrivaceraCloud.
The Databricks generated access token. This is the value of the JDBC password (Service jdbc password) for the defined JDBC username in PrivaceraCloud.
The JDBC URL (Service jdbc url) defined for the endpoint.
Databricks SQL with Privacera Hive
To use Databricks SQL with Privacera Hive, see Databricks SQL Hive Service Def.
Connect application
With the values for the JDBC username, JDBC password, and JDBC URL that you noted in Create endpoint in Databricks SQL, define the data source connection in PrivaceraCloud to the Databricks SQL endpoint.
Follow these steps to connect the Databricks SQL application to PrivaceraCloud:
Go to Settings > Applications.
In the Applications screen, select Databricks SQL.
Select the platform type (AWS or Azure) on which you want to configure the Databricks application.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable either Access Management or Data Discovery for Databricks SQL.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery.
In the BASIC tab, enter values in the fields. For more information on the fields and their values, see Databricks SQL PolicySync Fields.
Click Save.
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Grant Databricks SQL permissions to PrivaceraCloud users
For each PrivaceraCloud user that needs access to Databricks SQL, the administrator needs to define that user with appropriate access permissions in Databricks.
All PrivaceraCloud users who will access Databricks SQL must have an email address in their user account on PrivaceraCloud. This email address is required to log in to Databricks SQL.
In your Databricks account:
Navigate to Data science and engineering.
Click Workspace on the top right.
To open the Admin Console, go to the top right of the Workspace, click the user account icon, and select Admin Console.
In the Databricks SQL access column, select the checkbox for the user.
In the Databricks SQL Dashboard:
Navigate to SQL > Endpoints
Click the name of the Endpoint for which you want to add user permission.
In the top right, click Permissions.
In the SQL Endpoint Permissions dialog, select the intended user from the drop-down list.
Give the user Can Use permission.
Click Add.
Click Save.
Define a resource policy
In PrivaceraCloud, define a resource policy to grant access to the Databricks SQL data source to users, groups, or roles.
Follow the steps in Resource Policies and the details about allowed privileges described here.
The following privileges can be specified for a Databricks SQL resource policy:
SELECT: Allows read access to an object.
CREATE: Provides ability to create an object (for example, a table in a database).
MODIFY: Provides ability to add, delete, and modify data to or from an object.
USAGE: An additional requirement to perform any action on a database object.
READ_METADATA: Provides ability to view an object and its metadata.
CREATE_NAMED_FUNCTION: Provides ability to create a named UDF in an existing catalog or database.
ALL PRIVILEGES: Gives all privileges, equivalent to all the above privileges.
Data_Admin Privilege for Secure Views: With the Data_Admin privilege, access policies are applied to source tables. If you want to restrict the access policies only to the views and not to the source tables, enable the following property in the PolicySync configuration, as detailed in Connect Application and Databricks SQL PolicySync Fields:
Secure view Access by Table policies: true
Test the policy
To assign privileges to users, groups, or roles, follow the steps in Resource Policies.
This can be tested with a non-administrator user.
Databricks SQL PolicySync fields
For a description of all fields that must or can be set for resource policy, see Databricks SQL PolicySync Fields.
Configuring column-level access control
To enable column-level access control, set the following fields when you define the PolicySync fields:
Column Level Access Control: true
In custom fields, add the following, where # REDACTED # is any string of your choice:
ranger.policysync.connector.4.access.control.number.value=0
ranger.policysync.connector.4.access.control.double.value=0
ranger.policysync.connector.4.access.control.text.value='# REDACTED #'
View-based masking functions and row-level filtering
For supported masking functions and supported row-level filtering, see Databricks SQL Masking Functions.
Create an endpoint in Databricks SQL
Log in to your Databricks account as a user with administrative privileges.
After logging in to Databricks, go to SQL Analytics.
Go to Endpoints and click New SQL Endpoint.
Create the endpoint as per your requirements.
Click the endpoint connection details and note the JDBC URL for configuration with PolicySync.
Click the personal access token option to create a token.
Click Generate New Token.
Enter the name of the token, specify its validity, and click Generate.
Copy the generated token. This is the JDBC password of the user when connecting from PolicySync, and the email ID of the user is the JDBC username.
Databricks SQL Fields
Basic fields
Field name | Type | Default | Required | Description |
---|---|---|---|---|
Databricks SQL jdbc url | | | Yes | Specifies the JDBC URL for the Databricks SQL connector. Use the following format for the JDBC URL: jdbc:spark://<WORKSPACE_URL>:443/<DATABASE>;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/endpoints/1234567890 The workspace URL and the database name are derived from your Databricks SQL configuration. |
Databricks SQL jdbc username | | | Yes | Specifies the JDBC username to use. |
Databricks SQL jdbc password | | | Yes | Specifies the access token of the SQL endpoint to use. |
Databricks SQL default database | | | Yes | Specifies the name of the JDBC database to use. |
Databricks SQL resource owner | | | No | Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user. The following resource types are supported: |
Databricks SQL workspace URL | | | Yes | Specifies the base URL for the Databricks SQL instance. |
Databases to set access control policies | | | No | Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive. An example list of databases might resemble the following: If specified, Databases to ignore while setting access control policies takes precedence over this setting. |
Enable policy enforcements and user/group/role management | | | Yes | Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is |
Enable access audits | | | Yes | Specifies whether Privacera fetches access audit data from the data source. |
Advanced fields
Field name | Type | Default | Required | Description |
---|---|---|---|---|
Tables to set access control policies |
| No | Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive. Use the following format when specifying a table: <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME> If specified, Tables to ignore while setting access control policies takes precedence over this setting. If you specify a wildcard, such as in the following example, all matched tables are managed:
The specified value, if any, is interpreted in the following ways:
| |
Databases to ignore while setting access control policies |
| No | Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control. For example: testdb1,testdb2,sales_db* This setting supersedes any values specified by Databases to set access control policies. | |
Tables to ignore while setting access control policies |
| No | Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format: <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME> This setting supersedes any values specified by Tables to set access control policies. | |
Regex to find special characters in names |
|
| No | Specifies a regular expression to apply to a user name and replaces each matching character with the value specified by the If not specified, no find and replace operation is performed. |
String to replace with the special characters found in names |
|
| No | Specifies a string to replace the characters matched by the regex specified by the If not specified, no find and replace operation is performed. |
Regex to find special characters in user names |
|
| No | Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the String to replace with the special characters found in user names setting. If not specified, no find and replace operation is performed. |
String to replace with the special characters found in user names |
|
| No | Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting. If not specified, no find and replace operation is performed. |
Regex to find special characters in group names |
|
| No | Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the String to replace with the special characters found in group names setting. If not specified, no find and replace operation is performed. |
String to replace with the special characters found in group names |
|
| No | Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting. If not specified, no find and replace operation is performed. |
Regex to find special characters in role names |
|
| No | Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the String to replace with the special characters found in role names setting. If not specified, no find and replace operation is performed. |
String to replace with the special characters found in role names |
|
| No | Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting. If not specified, no find and replace operation is performed. |
Persist case sensitivity of user names |
|
| No | Specifies whether PolicySync converts user names to lowercase when creating local users. If set to |
Persist case sensitivity of group names |
|
| No | Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to |
Persist case sensitivity of role names |
|
| No | Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to |
Create users in Databricks SQL Endpoint by policysync |
|
| No | Specifies whether PolicySync creates local users for each user in Privacera. |
Manage users from portal |
|
| No | Specifies whether PolicySync maintains user membership in roles in the Databricks SQL data source. |
Manage groups from portal |
|
| No | Specifies whether PolicySync creates groups from Privacera in the Databricks SQL data source. |
Manage roles from portal |
|
| No | Specifies whether PolicySync creates roles from Privacera in the Databricks SQL data source. |
Users to set access control policies |
| No | Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive. If not specified, PolicySync manages access control for all users. If specified, Users to be ignored by access control policies takes precedence over this setting. An example user list might resemble the following: | |
Groups to set access control policies |
| No | Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive. An example list of projects might resemble the following: If specified, Groups be ignored by access control policies takes precedence over this setting. | |
Roles to set access control policies |
| No | Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards. Names are case-sensitive. An example list of projects might resemble the following: If specified, Roles be ignored by access control policies takes precedence over this setting. | |
Users to be ignored by access control policies |
| No | Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control. This setting supersedes any values specified by Users to set access control policies. | |
Groups be ignored by access control policies |
| No | Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control. This setting supersedes any values specified by Groups to set access control policies. | |
Roles be ignored by access control policies |
| No | Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control. This setting supersedes any values specified by Roles to set access control policies. | |
Prefix of Databricks SQL Endpoint roles for portal groups |
|
| No | Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named |
Prefix of Databricks SQL Endpoint roles for portal roles |
|
| No | Specifies the prefix that PolicySync uses when creating roles from Privacera in the Databricks SQL data source. For example, if you have a role in Privacera named |
Use Databricks SQL Endpoint native public group for public group access policies |
|
| No | Specifies whether PolicySync uses the Databricks SQL native public group for access grants whenever a policy refers to a public group. The default value is true. |
Set access control policies only on the users from managed groups |
|
| No | Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false. |
Set access control policies only on the users/groups from managed roles |
|
| No | Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false. |
Use email as service name |
|
| No | This Property is used to map the username to the email address when granting/revoking access. |
Enforce masking policies using secure views |
|
| No | Specifies whether to use secure view based masking. The default value is |
Enforce row filter policies using secure views |
|
| No | Specifies whether to use secure view based row filtering. The default value is While Databricks SQL supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended. |
Create secure view for all tables/views |
|
| No | Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled. |
Default masked value for numeric datatype columns |
|
| No | Specifies the default masking value for numeric column types. |
Default masked value for text/varchar/string datatype columns |
|
| No | Specifies the default masking value for text and string column types. |
Secure view name prefix |
| No | Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name. If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is | |
Secure view name postfix |
| No | Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name. If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is | |
Secure view database name prefix |
| No | Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name. For example, if the prefix is | |
Secure view database name postfix |
|
| No | Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name. For example, if the postfix is |
Enable dataadmin |
|
| No | This property is used to enable the data admin feature. With this feature enabled you can create all the policies on native tables/views, and respective grants will be made on the secure views of those native tables/views. These secure views will have row filter and masking capability. In case you need to grant permission on the native tables/views then you can select the permission you want plus data admin in the policy. Then those permissions will be granted on both the native table/view as well as its secure view. |
Users to exclude when fetching access audits |
|
| No | Specifies a comma separated list of users to exclude when fetching access audits. For example: |
Custom fields
Canonical name | Type | Default | Description |
---|---|---|---|
|
|
| Specifies how PolicySync loads resources from Databricks SQL. The following values are allowed:
|
|
|
| Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources. |
|
|
| Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly. |
|
|
| Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly. |
|
|
| Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera. |
|
|
| Specifies how user name conversions are performed. The following options are valid:
This setting applies only if Persist case sensitivity of user names is set to |
|
|
| Specifies how group name conversions are performed. The following options are valid:
This setting applies only if Persist case sensitivity of group names is set to |
|
|
| Specifies how role name conversions are performed. The following options are valid:
This setting applies only if Persist case sensitivity of role names is set to |
|
| Specifies a suffix to remove from a table or view name. For example, if the table is named You can specify a single suffix or a comma separated list of suffixes. | |
|
| Specifies a suffix to remove from a database name. For example, if the database is named You can specify a single suffix or a comma separated list of suffixes. | |
|
|
| Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Databricks SQL. |
Databricks SQL Hive Service Definition
To use Databricks SQL with Privacera Hive, Hive-specific configuration is required, as described in the following steps:
Connect the Databricks application, which internally creates the privacera_hive service. Enable access for the application and save it.
Additionally, configure the following properties for Hive when you Connect application.
In the System config field, add the following value:
privacera-databricks_sql_analytics-hive-system-config.json
In the ADVANCED tab, add the following properties. This example uses the number 4 as the connector key.
ranger.policysync.connector.4.ranger.service.appid=privacera_hive
ranger.policysync.connector.4.ranger.service.name=privacera_hive
Note
Prior to PrivaceraCloud version 4.2, if you experienced that PolicySync with the databricks_sql_analytics or hive service did not handle Ranger user/group/role updates, add the following property, where the number 4 is the connector key. This pushes the new users to the Databricks workspace forcefully.
ranger.policysync.connector.4.force.update.principal=true
Hive-to-Databricks SQL Permission Mapping
Hive Permission | Databricks SQL Permission |
---|---|
Select | Usage, ReadMetadata, Select |
Update | Usage, modify |
Create in the database | Usage, Create in the database |
Create on the UDF | Usage, CreateNamedFunction |
Drop | No equivalent |
Alter | No equivalent |
Databricks SQL Masking Functions
Masking Function | Scope in Databricks SQL |
---|---|
Default | Value: Default values given as masked properties Data type: All |
Null | Value: Null Data type: All |
Unmasked | Value: Actual value Data type: All |
Hash DBX | Value: Hashed value Data type: text/varchar |
MASK_MD5 | Value: Hashed value Data type: text/string |
Regex | Value: Replace value Data type: text/string |
Literal Mask | Value: Replace value Data type: text/string |
Partial last 4 characters | Value: Replace value Data type: text/string |
Partial first 4 characters | Value: Replace value Data type: text/string |
Custom | Value: The UDF given as the input. Data type: All. For example, repeat('xy', 5) |
Databricks SQL Encryption
The following steps enable use of Privacera encryption services in a Databricks SQL notebook:
Create a secret shared by Privacera Encryption Gateway (PEG) and Databricks.
Create Resource Policies in Privacera for data access to Databricks SQL resources.
Create Privacera encryption and decryption User-Defined Functions (UDFs) in Databricks.
For more information about Privacera encryption schemes, see the Privacera Encryption Guide.
Prerequisites
A working Databricks SQL installation connected to PrivaceraCloud. See Databricks to learn more.
Databricks CLI installed to your client system and configured to attach to your Databricks host. See Databricks Documentation: Databricks CLI and Databricks Documentation: Authenticating using Databricks personal access tokens.
Privacera Encryption Gateway (PEG) enabled and configured in your account settings. See About Account.
Grant permission in encryption scheme policy
To use Databricks SQL encryption, you must create a scheme policy for a user that will use the Databricks UDF. This scheme policy must grant the getSchemes permission. See Create Scheme Policies on PrivaceraCloud to learn more.
Configure Databricks
With the Databricks CLI:
Create a secret scope called privaceracloud:
databricks secrets create-scope --scope privaceracloud
Add secrets to this scope:
peg_username, peg_password, and peg_secret are literals and should be entered exactly as shown.
The <username>, <password>, and <sharedsecret> values below are the same as what you entered in PrivaceraCloud when adding the PEG service. See API Key to learn more.
databricks secrets put --scope privaceracloud --key peg_username --string-value <username>
databricks secrets put --scope privaceracloud --key peg_password --string-value <password>
databricks secrets put --scope privaceracloud --key peg_secret --string-value <sharedsecret>
Add the following environment variables in your Databricks cluster:
PEG_SECRET={{secrets/privaceracloud/peg_secret}}
PEG_PASSWORD={{secrets/privaceracloud/peg_password}}
PEG_USERNAME={{secrets/privaceracloud/peg_username}}
Caution
Note that there can be existing environment variables. Do not remove these.

First, log in to Databricks, create a notebook, and set the language to SQL.
Run the following SQL commands in Databricks to create UDFs for the Privacera encryption services, named protect and unprotect.
Note
The com.privacera.crypto functions enable use of encryption schemes, but do not accept presentation schemes.
Create the Privacera protect UDF:
create database if not exists privacera;
use privacera;
drop function if exists privacera.protect;
CREATE FUNCTION privacera.protect AS 'com.privacera.crypto.PrivaceraEncryptUDF';
Create the Privacera unprotect UDF:
use privacera;
drop function if exists privacera.unprotect;
CREATE FUNCTION privacera.unprotect AS 'com.privacera.crypto.PrivaceraDecryptUDF';
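As a quick sanity check before granting wider access, you can confirm that the UDFs are registered and invoke them on a literal value from the same notebook. This is a minimal sketch: it assumes the SYSTEM_EMAIL encryption scheme used in the examples below and a user whose scheme policy grants the required permissions.
use privacera;
-- List user-defined functions; privacera.protect and privacera.unprotect should appear
show user functions;
-- Round-trip a literal value through protect and unprotect with the SYSTEM_EMAIL scheme
select privacera.unprotect(privacera.protect('jane.doe@example.com', 'SYSTEM_EMAIL'), 'SYSTEM_EMAIL');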
Configure Privacera resource policies
Databricks SQL resources are managed under Access Manager > Resource Policies > privacera_hive.
To add resource policies to allow access to selected resources:
Create a policy to give data access users, groups, or roles the select privilege to target database resources. On the Add Policy page, under Allow Conditions, use Select Role, Select Group, and/or Select User, then under Permissions choose select.
Create a policy to grant data access users, groups, or roles the select privilege to the protect and unprotect UDFs. On the Add Policy page, under Allow Conditions, use Select Role, Select Group, and/or Select User, then under Permissions choose select.
How to use UDFs in SQL to encrypt and decrypt
The following are SQL command examples for the privacera.protect (encrypt) and privacera.unprotect (decrypt) UDFs:
select privacera.protect(<COLNAME>,'<ENCRYPTION_SCHEME_NAME>') from <DB_NAME>.<TABLE_NAME>;
<COLNAME> is the identifier of the column to encrypt.
<ENCRYPTION_SCHEME_NAME> is the name of the chosen Privacera encryption scheme.
<DB_NAME>.<TABLE_NAME> are the names of the database and table in that database.
Example
In this example, the email column of the bigdatabase.customer_data table is encrypted with the SYSTEM_EMAIL encryption scheme.
select privacera.protect(email, 'SYSTEM_EMAIL') from bigdatabase.customer_data;
select privacera.unprotect(<COLNAME>,'<ENCRYPTION_SCHEME_NAME>') from <DB_NAME>.<TABLE_NAME>;
<COLNAME> is the identifier of the column to decrypt.
<ENCRYPTION_SCHEME_NAME> is the name of the chosen Privacera encryption scheme, which must be the same encryption scheme used to originally encrypt.
<DB_NAME>.<TABLE_NAME> are the names of the database and table in that database.
Example
In this example, the email column of the bigdatabase.customer_data table is decrypted with the SYSTEM_EMAIL encryption scheme.
select privacera.unprotect(email, 'SYSTEM_EMAIL') from bigdatabase.customer_data;
The unprotect UDF supports an optional specification of a presentation scheme that further obfuscates the decrypted data.
For an example of data transformation with the optional presentation scheme, see Example of Data Transformation with /unprotect and Presentation Scheme.
Example query:
select id, privacera.unprotect(<COLUMN_NAME>, '<ENCRYPTION_SCHEME_NAME>', '<PRESENTATION_SCHEME_NAME>') <OPTIONAL_NAME_FOR_COLUMN_TO_WRITE_OBFUSCATED_OUTPUT> from <DB_NAME>.<TABLE_NAME>;
<PRESENTATION_SCHEME_NAME> is the name of the chosen Privacera presentation scheme with which to further obfuscate the decrypted data.
<OPTIONAL_NAME_FOR_COLUMN_TO_WRITE_OBFUSCATED_OUTPUT> is a "pretty" name for the column that the obfuscated data is written to.
Other arguments are the same as in the preceding unprotect example.
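For instance, continuing the SYSTEM_EMAIL example above, a query that also applies a presentation scheme might look like the following. The presentation scheme name EMAIL_PRESENTATION is illustrative only; use a presentation scheme that is defined in your account.
select id, privacera.unprotect(email, 'SYSTEM_EMAIL', 'EMAIL_PRESENTATION') presented_email from bigdatabase.customer_data;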
Dremio
This topic describes how to connect a Dremio application to PrivaceraCloud.
Prerequisite
A Dremio host with Dremio Enterprise Edition installed is required.
Note
Community Edition is not supported.
Connect Application
Go to Settings > Applications.
In the Applications screen, select dremio.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable Access Management for your application.
Click Download Script to download the privacera_dremio_plugin.sh script.
Click Save.
Note
If required, you can download privacera_dremio_plugin.sh again using the edit application option.
Configure Privacera plugin
Configure the Privacera plugin according to how Dremio is installed on your instance.
Note
For a new or existing data source configured in Dremio Data Lake, ensure that the Enable external authorization plugin checkbox under Settings > Advanced Options of the data source is selected in the Dremio UI. Then restart the Dremio service.
RPM
SSH to the instance where the Dremio RPM is installed.
Copy the downloaded privacera_dremio_plugin.sh file to the Home folder in your Dremio instance.
Run the following commands:
mkdir -p ~/privacera/install
mv privacera_dremio_plugin.sh ~/privacera/install
Launch the privacera_dremio_plugin.sh script.
cd ~/privacera/install
chmod +x privacera_dremio_plugin.sh
sudo ./privacera_dremio_plugin.sh
Update the Dremio environment to add the Privacera jars and configuration to the Dremio classpath.
vi ${DREMIO_HOME}/conf/dremio-env
Update the following variable if it exists or add it.
DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
Restart Dremio.
sudo service dremio restart
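Optionally, you can confirm that the plug-in files and the classpath entry are in place; the paths below assume the default locations used by the installation script:
ls /opt/privacera/conf /opt/privacera/dremio-ext-jars
grep DREMIO_EXTRA_CLASSPATH ${DREMIO_HOME}/conf/dremio-env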
Kubernetes
Depending on your cloud provider, set up Dremio in a Kubernetes environment.
See the following links for deployment:
After setting up Dremio, perform the following steps to deploy the Privacera plugin.
SSH to the instance that contains the Dremio Kubernetes artifacts and change to the dremio-cloud-tools/charts/dremio_v2/ directory.
Copy the downloaded privacera_dremio_plugin.sh file to the dremio_v2 folder on your Dremio Kubernetes instance.
Place the script in the privacera_config folder of the chart so that the ConfigMap created in the next step can read it.
Update dremio-configmap.yaml to add a new ConfigMap for the Privacera configuration.
vi templates/dremio-configmap.yaml
Add the following configuration at the start of the file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: dremio-privacera-install
data:
  privacera_dremio_plugin.sh: |-
{{ .Files.Get "privacera_config/privacera_dremio_plugin.sh" | nindent 4 }}
---
Update dremio-env to add Privacera jars and configuration in the Dremio classpath.
vi config/dremio-env
Update the following variable if it exists or add it.
DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
Update values.yaml.
vi values.yaml
Add the following configuration for extraInitContainers inside the coordinator section.
extraInitContainers: |
  - name: install-privacera-dremio-plugin
    image: {{.Values.image}}:{{.Values.imageTag}}
    imagePullPolicy: IfNotPresent
    securityContext:
      runAsUser: 0
    volumeMounts:
      - name: dremio-privacera-plugin-volume
        mountPath: /opt/dremio/plugins/authorizer
      - name: dremio-ext-jars-volume
        mountPath: /opt/privacera/dremio-ext-jars
      - name: dremio-privacera-config
        mountPath: /opt/privacera/conf/
      - name: dremio-privacera-install
        mountPath: /opt/privacera/mount
    command:
      - "bash"
      - "-c"
      - "cd /opt/privacera/mount/ && cp * /tmp/ && cd /tmp && ./privacera_dremio_plugin.sh"
Update or uncomment the extraVolumes section inside the coordinator section and add the following configuration:
extraVolumes:
  - name: dremio-privacera-install
    configMap:
      name: dremio-privacera-install
      defaultMode: 0777
  - name: dremio-privacera-plugin-volume
    emptyDir: {}
  - name: dremio-ext-jars-volume
    emptyDir: {}
  - name: dremio-privacera-config
    emptyDir: {}
Update or uncomment the extraVolumeMounts section inside the coordinator section and add the following configuration:
extraVolumeMounts:
  - name: dremio-ext-jars-volume
    mountPath: /opt/privacera/dremio-ext-jars
  - name: dremio-privacera-plugin-volume
    mountPath: /opt/dremio/plugins/authorizer
  - name: dremio-privacera-config
    mountPath: /opt/privacera/conf
Upgrade your Helm release. Get the release name by running the helm list command; the value in the Name column is the Helm release name.
helm upgrade -f values.yaml <release-name> .
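To verify the rollout, you can check the pods and the logs of the init container defined above (generic Kubernetes commands; replace <dremio-coordinator-pod> with the actual pod name from your namespace):
kubectl get pods
kubectl logs <dremio-coordinator-pod> -c install-privacera-dremio-plugin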
DynamoDB
This topic describes how to connect the DynamoDB application to PrivaceraCloud.
Connecting to an AWS hosted data source requires authentication or a trust relationship with those resources. You will provide this information as one step in the AWS Data resource connection. You will also need to specify your AWS Account Region.
Prerequisites in AWS console
The following prerequisites must be met:
Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy in the AWS Console.
Configure a trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM role.
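The trust policy on the IAM role typically follows the standard sts:AssumeRole pattern sketched below. This is illustrative only; use the exact Privacera principal and external ID values from the AWS Access Using IAM Trust Relationship instructions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::<PRIVACERA_ACCOUNT_ID>:root" },
      "Action": "sts:AssumeRole",
      "Condition": { "StringEquals": { "sts:ExternalId": "<EXTERNAL_ID>" } }
    }
  ]
}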
Connect application
Go to Settings > Applications.
On the Applications screen, select DynamoDB.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management with the toggle buttons.
Enable Privacera Access Management
Click the toggle button to enable Privacera Access Management for your application.
On the BASIC tab, enter values in the following fields.
With Use IAM Role disabled:
AWS Access Key: AWS data repository host account Access Key.
AWS Secret Key: AWS data repository host account Secret Key.
AWS Region: AWS S3 bucket region.
With Use IAM Role enabled:
AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.
AWS IAM Role External Id: For additional security, an external ID can be attached to the IAM role you configured. This ensures that your IAM role can be assumed by PrivaceraCloud only when the configured external ID is passed.
Note
The external ID is stored encrypted. It is never displayed in the UI or otherwise made visible.
AWS Region: AWS S3 bucket region.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Recommended: Install the AWS CLI.
Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.
Recommended: Validate connectivity by running an AWS CLI command for DynamoDB, such as:
aws dynamodb list-tables
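If the tables are listed successfully, you can also describe one of them (replace <TABLE_NAME> with a table from your account):
aws dynamodb describe-table --table-name <TABLE_NAME>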
Elastic MapReduce from Amazon
EMR: Hive, PrestoDB, PrestoSQL
This topic describes how to connect an EMR application to PrivaceraCloud.
Note
PrivaceraCloud supports EMR versions 6.x and higher with Kerberos enabled.
Connect application
Go to Settings > Applications.
In the Applications screen, select EMR.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable Access Management for your application.
Obtain installation script
In the Edit Application screen, click the Copy URL button to obtain the installation script URL.
Save this value; it will be needed as the <emr-script-download-url> later on.
EMR clusters can be connected to PrivaceraCloud in two ways:
Attach PrivaceraCloud authorization in new EMR clusters.
Attach PrivaceraCloud authorization in an existing EMR cluster.
Both methods start with obtaining an account-specific script from your PrivaceraCloud account, followed by adding a startup step to your EMR cluster.
Notice
PrestoDB by default blocks a few operations on the Hive catalog. These can be enabled by updating hive.properties.
Click Save.
You can now use PrivaceraCloud to define fine-grained policies and control access to Hive and Presto resources within the EMR cluster.
Configure EMR cluster
From your AWS EMR web console:
Open your AWS EMR cluster, then:
For new EMR clusters , go to Create EMR > Advanced Options and click Go to advanced options.
For existing EMR clusters, locate and open the existing cluster for configuration update. Open the Steps tab and click Add Step.
In the Add Step dialog, complete the fields as follows:
Step type:
Custom JAR
Name:
Install PrivaceraCloud Plugin
JAR location:
command-runner.jar
Arguments:
bash -c "wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh"
Action on failure:
Terminate cluster
The EMR Hive plug-in supports view-level access management via the Data_admin feature, as well as view-based row-level filtering and column masking.
By default, the PrestoSQL plug-in on EMR uses policies from the privacera-hive repository for Access Management.
In PrivaceraCloud, open Access Manager: Audit, and click the Plugin tab. Look for audit items reporting the status "Policies synced to plugin." This indicates that your EMR Hive, Presto, or Spark data resource is connected.
EMR Spark (Fine-Grained Access Control)
These instructions enable Fine-Grained Access Control (FGAC) for an existing connected AWS S3 data resource. FGAC enables policies at the database, table, and column level to be defined in the "privacera_hive" service in Access Manager: Resource Policies. Either Object Level Access Control (OLAC) or Fine-Grained Access Control (FGAC) can be added to an existing AWS S3 configuration, but not both.
Once installed and enabled, each data user query is first parsed by Spark and then authorized by the PrivaceraCloud Spark plug-in. The requesting user must have access to all resources referenced by the query for it to be allowed.
In PrivaceraCloud, obtain your account-unique <emr-script-download-url> to allow the EMR cluster to obtain additional scripts and setup.
Open Settings > API Key.
Use an existing active API Key or generate a new one.
Caution
Make sure the Expiry column is set to "Never Expires".
Click the i icon to get the scripts.
Under AWS EMR Setup Script, click Copy Url. Save this value; it will be used as the <emr-script-download-url> in the following instructions.
From the AWS EMR web console:
For new EMR clusters, go to Create EMR > Advanced Options and click Go to advanced options.
For existing EMR clusters, locate and open the existing cluster for configuration update. Open the Steps tab and click Add Step.
Note
To add multiple JWT configurations, see How to configure multiple JSON Web Tokens (JWTs) for EMR
Install the Privacera Spark FGAC Plugin:
In a new cluster: select Configure Step > Custom JAR at the bottom of the configuration page.
For an existing cluster: in Steps, select Custom Jar and click Add Step.
Add the given values in the following fields and click Add.
Name:
Install PrivaceraCloud Spark Plugin
JAR location:
command-runner.jar
Arguments: add the following command:
bash -c "wget <emr-script-download-url> chmod +x ./privacera_emr.sh sudo ./privacera_emr.sh spark-fgac"
Action on failure:
Terminate cluster
(Optional) To specify custom policy names for the Hive, Spark, or Trino services, export the following variables in the arguments:
bash -c "export EMR_HIVE_SERVICE_NAME=<hive_repo_name>; export EMR_TRINO_HIVE_SERVICE_NAME=<trino_hive_repo_name>; export EMR_SPARK_HIVE_SERVICE_NAME=<spark_hive_repo_name>; wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo -E ./privacera_emr.sh spark-fgac"
where:
hive_repo_name is a custom Hive service name for the Hive application in EMR.
spark_hive_repo_name is a custom Hive service name for Spark applications in EMR.
trino_hive_repo_name is a custom Hive service name for the Trino application in EMR.
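For example, with hypothetical service names (substitute service names that exist in your account), the argument might look like the following:
bash -c "export EMR_HIVE_SERVICE_NAME=custom_hive; export EMR_TRINO_HIVE_SERVICE_NAME=custom_trino_hive; export EMR_SPARK_HIVE_SERVICE_NAME=custom_spark_hive; wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo -E ./privacera_emr.sh spark-fgac"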
Notice
The Privacera plug-in also supports view-level access control using the Data_admin feature, as well as view-based row-level filtering and column masking.
EMR Spark (Object Level Access Control)
These instructions enable Object Level Access Control (OLAC) for existing connected AWS S3 resources. If AWS S3 is not already configured, do so by following the instructions here, then follow these additional configuration steps.
Either Object Level Access Control (OLAC) or Fine-Grained Access Control (FGAC) can be added to an existing AWS S3 configuration, but not both.
Two subcomponents are installed:
Privacera Credential Token Service (P-CTS) is installed to the targeted AWS EMR master node. P-CTS is a secure service running on an EMR master node which provides encrypted access tokens to the requesting user. Tokens are encrypted using a shared secret key with the Privacera Cloud Signing Server.
Privacera Signing Agent (P-SA) installed to targeted AWS EMR worker nodes. P-SA redirects Spark S3 requests to the Privacera Cloud Signing Server with a P-CTS access token in the request. P-SA then provides the appropriate signed response to Spark for accessing the S3 data if:
(a) The incoming request has a valid P-CTS token;
and (b) The requesting user has permissions on the S3 resource as defined in the "privacera_s3" service in Access Manager: Resource Policies.
These steps will:
Create an AWS Kerberos-based Security Configuration.
Establish a shared secret between PrivaceraCloud and the AWS EMR Kerberos based Security Configuration.
Create a new AWS cluster configured to use that Security Configuration. That cluster will link back to the Privacera Signing Agent (P-SA) and Privacera Credential Token Service (P-CTS).
Obtain or determine a character string to serve as a "shared key" between PrivaceraCloud and the AWS EMR cluster. We'll refer to this as <SHARED_KEY> in the configuration steps below.
Obtain your account-unique <emr-script-download-url> to allow the EMR cluster to obtain additional scripts and setup from PrivaceraCloud:
Open Settings: Api Key.
Use an existing Active Api Key or create a new one. Set Expiry = Never Expires.
Open the Api Key Info box (click the (i) in the key row).
Copy and store as <emr-script-download-url> using the Copy Url link found under AWS EMR Setup Script.
In the PrivaceraCloud console, go to Settings: Applications, select the existing AWS Data Server application (S3 or Athena), and click the edit (pen) icon.
In the ADVANCED tab, add the following property:
dataserver.shared.secret=<SHARED_KEY>
Click Save.
Create an EMR Security Configuration for Kerberos Authentication:
Open your AWS EMR web console.
Click Security Configurations, then Create.
Provide a name for this Security Configuration, such as PRIVACERA_KDC. We'll refer to this same Security Configuration later.
Under Authentication, select Enable Kerberos authentication and complete the fields as appropriate for your environment.
Create a new EMR cluster and assign to it the new Security Configuration.
In the AWS EMR Console, create a new cluster.
In Advanced Options, click Go to advanced options.
In the Software Configuration, select the appropriate EMR release and any associated applications.
In Edit Software Settings, select Enter configuration, and add the following properties:
[ { "classification":"spark-defaults", "properties":{ "spark.driver.extraJavaOptions":"-javaagent:/usr/lib/spark/jars/privacera-signing-agent.jar", "spark.executor.extraJavaOptions":"-javaagent:/usr/lib/spark/jars/privacera-signing-agent.jar", } } ]
In Steps, select Custom Jar and click Add Step.
Add code to download and install the Privacera Credential Token Service. Complete the fields as below, substituting your <emr-script-download-url> value in the wget command. Click Add when all fields are complete.
Name: Install Privacera CTS
JAR location: command-runner.jar
Arguments: bash -c "wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh priv-cts"
Action on failure: Continue
Click Next.
Configure hardware by selecting values Networking, Node, and Instance values as appropriate for your environment.
Configure general cluster settings by adding two scripts that will Install Privacera Signing Agent on master and worker nodes.
Assign Cluster name, Logging, Debugging, and Termination protection as appropriate for your environment.
Install the Master signing agent:
Go to Additional Options > Bootstrap Actions and select bootstrap action "Run if" and click Configure and add to open the Add Bootstrap Action dialog.
In this dialog, set the name to Privacera Signing Agent for Master, copy the following script into Optional Arguments, and click Add when done. Replace <emr-script-download-url> with your own value.
instance.isMaster=true "wget <emr-script-download-url>; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh spark-fbac"
The Worker signing agent is installed in the same way. Under Additional Options, expand Bootstrap Actions, select bootstrap action "Run if", and click Configure and add to open the Add Bootstrap Action dialog. In this dialog, set the name to Privacera Signing Agent for Worker and copy the following script into Optional Arguments. Replace <emr-script-download-url> with your own value.
instance.isMaster=false "wget <emr-script-download-url>; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh spark-fbac"
Configure security options
Complete Security Options as appropriate for your environment.
Open Security Configuration and select the configuration you created earlier, e.g. "PRIVACERA_KDC". Then in the following fields, enter values:
Realm
KDC admin password
Click Create cluster to complete.
EMRFS S3
This topic describes how to connect the EMRFS S3 application to PrivaceraCloud. You only need to enable Access Management to start controlling access on EMRFS S3.
Connect application
Go to Settings > Applications.
In the Applications screen, select EMRFS S3.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable the Access Management for EMRFS S3.
The following message is displayed: Save the setting to start controlling access on EMRFS S3.
Click Save.
Files
This topic describes how to connect Files to PrivaceraCloud. You only need to enable Access Management to start controlling access on Files.
Connect application
Go to Settings > Applications.
In the Applications screen, select Files.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable the Access Management for Files.
The following message is displayed: Save the setting to start controlling access on Files.
Click Save.
File Explorer for Google Cloud Storage
This topic describes how to connect Google Cloud Storage (GCS) to PrivaceraCloud. You only need to enable Access Management to control access to data on GCS and enable the File Explorer.
Connect application
Go to Settings > Applications.
In the Applications screen, select GCS.
Enter the application Name and Description, and then click Save.
On the BASIC tab, enter the following JSON for the Google Cloud Storage Account Credential.
{ "type": "service_account", "project_id": "MyProjectID", "private_key_id": "c97****b5", "private_key": "-----BEGIN PRIVATE KEY-----\nMII***r\nJA4RFEHkNOwuQ****FM\n-----END PRIVATE KEY-----\n", "client_email": "abc@developer.gserviceaccount.com", "client_id": "1**8372", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/5**7-compute%40developer.gserviceaccount.com" }
To validate the credentials, click Test Connection.
Click Save.
Using File Explorer with GCS
Go to Data Inventory > File Explorer and select your GCS data.
Glue
This topic describes how to connect the Glue application to PrivaceraCloud. You only need to enable Access Management to start controlling access on Glue.
Prerequisites
Connect the S3 application to PrivaceraCloud before connecting the Glue application.
Connect application
Go to Settings > Applications.
In the Applications screen, select Glue.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable Access Management for Glue.
The following message is displayed: Save the setting to start controlling access on Glue.
Click Save.
Enable Privacera Access Management
Click the toggle button to enable Privacera Access Management for your application.
On the BASIC tab, enter values in the following fields.
With Use IAM Role disabled:
AWS Access Key: AWS data repository host account Access Key.
AWS Secret Key: AWS data repository host account Secret Key.
AWS Region: AWS S3 bucket region.
With Use IAM Role enabled:
AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.
AWS IAM Role External Id: For additional security, an external ID can be attached to the IAM role you configured. This ensures that your IAM role can be assumed by PrivaceraCloud only when the configured external ID is passed.
Note
The external ID is stored encrypted. It is never displayed in the UI or otherwise made visible.
AWS Region: AWS S3 bucket region.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Recommended: Install the AWS CLI.
Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.
Recommended: Validate connectivity by running an AWS CLI command for Glue, such as:
aws glue get-catalog-import-status
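You can also list the Glue databases visible through the proxy:
aws glue get-databases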
Google BigQuery
This topic describes how to connect a Google BigQuery application to PrivaceraCloud.
Connect Application
Go to Settings > Applications.
On the Applications screen, select Google BigQuery.
Enter the application Name and Description, and then click SAVE.
Click the toggle button to enable Access Management for Google BigQuery.
In the BASIC tab, enter the values in the required(*) fields and click SAVE.
In the ADVANCED tab, you can add custom properties.
Caution
Advanced properties should be modified in consultation with Privacera.
Click the IMPORT PROPERTIES link to browse and import application properties.
Connector Properties
Basic fields
Field name | Type | Default | Required | Description |
---|---|---|---|---|
BigQuery project location | | | Yes | Specifies the geographical region where the taxonomy for PolicySync should be created. |
BigQuery project id | | | Yes | Specifies the Google project ID where your Google BigQuery data source resides. For example: |
Service account email | | | Yes | Specifies the service account email address that PolicySync uses. You must specify this value if you are not using a Google Cloud Platform (GCP) virtual machine attached service account. |
BigQuery private key content | | | No | Specifies the Google Cloud Platform (GCP) account credential key JSON content. PolicySync uses this data to connect to Google BigQuery. |
Projects to set access control policies | | | Yes | Specifies a comma-separated list of project names to which access control is managed by PolicySync. If unset, PolicySync manages all projects. If specified, use the following format. You can use wildcards. Names are case-sensitive. The list of projects to ignore takes precedence over any projects specified by this setting. An example list of projects might resemble the following: |
Native public group identity name | | | Yes | Set this property to your preferred value. PolicySync uses this native public group for access grants whenever a policy is created that refers to the public group. The following values are allowed: |
Enable audit | | | Yes | Specifies whether Privacera fetches access audit data from the data source. |
Advanced fields
Field name | Type | Default | Required | Description |
---|---|---|---|---|
Create custom iam roles in gcp | | | No | Specifies whether PolicySync automatically creates custom IAM roles in your Google Cloud Platform project or organization for fine-grained access control (FGAC). If set to |
GCP custom iam roles scope | | | No | Specifies whether PolicySync creates and uses custom IAM roles at the project or organizational level in Google Cloud Platform (GCP). The following values are allowed: |
GCP organization id | | | No | Specifies the Google Cloud Platform (GCP) organizational ID. Specify this only if you configured PolicySync to use custom IAM roles at the organizational level. |
Datasets to set access control policies | | | Yes | Specifies a list of comma-separated datasets that PolicySync manages access control to. You can use wildcards in the value. Names are case-sensitive. If you want to manage all datasets, do not set a value. For example: testproject1.dataset1,testproject2.dataset2,sales_project*.sales* You can configure the postfix by specifying Secure view dataset name postfix. If specified, the Datasets to ignore while setting access control policies setting takes precedence over this setting. |
Tables to set access control policies | | | No | Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Use the following format when specifying a table: <PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME> If specified, Tables to ignore while setting access control policies takes precedence over this setting. If you specify a wildcard, such as in the following example, all matched tables are managed: The specified value, if any, is interpreted in the following ways: |
Projects to ignore while setting access control policies | | | No | Specifies a comma-separated list of project names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all projects are subject to access control. For example: This setting supersedes any values specified by Projects to set access control policies. |
Datasets to ignore while setting access control policies | | | No | Specifies a comma-separated list of dataset names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all datasets are subject to access control. For example: This setting supersedes any values specified by Datasets to set access control policies. |
Tables to ignore while setting access control policies | | | No | Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all tables are subject to access control. Specify tables using the following format: <PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME> This setting supersedes any values specified by Tables to set access control policies. |
Users to set access control policies | | | No | Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive. If not specified, PolicySync manages access control for all users. If specified, Users to be ignored by access control policies takes precedence over this setting. An example user list might resemble the following: |
Groups to set access control policies | | | No | Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive. An example list of groups might resemble the following: If specified, Groups to be ignored by access control policies takes precedence over this setting. |
Users to be ignored by access control policies | | | No | Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control. This setting supersedes any values specified by Users to set access control policies. |
Groups to be ignored by access control policies | | | No | Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control. This setting supersedes any values specified by Groups to set access control policies. |
Set access control policies only on the users from managed groups | | | No | Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false. |
Enforce bigquery native row filter | | | No | Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views. |
Enforce masking policies using secure views | | | No | Specifies whether to use secure view based masking. The default value is |
Enforce row filter policies using secure views | | | No | Specifies whether to use secure view based row filtering. The default value is While Google BigQuery supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended. |
Create secure view for all tables/views | | | No | Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled. |
Default masking value for numeric datatype | | | No | Specifies the masking value used for numeric data types. |
Default masking value for text/string datatype | | | No | Specifies the masking value used for text or string data types. |
Secure view name prefix | | | No | Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is |
Secure view name postfix | | | No | Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is |
Secure view dataset name prefix | | | No | Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is |
Secure view dataset name postfix | | | No | Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is |
Enable this for policy enforcements and user/group/role management. | | | Yes | Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is |
Enable to use data admin functionality. | | | No | This property is used to enable the data admin feature. With this feature enabled you can create all the policies on native tables/views, and respective grants will be made on the secure views of those native tables/views. These secure views will have row filter and masking capability. In case you need to grant permission on the native tables/views then you can select the permission you want plus data admin in the policy. Then those permissions will be granted on both the native table/view as well as its secure view. |
ignore audit for users | | | No | Specifies a comma separated list of users to exclude when fetching access audits. For example: |
project id used to fetch BigQuery audits | | | No | Specifies the project ID where Google BigQuery stores audit log data. |
dataset used to fetch BigQuery audits | | | No | Specifies the name of the dataset where Google BigQuery logs audit data. Privacera uses this data for running audit queries. |
Custom fields
Canonical name | Type | Default | Description |
---|---|---|---|
 | | | Specifies whether PolicySync uses the service account attached to your virtual machine for the credentials to connect to the data source. |
 | | | Specifies a list of mappings between PolicySync custom IAM role names and your custom role names. Use the following format when specifying your custom role names: <PRIVACERA_DEFAULT_ROLE_NAME_1>:<CUSTOM_ROLE_NAME_1> <PRIVACERA_DEFAULT_ROLE_NAME_2>:<CUSTOM_ROLE_NAME_2> The following is a list of the default custom role names: |
 | | | Specifies how PolicySync loads resources from Google BigQuery. The following values are allowed: |
 | | | Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources. |
 | | | Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly. |
 | | | Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on the data source accordingly. |
 | | | Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera. |
 | | | Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the If not specified, no find and replace operation is performed. |
 | | | Specifies a string to replace the characters matched by the regex specified by the If not specified, no find and replace operation is performed. |
 | | | Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the If not specified, no find and replace operation is performed. |
 | | | Specifies a string to replace the characters matched by the regex specified by the If not specified, no find and replace operation is performed. |
 | | | Specifies how PolicySync manages column-level access control. The following values are allowed: |
 | | | Specifies a string to use as part of the name of native row filter and masking policies. |
 | | | Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table proj_priv_ds_priv_data_<ROW_FILTER_ITEM_NUMBER> |
 | | | Specifies the name of the dataset where PolicySync creates custom masking functions. |
 | | | Specifies a suffix to remove from a table or view name. For example, if the table is named You can specify a single suffix or a comma separated list of suffixes. |
 | | | Specifies a suffix to remove from a secure view dataset name. For example, if the dataset is named You can specify a single suffix or a comma separated list of suffixes, such as |
 | | | Specifies the interval at which the authorized view ACLs updater thread updates the permissions in the dataset if any permission updates are pending. |
 | | | Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is |
 | | | Specifies whether PolicySync applies grants and revokes in batches. If enabled, this behavior improves overall performance of applying permission changes. |
 | | | Specifies the maximum interval, in minutes, of the time window that SQL queries use to retrieve access audit information. If there are a large number of audit records, narrowing the window interval improves performance. For example, if the interval is set to SELECT * FROM audits where time_from=00:01 and time_to=00:30; SELECT * FROM audits where time_from=00:31 and time_to=01:00; SELECT * FROM audits where time_from=01:01 and time_to=01:30; |
Kinesis
This topic describes how to connect the Kinesis application to PrivaceraCloud.
Connecting to an AWS hosted data source requires authentication or a trust relationship with those resources. You will provide this information as one step in the AWS Data resource connection. You will also need to specify your AWS Account Region.
Prerequisites
Connect the S3 application to PrivaceraCloud before connecting the Kinesis application.
Connect application
Go to Settings > Applications.
On the Applications screen, select Kinesis.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management with the toggle buttons.
Enable Privacera Access Management
Click the toggle button to enable Privacera Access Management for your application.
On the BASIC tab, enter values in the following fields.
With Use IAM Role disabled:
AWS Access Key: AWS data repository host account Access Key.
AWS Secret Key: AWS data repository host account Secret Key.
AWS Region: AWS S3 bucket region.
With Use IAM Role enabled:
AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.
AWS IAM Role External Id: For additional security, an external ID can be attached to the IAM role you configured. This ensures that your IAM role can be assumed by PrivaceraCloud only when the configured external ID is passed.
Note
The external ID is stored encrypted. It is never displayed in the UI or otherwise made visible.
AWS Region: AWS S3 bucket region.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Recommended: Install the AWS CLI.
Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.
Recommended: Validate connectivity by running an AWS CLI command for Kinesis, such as:
aws kinesis list-streams
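If the streams are listed successfully, you can also describe one of them (replace <STREAM_NAME> with a stream from your account):
aws kinesis describe-stream --stream-name <STREAM_NAME>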
Lambda
This topic describes how to connect the Lambda application to PrivaceraCloud.
Connecting to an AWS hosted data source requires authentication or a trust relationship with those resources. You will provide this information as one step in the AWS Data resource connection. You will also need to specify your AWS Account Region.
Prerequisites in AWS console
The following prerequisites must be met:
Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy in the AWS Console.
Configure a trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM role.
Connect application
Go to Settings > Applications.
On the Applications screen, select Lambda.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management with the toggle buttons.
Enable Privacera Access Management
Click the toggle button to enable Privacera Access Management for your application.
On the BASIC tab, enter values in the following fields.
With Use IAM Role disabled:
AWS Access Key: AWS data repository host account Access Key.
AWS Secret Key: AWS data repository host account Secret Key.
AWS Region: AWS S3 bucket region.
With Use IAM Role enabled:
AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.
AWS IAM Role External Id: For additional security, an external ID can be attached to the IAM role you configured. This ensures that your IAM role can be assumed by PrivaceraCloud only when the configured external ID is passed.
Note
The external ID is stored encrypted. It is never displayed in the UI or otherwise made visible.
AWS Region: AWS S3 bucket region.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Recommended: Install the AWS CLI.
Open Launch Pad and follow the steps to install and configure the AWS CLI on your workstation so that it uses the PrivaceraCloud Data Server proxy.
Recommended: Validate connectivity by running an AWS CLI command for Lambda, such as:
aws lambda list-functions
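You can also fetch the configuration of a specific function (replace <FUNCTION_NAME> with a function from your account):
aws lambda get-function --function-name <FUNCTION_NAME>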
Microsoft SQL Server
This topic describes how to connect the Microsoft SQL Server (MSSQL) application to PrivaceraCloud.
Connect application
Go to Settings > Applications.
In the Applications screen, select MS SQL.
Enter the application Name and Description, and then click Save.
You can see Access Management and Data Discovery with toggle buttons.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see Discovery.
Click the toggle button to enable Access Management for MS SQL.
In the BASIC tab, enter the values in the given fields and click Save. For property details and descriptions, see the table below:
Note
The other properties are advanced and should be modified only in consultation with Privacera.
Basic fields
Table 11. Basic fields
Field name
Type
Default
Required
Description
MSSQL JDBC URL
string
Yes
Specifies the JDBC URL for the Microsoft SQL Server connector.
Use the following format for the JDBC string:
jdbc:sqlserver://<JDBC_SQLSERVER_URL_WITH_PORT_NUMBER>
MSSQL jdbc username
string
Yes
Specifies the JDBC username to use.
MSSQL jdbc password
string
Yes
Specifies the JDBC password to use.
MSSQL master database
string
master
Yes
Specifies the name of the JDBC master database that PolicySync establishes an initial connection to.
MSSQL authentication type for the database engine
string
SqlPassword
Yes
Specifies the authentication type for the database engine. The following types are supported:
If the user specified by MSSQL jdbc username is a local user, specify:
SqlPassword
If the user specified by MSSQL jdbc username is a Microsoft Azure Active Directory user, specify:
ActiveDirectoryPassword
Default password for new mssql user
string
Yes
Specifies the password to use when PolicySync creates new users.
MSSQL resource owner
string
No
Specifies the role that owns the resources managed by PolicySync.
If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.
If a value is specified, the owner of the resource will be changed to the specified value.
The following resource types are supported:
Database
Schemas
Tables
Views
Enable policy enforcements and user/group/role management
boolean
true
Yes
Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.
Enable access audits
boolean
false
Yes
Specifies whether Privacera fetches access audit data from the data source.
If specified, you must specify a value for the MSSQL Audits storage URL setting.
MSSQL Audits storage URL
string
No
Specifies the URL for the audit logs provided by the Azure SQL Auditing service. For example:
https://test.blob.core.windows.net/sqldbauditlogs/test
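For illustration only, a value for the MSSQL JDBC URL field for a hypothetical Azure SQL Database server named example-server would look like:
jdbc:sqlserver://example-server.database.windows.net:1433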
Advanced fields
Table 12. Advanced fields
Field name
Type
Default
Required
Description
Databases to set access control policies
string
No
Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards.
An example list of databases might resemble the following:
testdb1,testdb2,sales_db*.
If specified, Databases to ignore while setting access control policies takes precedence over this setting.
Schemas to set access control policies
string
No
Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards.
Use the following format when specifying a schema:
<DATABASE_NAME>.<SCHEMA_NAME>
If specified, Schemas to ignore while setting access control policies takes precedence over this setting.
If you specify a wildcard, such as in the following example, all schemas are managed:
<DATABASE_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all schemas.
If set to none, no schemas are managed.
Tables to set access control policies
string
No
Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards.
Use the following format when specifying a table:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
If specified, ignore.table.list takes precedence over this setting.
If you specify a wildcard, such as in the following example, all matched tables are managed:
<DATABASE_NAME>.<SCHEMA_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all tables.
If set to none, no tables are managed.
Databases to ignore while setting access control policies
string
No
Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all databases are subject to access control.
For example:
testdb1,testdb2,sales_db*
This setting supersedes any values specified by Databases to set access control policies.
Schemas to ignore while setting access control policies
string
No
Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all schemas are subject to access control.
For example:
testdb1.schema1,testdb2.schema2,sales_db*.sales*
This setting supersedes any values specified by Schemas to set access control policies.
Regex to find special characters in user names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the String to replace with the special characters found in user names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in user names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in group names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the String to replace with the special characters found in group names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in group names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in role names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the String to replace with the special characters found in role names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in role names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.
If not specified, no find and replace operation is performed.
Persist case sensitivity of user names
boolean
false
No
Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.
Persist case sensitivity of group names
boolean
false
No
Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.
Persist case sensitivity of role names
boolean
false
No
Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.
Manage user from portal
boolean
false
No
Specifies whether PolicySync maintains user membership in roles in the Microsoft SQL Server data source.
Manage group from portal
boolean
false
No
Specifies whether PolicySync creates groups from Privacera in the Microsoft SQL Server data source.
Manage role from portal
boolean
false
No
Specifies whether PolicySync creates roles from Privacera in the Microsoft SQL Server data source.
Users to set access control policies
string
No
Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards.
If not specified, PolicySync manages access control for all users.
If specified, Users to be ignored by access control policies takes precedence over this setting.
An example user list might resemble the following:
user1,user2,dev_user*.
Groups to set access control policies
string
No
Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards.
An example list of groups might resemble the following: group1,group2,dev_group*.
If specified, Groups to be ignored by access control policies takes precedence over this setting.
Roles to set access control policies
string
No
Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards.
An example list of roles might resemble the following: role1,role2,dev_role*.
If specified, Roles to be ignored by access control policies takes precedence over this setting.
Users to be ignored by access control policies
string
No
Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all users are subject to access control.
This setting supersedes any values specified by Users to set access control policies.
Groups to be ignored by access control policies
string
No
Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all groups are subject to access control.
This setting supersedes any values specified by Groups to set access control policies.
Roles to be ignored by access control policies
string
No
Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all roles are subject to access control.
This setting supersedes any values specified by Roles to set access control policies.
Prefix of mssql roles for portal users
string
priv_user_
No
Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.
Prefix of postgres roles for portal group
string
priv_group_
No
Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.
Prefix of postgres roles for portal role
string
priv_role_
No
Specifies the prefix that PolicySync uses when creating roles from Privacera in the Microsoft SQL Server data source.
For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.
Use mssql native public group for public group access policies
boolean
false
No
Specifies whether PolicySync uses the Microsoft SQL Server native public group for access grants whenever a policy refers to a public group. The default value is false.
Set access control policies only on the users from managed groups
boolean
false
No
Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.
Set access control policies only on the users/groups from managed roles
boolean
false
No
Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false.
Enforce MSSQL native row filter
boolean
true
No
Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.
Enforce masking policies using secure views
boolean
true
No
Specifies whether to use secure view based masking. The default value is true.
Enforce row filter policies using secure views
boolean
false
No
Specifies whether to use secure view based row filtering. The default value is false.
While Microsoft SQL Server supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.
Create secure view for all tables/views
boolean
false
No
Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.
Default masked value for numeric datatype columns
integer
0
No
Specifies the default masking value for numeric column types.
Default masked value for text/varchar datatype columns
string
<MASKED>
No
Specifies the default masking value for text and string column types.
Secure view name prefix
string
No
Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.
Secure view name postfix
string
_secure
No
Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.
Secure view schema name prefix
string
No
Specifies a prefix string to apply to a secure schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view schema name for a schema named example1 is dev_example1.
Secure view schema name postfix
string
No
Specifies a postfix string to apply to a secure view schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a schema named example1 is example1_dev.
Enable dataadmin
boolean
true
No
This property is used to enable the data admin feature. With this feature enabled you can create all the policies on native tables/views, and respective grants will be made on the secure views of those native tables/views. These secure views will have row filter and masking capability. In case you need to grant permission on the native tables/views then you can select the permission you want plus data admin in the policy. Then those permissions will be granted on both the native table/view as well as its secure view.
Custom fields
Table 13. Custom fields
Canonical name
Type
Default
Description
load.resources
string
load_from_database_columns
Specifies how PolicySync loads resources from Microsoft SQL Server. The following values are allowed:
load_md: Load resources from Microsoft SQL Server with a top-down approach, that is, it first loads the project and then the database, followed by tables and their columns.
load_from_database_columns: Load resources one by one for each resource type, that is, it loads all databases first, then all schemas in all databases, followed by all tables in all schemas and their columns. This mode is recommended since it is faster than the load_md mode.
sync.interval.sec
integer
60
Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.
sync.serviceuser.interval.sec
integer
420
Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.
sync.servicepolicy.interval.sec
integer
540
Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.
audit.interval.sec
integer
30
Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.
ignore.table.list
string
Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Specify tables using the following format:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
This setting supersedes any values specified by Tables to set access control policies.
user.name.case.conversion
string
lower
Specifies how user name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of user names is set to true.
group.name.case.conversion
string
lower
Specifies how group name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of group names is set to true.
role.name.case.conversion
string
lower
Specifies how role name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of role names is set to true.
user.filter.with.email
string
Set this property to true if you want to manage only users who have an email address associated with them in the portal.
masked.date.value
string
null
Specifies the default masking value for date column types.
secure.view.name.remove.suffix.list
string
Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
secure.view.schema.name.remove.suffix.list
string
Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
perform.grant.updates.max.retry.attempts
integer
2
Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.
audit.initial.pull.min
integer
30
Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Microsoft SQL Server.
load.audits
string
load
Specifies the method that PolicySync uses to load access audit information.
The following values are valid:
load: Use SQL queries
load.users
string
load
Specifies how PolicySync loads users from Microsoft SQL Server. The following values are valid:
load
load_db
external.user.as.internal
boolean
false
Specifies whether PolicySync creates local users for external users.
manage.group.policy.only
boolean
false
Specifies whether access policies apply to only groups. If enabled, any policies that apply to users or roles are ignored.
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the toggle button to enable Data Discovery for your application.
In the BASIC tab, enter values in the following fields.
JDBC URL
JDBC Username
JDBC Password
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Add data source
To add resources using this connection as Discovery targets, see Discovery Scan Topics.
MySQL for Discovery
This topic describes how to connect a MySQL application to the PrivaceraCloud Discovery service.
Prerequisites
Before connecting the MySQL application, make sure you have the following information available:
JDBC URL
JDBC Username
JDBC Password
Connect application
Go to Settings > Applications.
In the Applications screen, select MySQL.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable Data Discovery for MySQL.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.
In the BASIC tab, enter the values in the following fields:
JDBC URL
JDBC Username
JDBC Password
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Add data source
To add resources using this connection as Discovery targets, see Discovery Scan Topics.
Open Source Spark
First obtain an account-specific script from your PrivaceraCloud account, and then add a startup step to open source Spark.
Three configurations are available depending on your requirement. Fine-Grained Access Control (FGAC) and Object-Level Access Control (OLAC) are supported in each of the configurations.
Obtain installation script
Obtain the account-unique <privacera-plugin-script-download-url>. This script and other commands run in your Spark command shell to complete the PrivaceraCloud installation.
Steps:
Go to Settings > API Key.
Use an existing active API Key or generate a new one.
Note
Make sure the Expiry column is set to Never Expires.
Click the i icon to get the scripts.
On the Plugins Setup Script, click the COPY URL button. Save this value on your Spark server. It is needed as the
<privacera-plugin-script-download-url>
in the next step.
Configure Privacera Plugin on local/virtual machine
OLAC is supported only with JWT token authentication.
See Data access methods.
Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.
privacera.jwt.oauth.enable=true
privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
privacera.jwt.0.token.subject=<PLEASE_CHANGE>
privacera.jwt.0.token.secret=<PLEASE_CHANGE>
privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
privacera.jwt.0.token.parserType=<PLEASE_CHANGE>
Property
Description
Example
privacera.jwt.oauth.enable
Property to enable JWT auth in Privacera services.
true
privacera.jwt.{index}.token.issuer
Property to enter the URL of the identity provider.
https://your-idp-domain.com
privacera.jwt.{index}.token.publickey
The JWT token public key in String format (all newlines must be removed).
-----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----
privacera.jwt.{index}.token.secret
[Optional] If the JWT token has been encrypted using a secret, use this property to set the secret.
privacera-api
privacera.jwt.{index}.token.subject
[Optional] If the JWT token has a subject.
api-token
privacera.jwt.{index}.token.userKey
Property to define a unique userKey whose value will be used as the user in Ranger policies.
client-id
privacera.jwt.{index}.token.groupKey
Property to define a unique groupKey whose value will be used as the group in Ranger policies.
scope
privacera.jwt.{index}.token.parser.type
JWT Parser Type. Values can be PING_IDENTITY or KEYCLOAK.
PING_IDENTITY: When groupKey is an array
KEYCLOAK: When groupKey is a space-separated string
KEYCLOAK
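For reference, a filled-in configuration with two token issuers might look like the following; all values here are illustrative placeholders, and the issuer URLs, keys, and claim names depend on your identity provider:
privacera.jwt.oauth.enable=true
privacera.jwt.0.token.issuer=https://idp.example.com/oauth2
privacera.jwt.0.token.publickey=-----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----
privacera.jwt.0.token.userKey=client_id
privacera.jwt.0.token.groupKey=scope
privacera.jwt.0.token.parserType=PING_IDENTITY
privacera.jwt.1.token.issuer=https://keycloak.example.com/realms/demo
privacera.jwt.1.token.publickey=-----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----
privacera.jwt.1.token.userKey=preferred_username
privacera.jwt.1.token.groupKey=scope
privacera.jwt.1.token.parserType=KEYCLOAK
The optional subject and secret properties are omitted in this sketch; add them only if your tokens carry a subject claim or are signed with a shared secret.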
After adding the properties, run the Dataserver, and then proceed to the next step.
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
Create a directory
~/privacera
and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
Create a file
privacera_env.sh
which will contain the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark" SPARK_PLUGIN_TYPE="OLAC" SPARK_HOME="<PLEASE_CHANGE>" SPARK_CLUSTER_NAME="privacera-spark"
Property
Description
PLUGIN_TYPE
Type of Privacera Plugin which you want to install.
SPARK_PLUGIN_TYPE
Spark Plugin type OLAC. JWT Authentication will be enabled by default.
SPARK_HOME
This is the home directory of your Spark installation. For example, the directory path can be /home/user/spark.
SPARK_CLUSTER_NAME
Cluster Name which will show up in the Privacera Ranger Audits page.
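As a reference, a filled-in privacera_env.sh for this OLAC setup might look like the following; the Spark home path and cluster name are illustrative:
PLUGIN_TYPE="spark"
SPARK_PLUGIN_TYPE="OLAC"
SPARK_HOME="/home/user/spark"
SPARK_CLUSTER_NAME="privacera-spark-olac-dev"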
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will set up the Privacera Plugin in the OLAC mode.
We recommend using FGAC with JWT authentication enabled.
Note
If JWT authentication is disabled, access control falls back to the system user or proxy user.
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
Create a directory
~/privacera
and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
Create a file
privacera_env.sh
which will contain the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark" SPARK_PLUGIN_TYPE="FGAC" SPARK_HOME="<PLEASE_CHANGE>" SPARK_CLUSTER_NAME="privacera-spark"
Property
Description
PLUGIN_TYPE
Type of Privacera Plugin which you want to install.
SPARK_PLUGIN_TYPE
Spark Plugin type FGAC.
SPARK_HOME
This is the home directory of your Spark installation. For example, the directory path can be /home/user/spark.
SPARK_CLUSTER_NAME
Cluster Name which will show up in the Privacera Ranger Audits page.
Add the following properties when JWT auth is enabled:
JWT_OAUTH_ENABLE="true" JWT_ISSUER="<PLEASE_CHANGE>" JWT_PUBLIC_KEY="<PLEASE_CHANGE>" #JWT_SECRET="<PLEASE_CHANGE>" #JWT_SUBJECT="<PLEASE_CHANGE>" JWT_USERKEY="<PLEASE_CHANGE>" JWT_GROUPKEY="<PLEASE_CHANGE>" JWT_PARSER_TYPE="<PLEASE_CHANGE>"
Note
To configure multiple JWTs, refer to FGAC with multiple JWT configurations below.
Property
Description
Example
JWT_OAUTH_ENABLE
To enable JWT authentication.
JWT_OAUTH_ENABLE="true"
JWT_ISSUER
The URL of the identity provider.
JWT_ISSUER="https://your-idp-domain.com"
JWT_PUBLIC_KEY
The JWT token public key in String format.
JWT_SECRET
Uncomment and add value if the JWT token has been encrypted using secret.
JWT_SECRET="privacera-secret"
JWT_SUBJECT
Uncomment and add value if JWT Token has a subject.
JWT_SUBJECT="api-token"
JWT_USERKEY
Property to define a unique userKey whose value will be used as the user in Ranger policies.
JWT_USERKEY="client_id"
JWT_GROUPKEY
Property to define a unique groupKey whose value will be used as the group in Ranger policies.
JWT_GROUPKEY="scope"
JWT_PARSER_TYPE
JWT Parser Type. Values can be PING_IDENTITY or KEYCLOAK.
JWT_PARSER_TYPE="KEYCLOAK"
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will set up the Privacera Plugin in the FGAC mode.
FGAC with multiple JWT configurations
To configure multiple JWTs, add the following index-based properties in the privacera_env.sh file, where {index} starts at 0 and increments for each additional configuration.
JWT_OAUTH_ENABLE="true" JWT_{index}_ISSUER="<PLEASE_CHANGE>" JWT_{index}_PUBLICKEY="<PLEASE_CHANGE>" JWT_{index}_SUBJECT="<PLEASE_CHANGE>" JWT_{index}_SECRET="<PLEASE_CHANGE>" JWT_{index}_USERKEY="<PLEASE_CHANGE>" JWT_{index}_GROUPKEY="<PLEASE_CHANGE>" JWT_{index}_PARSER_TYPE="<PLEASE_CHANGE>"
For example, for two configurations (the index starts at 0):
JWT_OAUTH_ENABLE="true" JWT_0_ISSUER="https://mydomain.com/issuer" JWT_0_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----" JWT_0_SUBJECT=”principal1” JWT_0_SECRET=”shkl-XXXX-XXXX-XXXX” JWT_0_USERKEY="client_id" JWT_0_GROUPKEY="scope" JWT_0_PARSER_TYPE="PING_IDENTITY" JWT_1_ISSUER="https://mydomain.com/issuer" JWT_1_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----" JWT_1_SUBJECT=”principal2” JWT_1_SECRET=”suhjk-XXXX-XXXX-XXXX” JWT_1_USERKEY="client_id" JWT_1_GROUPKEY="scope" JWT_1_PARSER_TYPE="KEYCLOAK"
Configure Privacera Plugin in an Existing Docker File
If you have an existing open source Spark setup running on Kubernetes, you can update the existing Docker file used to create the Spark image and add steps for installing the Privacera plugin.
OLAC is supported only with JWT token authentication.
Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist.
See Data access methods.
Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.
privacera.jwt.oauth.enable=true
privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
privacera.jwt.0.token.subject=<PLEASE_CHANGE>
privacera.jwt.0.token.secret=<PLEASE_CHANGE>
privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
privacera.jwt.0.token.parserType=<PLEASE_CHANGE>
Property
Description
Example
privacera.jwt.oauth.enable
Property to enable JWT auth in Privacera services.
true
privacera.jwt.{index}.token.issuer
Property to enter the URL of the identity provider.
https://your-idp-domain.com
privacera.jwt.{index}.token.publickey
The JWT token public key in String format (all newlines must be removed).
-----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----
privacera.jwt.{index}.token.secret
[Optional] If the JWT token has been encrypted using a secret, use this property to set the secret.
privacera-api
privacera.jwt.{index}.token.subject
[Optional] If the JWT token has a subject.
api-token
privacera.jwt.{index}.token.userKey
Property to define a unique userKey whose value will be used as the user in Ranger policies.
client-id
privacera.jwt.{index}.token.groupKey
Property to define a unique groupKey whose value will be used as the group in Ranger policies.
scope
privacera.jwt.{index}.token.parser.type
JWT Parser Type. Values can be PING_IDENTITY or KEYCLOAK.
PING_IDENTITY: When groupKey is an array
KEYCLOAK: When groupKey is a space-separated string
KEYCLOAK
After adding the properties, run the Dataserver, and then proceed to the next step.
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property.
######## Install Privacera Spark Plugin Start ###########
# ENV SPARK_HOME /opt/apache/spark
RUN apt-get -y install zip unzip wget
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="OLAC"
ENV SPARK_CLUSTER_NAME="privacera-spark"
RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
RUN chmod +x privacera_plugin.sh
RUN ./privacera_plugin.sh
######## Install Privacera Spark Plugin End ###########
Save the Docker file and build the image. You will now have a Docker image for open source Spark with the Privacera plugin enabled.
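For example, a typical build-and-push sequence might look like the following; the registry, repository, and tag are placeholders for your own values:
docker build -t <YOUR_REGISTRY>/privacera-spark-plugin:olac-v1 .
docker push <YOUR_REGISTRY>/privacera-spark-plugin:olac-v1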
We recommend using FGAC with JWT authentication enabled.
Note
If JWT authentication is disabled, access control falls back to the system user or proxy user.
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property. For the JWT properties, refer to the table below.
######## Install Privacera Spark Plugin Start ###########
# ENV SPARK_HOME /opt/apache/spark
RUN apt-get -y install zip unzip wget
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="FGAC"
ENV SPARK_CLUSTER_NAME="privacera-spark"
ENV JWT_OAUTH_ENABLE "true"
ENV JWT_ISSUER=<PLEASE_CHANGE>
ENV JWT_PUBLIC_KEY=<PLEASE_CHANGE>
ENV JWT_SECRET=<PLEASE_CHANGE>
ENV JWT_SUBJECT=<PLEASE_CHANGE>
ENV JWT_USERKEY=<PLEASE_CHANGE>
ENV JWT_GROUPKEY=<PLEASE_CHANGE>
ENV JWT_PARSER_TYPE=<PLEASE_CHANGE>
RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
RUN chmod +x privacera_plugin.sh
RUN ./privacera_plugin.sh
######## Install Privacera Spark Plugin End ###########
Note
To configure multiple JWTs, refer to FGAC with Multiple JWT Configuration in an Existing Docker File below.
Property
Description
Example
JWT_OAUTH_ENABLE
To enable JWT authentication.
JWT_OAUTH_ENABLE="true"
JWT_ISSUER
The URL of the identity provider.
JWT_ISSUER="https://your-idp-domain.com"
JWT_PUBLIC_KEY
The JWT token public key in String format.
JWT_SECRET
Uncomment and add value if the JWT token has been encrypted using secret.
JWT_SECRET="privacera-secret"
JWT_SUBJECT
Uncomment and add value if JWT Token has a subject.
JWT_SUBJECT="api-token"
JWT_USERKEY
Property to define a unique userKey whose value will be used as the user in Ranger policies.
JWT_USERKEY="client_id"
JWT_GROUPKEY
Property to define a unique groupKey whose value will be used as the group in Ranger policies.
JWT_GROUPKEY="scope"
JWT_PARSER_TYPE
JWT Parser Type. Values can be PING_IDENTITY or KEYCLOAK.
JWT_PARSER_TYPE="KEYCLOAK"
Save the Docker file and build the image. You will now have a Docker image for open source Spark with the Privacera plugin enabled.
FGAC with Multiple JWT Configuration in an Existing Docker File
To configure multiple JWTs, add the following index-based environment variables in the Docker file, where {index} starts at 0 and increments for each additional configuration.
ENV JWT_OAUTH_ENABLE "true"
ENV JWT_{index}_ISSUER="<PLEASE_CHANGE>"
ENV JWT_{index}_PUBLICKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_SUBJECT="<PLEASE_CHANGE>"
ENV JWT_{index}_SECRET="<PLEASE_CHANGE>"
ENV JWT_{index}_USERKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_GROUPKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_PARSER_TYPE="<PLEASE_CHANGE>"
For example, for two configurations (the index starts at 0):
######## Install Privacera Spark Plugin Start ############
ENV SPARK_HOME /opt/apache/spark
RUN apt-get -y install zip unzip wget
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="FGAC"
ENV SPARK_CLUSTER_NAME="privacera-spark"
ENV JWT_OAUTH_ENABLE "true"
ENV JWT_0_ISSUER="https://mydomain.com/issuer"
ENV JWT_0_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
ENV JWT_0_SUBJECT="principal1"
ENV JWT_0_SECRET="shkl-XXXX-XXXX-XXXX"
ENV JWT_0_USERKEY="client_id"
ENV JWT_0_GROUPKEY="scope"
ENV JWT_0_PARSER_TYPE="PING_IDENTITY"
ENV JWT_1_ISSUER="https://mydomain.com/issuer"
ENV JWT_1_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
ENV JWT_1_SUBJECT="principal2"
ENV JWT_1_SECRET="suhjk-XXXX-XXXX-XXXX"
ENV JWT_1_USERKEY="client_id"
ENV JWT_1_GROUPKEY="scope"
ENV JWT_1_PARSER_TYPE="KEYCLOAK"
Configure Privacera Plugin using Privacera Scripts
The scripts help you create an open source Spark image with the Privacera plugin and push it to the specified Docker hub, which can then be used to run Spark with Privacera.
OLAC is supported only with JWT token authentication.
Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist.
See Data access methods.
Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.
privacera.jwt.oauth.enable=true
privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
privacera.jwt.0.token.subject=<PLEASE_CHANGE>
privacera.jwt.0.token.secret=<PLEASE_CHANGE>
privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
privacera.jwt.0.token.parserType=<PLEASE_CHANGE>
Property
Description
Example
privacera.jwt.oauth.enable
Property to enable JWT auth in Privacera services.
true
privacera.jwt.{index}.token.issuer
Property to enter the URL of the identity provider.
https://your-idp-domain.com
privacera.jwt.{index}.token.publickey
The JWT token public key in String format (all newlines must be removed).
-----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY-----
privacera.jwt.{index}.token.secret
[Optional] If the JWT token has been encrypted using a secret, use this property to set the secret.
privacera-api
privacera.jwt.{index}.token.subject
[Optional] If the JWT token has a subject.
api-token
privacera.jwt.{index}.token.userKey
Property to define a unique userKey whose value will be used as the user in Ranger policies.
client-id
privacera.jwt.{index}.token.groupKey
Property to define a unique groupKey whose value will be used as the group in Ranger policies.
scope
privacera.jwt.{index}.token.parser.type
JWT Parser Type. Values can be PING_IDENTITY or KEYCLOAK.
PING_IDENTITY: When groupKey is an array
KEYCLOAK: When groupKey is a space-separated string
privacera.jwt.token.parser.type=KEYCLOAK
After adding the properties, run the Dataserver, and then proceed to the next step.
SSH to the instance where you want to install Privacera Plugin.
Create a directory
~/privacera
and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
Create a file
privacera_env.sh
which will contain the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark_k8s" SPARK_PLUGIN_TYPE="OLAC" HUB="<PLEASE_CHANGE>" HUB_USERNAME="<PLEASE_CHANGE>" HUB_PASSWORD="<PLEASE_CHANGE>" ENV_TAG="<PLEASE_CHANGE>"
Property
Description
PLUGIN_TYPE
Type of Privacera Plugin which you want to install.
SPARK_PLUGIN_TYPE
Spark Plugin type OLAC. JWT Authentication will be enabled by default.
HUB
The Docker hub URL where you want the image to be pushed.
HUB_USERNAME
Docker hub username.
HUB_PASSWORD
Docker hub password.
ENV_TAG
Docker image tag.
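For reference, a filled-in set of these properties might look like the following; the registry, credentials, and tag shown are illustrative:
PLUGIN_TYPE="spark_k8s"
SPARK_PLUGIN_TYPE="OLAC"
HUB="docker.io/mycompany"
HUB_USERNAME="mycompany-ci"
HUB_PASSWORD="<YOUR_DOCKER_HUB_PASSWORD>"
ENV_TAG="privacera-olac-v1"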
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.
We recommend using FGAC with JWT authentication enabled.
Note
If JWT authentication is disabled, access control falls back to the system user or proxy user.
SSH to the instance where you want to install Privacera Plugin.
Create a directory
~/privacera
and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
Create a file
privacera_env.sh
which will contain the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark_k8s" SPARK_PLUGIN_TYPE="FGAC" SPARK_HOME="<PLEASE_CHANGE>" SPARK_CLUSTER_NAME="privacera-spark"
Property
Description
PLUGIN_TYPE
Type of Privacera Plugin which you want to install.
SPARK_PLUGIN_TYPE
Spark Plugin type FGAC.
SPARK_HOME
This is the home directory of your Spark installation. For example, the directory path can be /home/user/spark.
SPARK_CLUSTER_NAME
Cluster Name which will show up in the Privacera Ranger Audits page.
Add the following properties when JWT auth is enabled:
JWT_OAUTH_ENABLE="true" JWT_ISSUER="<PLEASE_CHANGE>" JWT_PUBLIC_KEY="<PLEASE_CHANGE>" #JWT_SECRET="<PLEASE_CHANGE>" #JWT_SUBJECT="<PLEASE_CHANGE>" JWT_USERKEY="<PLEASE_CHANGE>" JWT_GROUPKEY="<PLEASE_CHANGE>" JWT_PARSER_TYPE="<PLEASE_CHANGE>"
Property
Description
Example
JWT_OAUTH_ENABLE
To enable JWT authentication.
JWT_OAUTH_ENABLE="true"
JWT_ISSUER
The URL of the identity provider.
JWT_ISSUER="https://your-idp-domain.com"
JWT_PUBLIC_KEY
The JWT token public key in String format.
JWT_SECRET
Uncomment and add value if the JWT token has been encrypted using secret.
JWT_SECRET="privacera-secret"
JWT_SUBJECT
Uncomment and add value if JWT Token has a subject.
JWT_SUBJECT="api-token"
JWT_USERKEY
Property to define a unique userKey whose value will be used as the user in Ranger policies.
JWT_USERKEY="client_id"
JWT_GROUPKEY
Property to define a unique groupKey whose value will be used as the group in Ranger policies.
JWT_GROUPKEY="scope"
JWT_PARSER_TYPE
JWT Parser Type. Values can be PING_IDENTITY or KEYCLOAK.
JWT_PARSER_TYPE="KEYCLOAK"
Add the following Docker Hub properties:
HUB="<PLEASE_CHANGE>" HUB_USERNAME="<PLEASE_CHANGE>" HUB_PASSWORD="<PLEASE_CHANGE>" ENV_TAG="<PLEASE_CHANGE>"
Property
Description
HUB
The Docker hub URL where you want the image to be pushed.
HUB_USERNAME
Docker hub username.
HUB_PASSWORD
Docker hub password.
ENV_TAG
Docker image tag.
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.
Deploy Spark on EKS Cluster
SSH to the instance where you want to deploy Spark on the EKS cluster.
Get the Privacera Plugin download URL and set it in the following property. See Obtain installation script.
export PRIVACERA_DOWNLOAD_URL="<PLEASE_CHANGE>"
Create the spark-k8s-artifacts folder.
mkdir ~/privacera/spark-k8s-artifacts
cd ~/privacera/spark-k8s-artifacts
Download and extract packages.
wget ${PRIVACERA_DOWNLOAD_URL}/plugin/spark/k8s-spark-deploy.tar.gz -O k8s-spark-deploy.tar.gz
tar xzf k8s-spark-deploy.tar.gz
rm -r k8s-spark-deploy.tar.gz
cd k8s-spark-deploy/
Open the penv.sh file and substitute the values of the following properties. Refer to the table below; an illustrative example follows the table.
Property
Description
Example
SPARK_NAME_SPACE
Kubernetes namespace
privacera-spark-plugin-test
SPARK_PLUGIN_IMAGE
Docker image with hub
${HUB}/privacera-spark-plugin:${ENV_TAG}
SPARK_DOCKER_PULL_SECRET
Secret for docker-registry
spark-plugin-docker-hub
SPARK_PLUGIN_ROLE_BINDING
Spark role Binding
privacera-sa-spark-plugin-role-binding
SPARK_PLUGIN_SERVICE_ACCOUNT
Spark services account
privacera-sa-spark-plugin
SPARK_PLUGN_ROLE
Spark services account role
privacera-sa-spark-plugin-role
SPARK_PLUGIN_APP_NAME
Spark plugin application name
privacera-spark-examples
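As a reference, a penv.sh with the sample values from the table substituted in might look like the following, assuming the file uses shell-style assignments; the image name combines the HUB and ENV_TAG values used when building the image:
SPARK_NAME_SPACE="privacera-spark-plugin-test"
SPARK_PLUGIN_IMAGE="docker.io/mycompany/privacera-spark-plugin:privacera-olac-v1"
SPARK_DOCKER_PULL_SECRET="spark-plugin-docker-hub"
SPARK_PLUGIN_ROLE_BINDING="privacera-sa-spark-plugin-role-binding"
SPARK_PLUGIN_SERVICE_ACCOUNT="privacera-sa-spark-plugin"
SPARK_PLUGN_ROLE="privacera-sa-spark-plugin-role"
SPARK_PLUGIN_APP_NAME="privacera-spark-examples"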
Run the following command to replace the property values in EKS deployment YAML file.
mkdir -p backup
cp *.yml backup/
./replace.sh
Run the following commands to create the EKS resources.
kubectl apply -f namespace.yml
kubectl apply -f service-account.yml
kubectl apply -f role.yml
kubectl apply -f role-binding.yml
Run the following command to create a secret for docker-registry.
kubectl create secret docker-registry spark-plugin-docker-hub --docker-server=<PLEASE_CHANGE> --docker-username=<PLEASE_CHANGE> --docker-password='<PLEASE_CHANGE>' --namespace=<PLEASE_CHANGE>
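For example, a filled-in command for a Docker Hub registry and the sample namespace used above might look like this; all values are illustrative:
kubectl create secret docker-registry spark-plugin-docker-hub --docker-server=docker.io --docker-username=mycompany-ci --docker-password='<YOUR_DOCKER_HUB_PASSWORD>' --namespace=privacera-spark-plugin-test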
Run the following command to deploy a sample Spark application. Replace ${SPARK_NAME_SPACE} with the Kubernetes namespace.
kubectl apply -f privacera-spark-examples.yml -n ${SPARK_NAME_SPACE}
Note
This is a sample file used for deployment. As per your use case, you can create a Spark deployment file and deploy a Docker image.
This deploys a Spark application in an EKS pod with the Privacera plugin and keeps the pod running so that you can use it in interactive mode.
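To use the running pod interactively, you can open a shell in it with kubectl; the namespace below is the sample one used earlier, and the pod name is whatever the deployment created:
kubectl get pods -n privacera-spark-plugin-test
kubectl exec -it <SPARK_POD_NAME> -n privacera-spark-plugin-test -- /bin/bash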
Oracle for Discovery
This topic describes how to connect an Oracle application to the PrivaceraCloud Discovery service.
Prerequisites
Before connecting the Oracle application, make sure you have the following information available:
JDBC URL
JDBC Username
JDBC Password
Connect application
Go to Settings > Applications.
In the Applications screen, select Oracle.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable Data Discovery for Oracle.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.
In the BASIC tab, enter the values in the following fields:
JDBC URL
JDBC Username
JDBC Password
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Add data source
To add resources using this connection as Discovery targets, see Privacera Discovery scan targets.
PostgreSQL
This topic describes how to connect a PostgreSQL application to PrivaceraCloud.
Prerequisites
Create a database in PostgreSQL and get the database name and its URL:
On Amazon RDS, see Creating a PostgreSQL DB instance.
On Google Cloud Platform, see Creating a PostgreSQL DB.
Create a database user granting all privileges to fully access the database, and then get the user credentials to connect to the database.
If you choose to enable audits for PolicySync, ensure the following prerequisites are met:
On AWS, see Configure AWS RDS PostgreSQL instance for access audits. If you are using multiple AWS accounts, see Accessing Cross Account SQS Queue for PostgreSQL Audits.
On GCP, see Accessing PostgreSQL Audits in GCP.
Connect application
Go to Settings > Applications.
In the Applications screen, select PostgreSQL.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable Access Management for PostgreSQL.
In the BASIC tab, enter the values in the given fields and click Save. For property details and descriptions, see the table below.
Note
The remaining properties are advanced and should be modified only in consultation with Privacera.
Basic fields
Table 14. Basic fields
Field name
Type
Default
Required
Description
Postgres JDBC URL
string
Yes
Specifies the JDBC URL for the PostgreSQL connector.
Use the following format for the JDBC string:
jdbc:postgresql://<PG_SERVER_HOST>:<PG_SERVER_PORT>
Postgres jdbc username
string
Yes
Specifies the JDBC username to use.
Postgres jdbc password
string
Yes
Specifies the JDBC password to use.
Postgres default database
string
privacera_db
Yes
Specifies the name of the JDBC database to use.
Default password for new postgres user
string
Yes
Specifies the password to use when PolicySync creates new users.
Postgres resource owner
string
No
Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user.
If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.
If a value is specified, the owner of the resource will be changed to the specified value.
The following resource types are supported:
Database
Schemas
Tables
Views
Databases to set access control policies
string
No
Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of databases might resemble the following:
testdb1,testdb2,sales db*.
If specified, Databases to ignore while setting access control policies takes precedence over this setting.
Enable policy enforcements and user/group/role management
boolean
true
No
Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.
Enable access audits
boolean
false
Yes
Specifies whether Privacera fetches access audit data from the data source.
Audit source for postgres
string
sqs
No
Specifies the source for audit information. The following values are supported:
sqs
gcp_pgaudit
The default value is:
sqs
AWS access key to connect to sqs queue for access audits
string
No
Specifies the Amazon Web Services (AWS) access key that PolicySync uses to create an IAM client role to access the SQS queue to retrieve access audit information.
Specify this only if your deployment machine lacks an IAM role with the necessary permissions.
AWS secret access key to connect to sqs queue for access audits
string
No
Specifies the Amazon Web Services (AWS) secret key that PolicySync uses to create an IAM client role to access the SQS queue to retrieve access audit information.
Specify this only if your deployment machine lacks an IAM role with the necessary permissions.
AWS region of sqs queue
string
POSTGRES_AUDIT_SQS_QUEUE_REGION
No
Specifies the Amazon Web Services (AWS) SQS queue region.
AWS sqs queue name
string
POSTGRES_AUDIT_SQS_QUEUE_NAME
No
Specifies the Amazon Web Services (AWS) SQS queue name that PolicySync uses to retrieve access audit information.
GCP CloudSQL postgres instance id
string
No
Specifies the Google Cloud Platform SQL instance ID for the PostgreSQL server. PolicySync uses this instance ID for retrieving access audit information.
The instance ID must be provided in the following format:
<PROJECT_ID>:<DB_INSTANCE_ID>
Advanced fields
Table 15. Advanced fields
Field name
Type
Default
Required
Description
Schemas to set access control policies
string
No
Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
Use the following format when specifying a schema:
<DATABASE_NAME>.<SCHEMA_NAME>
If specified, Schemas to ignore while setting access control policies takes precedence over this setting.
If you specify a wildcard, such as in the following example, all schemas are managed:
<DATABASE_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all schemas.
If set to none, no schemas are managed.
Tables to set access control policies
string
No
Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
Use the following format when specifying a table:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
If specified, ignore.table.list takes precedence over this setting.
If you specify a wildcard, such as in the following example, all matched tables are managed:
<DATABASE_NAME>.<SCHEMA_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all tables.
If set to none, no tables are managed.
Databases to ignore while setting access control policies
string
No
Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.
For example:
testdb1,testdb2,sales_db*
This setting supersedes any values specified by Databases to set access control policies.
Schemas to ignore while setting access control policies
string
No
Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all schemas are subject to access control.
For example:
testdb1.schema1,testdb2.schema2,sales_db*.sales*
This setting supersedes any values specified by Schemas to set access control policies.
Regex to find special characters in user names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the String to replace with the special characters found in user names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in user names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in group names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the String to replace with the special characters found in group names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in group names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in role names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the String to replace with the special characters found in role names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in role names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.
If not specified, no find and replace operation is performed.
Persist case sensitivity of user names
boolean
false
No
Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.
Persist case sensitivity of group names
boolean
false
No
Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.
Persist case sensitivity of role names
boolean
false
No
Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.
Create users in postgres by policysync
boolean
true
No
Specifies whether PolicySync creates local users for each user in Privacera.
Create user roles in postgres by policysync
boolean
true
No
Specifies whether PolicySync creates local roles for each user in Privacera.
Manage users from portal
boolean
true
No
Specifies whether PolicySync maintains user membership in roles in the PostgreSQL data source.
Manage groups from portal
boolean
true
No
Specifies whether PolicySync creates groups from Privacera in the PostgreSQL data source.
Manage roles from portal
boolean
true
No
Specifies whether PolicySync creates roles from Privacera in the PostgreSQL data source.
Users to set access control policies
string
No
Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
If not specified, PolicySync manages access control for all users.
If specified, Users to be ignored by access control policies takes precedence over this setting.
An example user list might resemble the following: user1,user2,dev_user*.
Groups to set access control policies
string
No
Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of groups might resemble the following: group1,group2,dev_group*.
If specified, Groups be ignored by access control policies takes precedence over this setting.
Roles to set access control policies
string
No
Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of roles might resemble the following: role1,role2,dev_role*.
If specified, Roles be ignored by access control policies takes precedence over this setting.
Users to be ignored by access control policies
string
No
Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.
This setting supersedes any values specified by Users to set access control policies.
Groups be ignored by access control policies
string
No
Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.
This setting supersedes any values specified by Groups to set access control policies.
Roles be ignored by access control policies
string
No
Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.
This setting supersedes any values specified by Roles to set access control policies.
Prefix of postgres roles for portal users
string
priv_user_
No
Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.
Prefix of postgres roles for portal groups
string
priv_group_
No
Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.
Prefix of postgres roles for portal roles
string
priv_role_
No
Specifies the prefix that PolicySync uses when creating roles from Privacera in the PostgreSQL data source.
For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.
Use postgres native public group for public group access policies
boolean
true
No
Specifies whether PolicySync uses the PostgreSQL native public group for access grants whenever a policy refers to a public group. The default value is true.
Set access control policies only on the users from managed groups
boolean
false
No
Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.
Set access control policies only on the users/groups from managed roles
boolean
false
No
Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false.
Enforce postgres native row filter
boolean
false
No
Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.
Enforce masking policies using secure views
boolean
true
No
Specifies whether to use secure view based masking. The default value is true.
Because PolicySync does not support native masking for PostgreSQL, enabling this setting is recommended.
Enforce row filter policies using secure views
boolean
true
No
Specifies whether to use secure view based row filtering. The default value is true.
While PostgreSQL supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.
Create secure view for all tables/views
boolean
true
No
Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.
Default masked value for numeric datatype columns
integer
0
No
Specifies the default masking value for numeric column types.
Default masked value for text/varchar datatype columns
string
<MASKED>
No
Specifies the default masking value for text and string column types.
Secure view name prefix
string
No
Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.
Secure view name postfix
string
_secure
No
Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.
Secure view schema name prefix
string
No
Specifies a prefix string to apply to a secure schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view schema name for a schema named example1 is dev_example1.
Secure view schema name postfix
string
No
Specifies a postfix string to apply to a secure view schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view schema name for a schema named example1 is example1_dev.
Enable dataadmin
boolean
true
No
Enables the data admin feature. With this feature enabled, you can create all policies on native tables/views, and the corresponding grants are made on the secure views of those native tables/views. These secure views have row filter and masking capability. If you need to grant permissions on the native tables/views, select the permissions you want plus data admin in the policy; those permissions are then granted on both the native table/view and its secure view.
Users to exclude when fetching access audits
string
POSTGRES_JDBC_USERNAME
No
Specifies a comma-separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".
Custom fields
Table 16. Custom fields
Canonical name
Type
Default
Description
load.resources
string
load_from_database_columns
Specifies how PolicySync loads resources from PostgreSQL. The following values are allowed:
load_md: Load resources from PostgreSQL with a top-down approach; that is, it first loads the databases and then the schemas, followed by tables and their columns.
load_from_database_columns: Load resources one by one for each resource type; that is, it loads all databases first, then all schemas in all databases, followed by all tables in all schemas and their columns. This mode is recommended because it is faster than load_md.
sync.interval.sec
integer
60
Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.
sync.serviceuser.interval.sec
integer
420
Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.
sync.servicepolicy.interval.sec
integer
540
Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.
audit.interval.sec
integer
30
Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.
ignore.table.list
string
Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
This setting supersedes any values specified by Tables to set access control policies.
user.name.case.conversion
string
lower
Specifies how user name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of user names is set to true.
group.name.case.conversion
string
lower
Specifies how group name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of group names is set to true.
role.name.case.conversion
string
lower
Specifies how role name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of role names is set to true.
policy.name.separator
string
_priv_
Specifies a string to use as part of the name of native row filter and masking policies.
row.filter.policy.name.template
string
{database}{separator}{schema}{separator}{table}
Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table data from the schema schema that resides in the db database, the row filter policy name might resemble the following: db_priv_schema_priv_data_<ROW_FILTER_ITEM_NUMBER>
secure.view.name.remove.suffix.list
string
Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
secure.view.schema.name.remove.suffix.list
string
Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
perform.grant.updates.max.retry.attempts
integer
2
Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.
aws.sqs.queue.endpoint
string
Specifies the SQS endpoint URL on Amazon Web Services (AWS). You must specify this value if you use a private VPC in your AWS account that is not available on the Internet.
aws.sqs.queue.max.poll.messages
integer
100
Specifies the number of messages to retrieve from the SQS queue at one time for audit information.
In the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Accessing PostgreSQL Audits in GCP
Prerequisites
Ensure the following prerequisites are met:
gcloud command-line tool is installed. See gcloud tool overview
Google Cloud SDK is installed. See Installing Cloud SDK
Configuration
In GCP:
Run the following command in Google Cloud's shell (gcloud), providing GCP_PROJECT_ID and INSTANCE_NAME.
gcloud sql instances patch {INSTANCE_NAME} --database-flags=cloudsql.enable_pgaudit=on,pgaudit.log=all --project {GCP_PROJECT_ID}
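For example, with an instance named my-postgres-instance in the project my-gcp-project, the command would be (both names are illustrative):
gcloud sql instances patch my-postgres-instance --database-flags=cloudsql.enable_pgaudit=on,pgaudit.log=all --project my-gcp-project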
Run a SQL command using a compatible psql client to create the pgAudit extension.
CREATE EXTENSION pgaudit;
Create a service account and private key JSON file, which will be used by PolicySync to pull access audits. See Setting up authentication and edit the following fields:
Service account name: Enter any user-defined name. For example, policysync-postgres-gcp-audit-service-account.
Select a role: Select Private Logs Viewer role.
Create new key: Create a service account key and download the JSON file in the custom-vars folder.
In Privacera Manager:
Add the following properties in the vars.policysync.postgres.yml file:
POSTGRES_AUDIT_SOURCE: "gcp_pgaudit"
POSTGRES_GCP_AUDIT_SOURCE_INSTANCE_ID: ""
POSTGRES_OAUTH_PRIVATE_KEY_FILE_NAME: ""
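A filled-in example, assuming the instance and the service-account key file from the previous steps, might look like the following; the project, instance, and file name are illustrative:
POSTGRES_AUDIT_SOURCE: "gcp_pgaudit"
POSTGRES_GCP_AUDIT_SOURCE_INSTANCE_ID: "my-gcp-project:my-postgres-instance"
POSTGRES_OAUTH_PRIVATE_KEY_FILE_NAME: "policysync-postgres-gcp-audit-service-account.json"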
Configure AWS RDS PostgreSQL instance for access audits
You can configure your AWS account to allow Privacera to access your RDS PostgreSQL instance audit logs through Amazon CloudWatch Logs. To enable this functionality, you must make the following changes in your account:
Update the AWS RDS parameter group for the database
Create an AWS SQS queue
Specify an AWS Lambda function
Create an IAM role for an EC2 instance
Update the AWS RDS parameter group for the database
To expose access audit logs, you must update configuration for the data source.
Procedure
To create a role for audits, run the following SQL query as a user with administrative credentials for your data source:
CREATE ROLE rds_pgaudit;
Create a new parameter group for your database and specify the following values:
Parameter group family: Select a database family, either aurora-postgresql or postgres.
Type: Select DB Parameter Group.
Group name: Specify a group name for the parameter group.
Description: Specify a description for the parameter group.
Edit the parameter group that you created in the previous step and set the following values:
pgaudit.log: Specify all, overwriting any existing value.
shared_preload_libraries: Specify pg_stat_statements,pgaudit.
pgaudit.role: Specify rds_pgaudit.
Associate the parameter group that you created with your database. Modify the configuration for the database instance and make the following changes:
DB parameter group: Specify the parameter group you created in this procedure.
PostgreSQL log: Ensure this option is set to enable logging to Amazon CloudWatch Logs.
When prompted, choose the option to immediately apply the changes you made in the previous step.
Restart the database instance.
Verification
To verify that your database instance logs are available, complete the following steps:
From the Amazon RDS console, view the logs for your database instance.
From the CloudWatch console, complete the following steps:
Find the /aws/rds/cluster/* log group that corresponds to your database instance.
Click the log group name to confirm that a log stream exists for the database instance, and then click a log stream name to confirm that log messages are present.
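Optionally, you can also confirm from the database side that the audit parameters took effect. Connecting with psql after the restart, a quick check might look like the following:
SHOW shared_preload_libraries;
SHOW pgaudit.log;
SHOW pgaudit.role;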
Create an AWS SQS queue
To create an SQS queue used by an AWS Lambda function that you will create later, complete the following steps.
From the AWS console, create a new Amazon SQS queue with the default settings. Use the following format when specifying a value for the Name field:
privacera-postgres-<RDS_CLUSTER_NAME>-audits
where:
RDS_CLUSTER_NAME: Specifies the name of your RDS cluster.
After the queue is created, save the URL of the queue for use later.
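If you prefer the AWS CLI, you can retrieve the queue URL at any time; the queue name and region below are illustrative:
aws sqs get-queue-url --queue-name privacera-postgres-cluster1-audits --region us-east-1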
Specify an AWS Lambda function
To create an AWS Lambda function to interact with the SQS queue, complete the following steps. In addition to creating the function, you must create a new IAM policy and associate a new IAM role with the function. You need to know your AWS account ID and AWS region to complete this procedure.
From the IAM console, create a new IAM policy and input the following JSON:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "logs:CreateLogGroup", "Resource": "arn:aws:logs:<REGION>:<ACCOUNT_ID>:*" }, { "Effect": "Allow", "Action": [ "logs:CreateLogStream", "logs:PutLogEvents" ], "Resource": [ "arn:aws:logs:<REGION>:<ACCOUNT_ID>:log-group:/aws/lambda/<LAMBDA_FUNCTION_NAME>:*" ] }, { "Effect": "Allow", "Action": "sqs:SendMessage", "Resource": "arn:aws:sqs:<REGION>:<ACCOUNT_ID>:<SQS_QUEUE_NAME>" } ] }
where:
REGION: Specify your AWS region.
ACCOUNT_ID: Specify your AWS account ID.
LAMBDA_FUNCTION_NAME: Specify the name of the AWS Lambda function, which you will create later. For example: privacera-postgres-cluster1-audits
SQS_QUEUE_NAME: Specify the name of the AWS SQS queue.
Specify a name for the IAM policy, such as privacera-postgres-audits-lambda-execution-policy, and then create the policy.
From the IAM console, create a new IAM role and choose Lambda as the use case.
Search for the IAM policy that you just created, with a name similar to privacera-postgres-audits-lambda-execution-policy, and select it.
Specify a role name for the IAM policy, such as privacera-postgres-audits-lambda-execution-role, and then create the role.
From the AWS Lambda console, create a new function and specify the following fields:
Function name: Specify a name for the function, such as
privacera-postgres-cluster1-audits
.Runtime: Select Node.js 12.x from the list.
Permissions: Select Use an existing role and choose the role created earlier in this procedure, such as
privacera-postgres-audits-lambda-execution-role
.
Add a trigger to the function you created in the previous step and select CloudWatch Logs from the list, and then specify the following values:
Log group: Select the log group path for your Amazon RDS database instance, such as /aws/rds/cluster/database-1/postgresql.
Filter name: Specify auditTrigger.
In the Lambda source code editor, provide the following JavaScript code in the index.js file, which is open by default in the editor:

var zlib = require('zlib');

// CloudWatch logs encoding
var encoding = process.env.ENCODING || 'utf-8'; // default is utf-8
var awsRegion = process.env.REGION || 'us-east-1';
var sqsQueueURL = process.env.SQS_QUEUE_URL;
var ignoreDatabase = process.env.IGNORE_DATABASE;
var ignoreUsers = process.env.IGNORE_USERS;
var ignoreDatabaseArray = ignoreDatabase.split(',');
var ignoreUsersArray = ignoreUsers.split(',');

// Import the AWS SDK
const AWS = require('aws-sdk');
// Configure the region
AWS.config.update({region: awsRegion});

exports.handler = function (event, context, callback) {
    var zippedInput = Buffer.from(event.awslogs.data, 'base64');
    zlib.gunzip(zippedInput, function (e, buffer) {
        if (e) {
            callback(e);
        }
        var awslogsData = JSON.parse(buffer.toString(encoding));
        // Create an SQS service object
        const sqs = new AWS.SQS({apiVersion: '2012-11-05'});
        console.log(awslogsData);
        if (awslogsData.messageType === 'DATA_MESSAGE') {
            // Chunk log events before posting
            awslogsData.logEvents.forEach(function (log) {
                // Remove any trailing \n
                console.log(log.message);
                // Checking if message falls under ignore users/database
                var sendToSQS = true;
                if (sendToSQS) {
                    for (var i = 0; i < ignoreDatabaseArray.length; i++) {
                        if (log.message.toLowerCase().indexOf("@" + ignoreDatabaseArray[i]) !== -1) {
                            sendToSQS = false;
                            break;
                        }
                    }
                }
                if (sendToSQS) {
                    for (var i = 0; i < ignoreUsersArray.length; i++) {
                        if (log.message.toLowerCase().indexOf(ignoreUsersArray[i] + "@") !== -1) {
                            sendToSQS = false;
                            break;
                        }
                    }
                }
                if (sendToSQS) {
                    let sqsOrderData = {
                        MessageBody: JSON.stringify(log),
                        MessageDeduplicationId: log.id,
                        MessageGroupId: "Audits",
                        QueueUrl: sqsQueueURL
                    };
                    // Send the order data to the SQS queue
                    let sendSqsMessage = sqs.sendMessage(sqsOrderData).promise();
                    sendSqsMessage.then((data) => {
                        console.log("Sent to SQS");
                    }).catch((err) => {
                        console.log("Error in Sending to SQS = " + err);
                    });
                }
            });
        }
    });
};
For the Lambda function, edit the environment variables and create the following environment variables:
REGION: Specify your AWS region.
SQS_QUEUE_URL: Specify your AWS SQS queue URL.
IGNORE_DATABASE: Specify privacera_db.
IGNORE_USERS: Specify your database administrative user, such as privacera.
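The same environment variables can be set with the AWS CLI. This sketch assumes the example function name used earlier in this procedure; the region and queue URL are placeholders:

aws lambda update-function-configuration \
  --function-name privacera-postgres-cluster1-audits \
  --environment "Variables={REGION=<REGION>,SQS_QUEUE_URL=<SQS_QUEUE_URL>,IGNORE_DATABASE=privacera_db,IGNORE_USERS=privacera}"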
Create an IAM role for an EC2 instance
To create an IAM role for the AWS EC2 instance where you installed Privacera so that Privacera can read the AWS SQS queue, complete the following steps:
From the IAM console, create a new IAM policy and input the following JSON:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "sqs:DeleteMessage", "sqs:GetQueueUrl", "sqs:ListDeadLetterSourceQueues", "sqs:ReceiveMessage", "sqs:GetQueueAttributes" ], "Resource": "<SQS_QUEUE_ARN>" }, { "Effect": "Allow", "Action": "sqs:ListQueues", "Resource": "*" } ] }
where:
SQS_QUEUE_ARN: Specifies the ARN of the AWS SQS queue that you created earlier.
Specify a name for the IAM policy, such as postgres-audits-sqs-read-policy, and create the policy.
Attach the IAM policy to the AWS EC2 instance where you installed Privacera.
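If you manage IAM from the CLI, the equivalent of these steps looks roughly like the following. The policy file name, account ID, and the name of the role already attached to the Privacera EC2 instance profile are placeholders:

# Create the policy from the JSON above, saved locally as sqs-read-policy.json
aws iam create-policy \
  --policy-name postgres-audits-sqs-read-policy \
  --policy-document file://sqs-read-policy.json

# Attach the policy to the IAM role associated with the Privacera EC2 instance
aws iam attach-role-policy \
  --role-name <PRIVACERA_EC2_INSTANCE_ROLE> \
  --policy-arn arn:aws:iam::<ACCOUNT_ID>:policy/postgres-audits-sqs-read-policy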
Accessing Cross Account SQS Queue for PostgreSQL Audits
Prerequisites
Ensure the following prerequisites are met:
Access to AWS account with EC2 instance where Privacera Manager is configured.
Access to AWS account where SQS Queue is configured.
Configuration
Get the ARN of the account where the EC2 instance is running.
Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
In the navigation pane, choose Instances.
Search for your instance and select it.
In the Security tab, click the link in the IAM Role.
Copy the ARN of the IAM Role.
Get the ARN of the account where the SQS Queue instance is configured.
Open the Amazon SQS console at https://console.aws.amazon.com/sqs/.
From the left navigation pane, choose Queues. From the queue list, select the queue that you created.
In the Details section, copy the ARN of the queue.
Add the policy in the AWS SQS account to grant permissions to the AWS EC2 account.
Open the Amazon SQS console at https://console.aws.amazon.com/sqs/.
In the navigation pane, choose Queues.
Choose a queue and choose Edit.
Scroll to the Access policy section.
Add the access policy statements in the input box.
{"Version":"2012-10-17","Id":"PolicyAllowSQS","Statement":[{"Sid":"StmtAllowSQS","Effect":"Allow","Principal":{"AWS":"${EC2_INSTANCE_ROLE_ARN}"},"Action":["sqs:DeleteMessage","sqs:GetQueueUrl","sqs:ListDeadLetterSourceQueues","sqs:ReceiveMessage","sqs:GetQueueAttributes"],"Resource":"${SQS_QUEUE_ARN}"}]}
When you finish configuring the access policy, choose Save.
After saving, copy the SQS queue URL in the Details section.
Add the SQS queue URL.
Run the following command.
cd ~/privacera/privacera-manager/
vi config/custom-vars/vars.policysync.postgres.yml
Add the URL in the following property.
POSTGRES_AUDIT_SQS_QUEUE_NAME: "${SQS_QUEUE_URL}"
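Optionally, before applying the configuration, you can confirm from the Privacera EC2 instance that the cross-account queue is reachable with the instance role. The queue URL is the value copied from the Details section:

# A successful response confirms the instance role can read the queue in the other account
aws sqs get-queue-attributes \
  --queue-url <SQS_QUEUE_URL> \
  --attribute-names QueueArn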
Power BI
This topic describes how to connect a Power BI application to PrivaceraCloud.
Connect Application
Go to Settings > Applications.
On the Applications screen, select Power BI.
Enter the application Name and Description, and then click SAVE.
Click the toggle button to enable Access Management for Power BI.
In the BASIC tab, enter the values in the required (*) fields and click SAVE.
In the ADVANCED tab, you can add custom properties.
Caution
Advanced properties should be modified in consultation with Privacera.
Click the IMPORT PROPERTIES link to browse and import application properties.
Connector properties
Basic fields
| Field name | Type | Default | Required | Description |
|---|---|---|---|---|
| Power BI authenticated user | | | Yes | Specifies the authentication username. If you do not specify this value, you must specify a secret for Power BI application client secret. |
| Power BI authenticated user's password | | | Yes | Specifies the authentication password. If you do not specify this value, you must specify a secret for Power BI application client secret. |
| Power BI application tenant id | | | Yes | Specifies the tenant ID associated with your Microsoft Azure account. |
| Power BI application client id | | | Yes | Specifies the principal ID for authentication. |
| Power BI application client secret | | | Yes | Specifies a client secret for authentication. If you do not specify this value, you must specify both Power BI authenticated user and Power BI authenticated user's password. |
| Workspaces to set access control policies | | | No | Specifies a comma-separated list of workspace names for which PolicySync manages access control. If unset, access control is managed for all workspaces. You can use wildcards. Names are case-sensitive. If specified, Workspaces to ignore while setting access control policies takes precedence over this setting. |
| Enable policy enforcements and user/group/role management | | | No | Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. |
| Enable access audits | | | Yes | Specifies whether Privacera fetches access audit data from the data source. |
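If you do not yet have a service principal to supply for the tenant ID, client ID, and client secret fields above, one way to create one is with the Azure CLI. This is an illustrative sketch only, not a Privacera-prescribed procedure; the display name is a placeholder, and the principal must still be granted whatever Power BI permissions your organization requires:

# Prints appId (client id), password (client secret), and tenant (tenant id)
az ad sp create-for-rbac --name privacera-powerbi-connector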
Advanced fields
| Field name | Type | Default | Required | Description |
|---|---|---|---|---|
| Workspaces to ignore while setting access control policies | | | No | Specifies a comma-separated list of workspace names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all workspaces are subject to access control. This setting supersedes any values specified by Workspaces to set access control policies. |
| Regex to find special characters in user names | | | No | Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the String to replace with the special characters found in user names setting. If not specified, no find and replace operation is performed. |
| String to replace with the special characters found in user names | | | No | Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting. If not specified, no find and replace operation is performed. |
| Regex to find special characters in group names | | | No | Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the String to replace with the special characters found in group names setting. If not specified, no find and replace operation is performed. |
| String to replace with the special characters found in group names | | | No | Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting. If not specified, no find and replace operation is performed. |
| Persist case sensitivity of user names | | | No | Specifies whether PolicySync converts user names to lowercase when creating local users. |
| Persist case sensitivity of group names | | | No | Specifies whether PolicySync converts group names to lowercase when creating local groups. |
| Users to set access control policies | | | No | Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive. If not specified, PolicySync manages access control for all users. If specified, Users to be ignored by access control policies takes precedence over this setting. |
| Groups to set access control policies | | | No | Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. You can use wildcards. Names are case-sensitive. If specified, Groups be ignored by access control policies takes precedence over this setting. |
| Users to be ignored by access control policies | | | No | Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control. This setting supersedes any values specified by Users to set access control policies. |
| Groups be ignored by access control policies | | | No | Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control. This setting supersedes any values specified by Groups to set access control policies. |
| Set access control policies only on the users from managed groups | | | No | Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false. |
Custom fields
| Canonical name | Type | Default | Description |
|---|---|---|---|
| | | | Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources. |
| | | | Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly. |
| | | | Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on the data source accordingly. |
| | | | Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera. |
| | | | Set this property to true if you only want to manage users who have an email address associated with them in the portal. |
| | | | Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Microsoft Power BI. |
Presto
This topic describes how to connect the Presto application to PrivaceraCloud and how PrivaceraCloud integrates with your Qubole Presto cluster using a plug-in.
Connect application
Go to Settings > Applications.
On the Applications screen, select Presto.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management and Privacera Discovery with the toggle buttons.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.
You only need to enable Privacera Access Management to start controlling access on Presto.
Click the toggle button to enable Privacera Access Management for your application.
You will see this message: Save the setting to start controlling access on Presto.
Click Save.
Click the toggle button to enable Data Discovery for your application.
On the BASIC tab, enter values in the following fields.
JDBC URL
JDBC Username
JDBC Password
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
To add resources using this connection as Privacera Discovery targets, see Discovery scan targets.
Connect Presto on Qubole cluster to PrivaceraCloud
PrivaceraCloud uses a plug-in to integrate with your Qubole Presto cluster.
Connecting your Qubole Presto cluster to PrivaceraCloud consists of the following steps:
Create a service user on PrivaceraCloud for data user access control call-in from Presto to PrivaceraCloud.
Create, or identify and reuse, the unique call-in authentication (access control) and audit URLs that your Qubole Presto cluster uses to reach PrivaceraCloud.
Configure your Qubole Presto cluster to load the necessary Privacera-hosted Apache Ranger plug-in components on boot and to execute the call-in for access control and audit.
Create a new data access service user for interaction with Qubole.
Open Access Manager: Users/Groups/Roles and click + Add.
Create a new service data access user. Assign it to an Admin role. Record the User Name and Password.
These are referred to as ADMIN_ROLE_USER and ADMIN_ROLE_PASSWORD in the following steps and will be substituted in configuration properties.
Obtain the API Key-associated Ranger URLs for callback from the Qubole cluster to Privacera.
Open Settings: API Key.
You can use an existing Active API Key or create a new one. Expiry = Never Expires is recommended. To generate new API key, see API Key.
Click the i icon to see the API Key Info.
Copy and store the values of the Ranger Admin URL and Ranger Audit URL. These will be referenced as RANGER_ADMIN_URL and RANGER_AUDIT_URL in the following steps.
Open or create a new Presto cluster.
Proceed to Advanced Configuration.
In the PRESTO SETTINGS > Override Presto Configuration text box, add the following information. Substitute the values obtained above for ADMIN_ROLE_USER, ADMIN_ROLE_PASSWORD, RANGER_ADMIN_URL, and RANGER_AUDIT_URL.

bootstrap.properties:
mkdir -p /media/ephemeral0/rangerssl/
hadoop credential create sslTrustStore -value changeit -provider localjceks://file/media/ephemeral0/rangerssl/ranger.jceks
chmod a+r /media/ephemeral0/rangerssl/ranger.jceks
wget https://privacera-public1.s3.amazonaws.com/0001-httpcore-4.4.14.jar -P /usr/lib/presto/plugin/ranger

access-control.properties:
access-control.name=ranger-access-control
ranger.username=<ADMIN_ROLE_USER>
ranger.password=<ADMIN_ROLE_USER_PASSWORD>
ranger.hive.security-config-xml=/usr/lib/presto/etc/ranger-hive-security.xml
ranger.hive.audit-config-xml=/usr/lib/presto/etc/ranger-hive-audit.xml

ranger-hive-security.xml:
<configuration>
  <property>
    <name>ranger.plugin.hive.service.name</name>
    <value>privacera_hive</value>
  </property>
  <property>
    <name>ranger.plugin.hive.policy.pollIntervalMs</name>
    <value>5000</value>
  </property>
  <property>
    <name>ranger.service.store.rest.url</name>
    <value><RANGER_ADMIN_URL></value>
  </property>
  <property>
    <name>ranger.plugin.hive.policy.rest.url</name>
    <value><RANGER_ADMIN_URL></value>
  </property>
  <property>
    <name>ranger.service.store.rest.ssl.config.file</name>
    <value>/usr/lib/presto/etc/ranger-ssl.xml</value>
  </property>
  <property>
    <name>ranger.plugin.hive.policy.rest.ssl.config.file</name>
    <value>/usr/lib/presto/etc/ranger-ssl.xml</value>
  </property>
</configuration>

ranger-ssl.xml:
<configuration>
  <property>
    <name>xasecure.policymgr.clientssl.truststore</name>
    <value>/etc/pki/ca-trust/extracted/java/cacerts</value>
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.truststore.password</name>
    <value>crypted</value>
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
    <value>jceks://file/media/ephemeral0/rangerssl/ranger.jceks</value>
  </property>
</configuration>

ranger-hive-audit.xml:
<configuration>
  <property>
    <name>xasecure.audit.is.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.solr.is.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.solr.async.max.queue.size</name>
    <value>1</value>
  </property>
  <property>
    <name>xasecure.audit.solr.async.max.flush.interval.ms</name>
    <value>1000</value>
  </property>
  <property>
    <name>xasecure.audit.solr.solr_url</name>
    <value><RANGER_AUDIT_URL></value>
  </property>
</configuration>
Click Update or Update and Push.
Click Start, or stop and start the cluster if it is already running.
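Once the cluster is running, one way to confirm that the Ranger plug-in is enforcing policies is to run a query as a user who has not yet been granted access. This is a sketch that assumes the Presto CLI is available on the cluster; the table name is a placeholder for an object in your environment:

presto-cli --catalog hive --schema default --execute "SELECT * FROM <TABLE_NAME> LIMIT 10"
# With no matching policy in Access Manager, the query should be denied;
# after you grant access in a policy, the same query should succeed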
Redshift
This topic describes how to connect the Redshift application to PrivaceraCloud.
Connect application
Go to Settings > Applications.
On the Applications screen, select Redshift.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management and Data Discovery with toggle buttons.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.
Click the toggle button to enable Privacera Access Management for your application.
On the BASIC tab, enter the values in the given fields and click Save. For property details and descriptions, see the table below:
Note
The other properties are advanced and should be modified only in consultation with Privacera.
Basic fields
Table 20. Basic fields
Field name
Type
Default
Required
Description
Redshift JDBC URL
string
Yes
Specifies the JDBC URL for the Amazon Redshift connector.
Redshift jdbc username
string
Yes
Specifies the JDBC username to use.
For PolicySync to push policies to Amazon Redshift, this user must have superuser privileges.
Redshift jdbc password
string
Yes
Specifies the JDBC password to use.
Redshift default database
string
Yes
Specifies the name of the JDBC database to use.
PolicySync also uses the connection to this database to load metadata and create principals such as users and groups.
Default password for new redshift user
string
Yes
Specifies the password to use when PolicySync creates new users.
The password must meet the following requirements:
It must be between 8 and 64 characters long.
It must contain at least one uppercase letter, one lowercase letter, and one number.
It can use any ASCII character with ASCII codes 33–126, except ' (single quote), " (double quote), , (comma), / (slash), or @ (at sign).
Redshift resource owner
string
No
Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user.
If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.
If a value is specified, the owner of the resource will be changed to the specified value.
The following resource types are supported:
Database
Schemas
Tables
Views
Databases to set access control policies
string
No
Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of databases might resemble the following:
testdb1,testdb2,sales db*.
If specified, Databases to ignore while setting access control policies takes precedence over this setting.
Enable policy enforcements and user/group/role management
boolean
false
No
Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is
false
.Enable access audits
boolean
false
No
Specifies whether Privacera fetches access audit data from the data source.
Advanced fields
Table 21. Advanced fields
Field name
Type
Default
Required
Description
Schemas to set access control policies
string
No
Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
Use the following format when specifying a schema:
<DATABASE_NAME>.<SCHEMA_NAME>
If specified, Schemas to ignore while setting access control policies takes precedence over this setting.
If you specify a wildcard, such as in the following example, all schemas are managed:
<DATABASE_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all schemas.
If set to none, no schemas are managed.
Tables to set access control policies
string
No
Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
Use the following format when specifying a table:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
If specified, ignore.table.list takes precedence over this setting.
If you specify a wildcard, such as in the following example, all matched tables are managed:
<DATABASE_NAME>.<SCHEMA_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all tables.
If set to none, no tables are managed.
Databases to ignore while setting access control policies
string
No
Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.
For example:
testdb1,testdb2,sales_db*
This setting supersedes any values specified by Databases to set access control policies.
Schemas to ignore while setting access control policies
string
No
Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all schemas are subject to access control.
For example:
testdb1.schema1,testdb2.schema2,sales_db*.sales*
This setting supersedes any values specified by Schemas to set access control policies.
Regex to find special characters in user names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the String to replace with the special characters found in user names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in user names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in group names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the String to replace with the special characters found in group names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in group names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in role names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the String to replace with the special characters found in role names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in role names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.
If not specified, no find and replace operation is performed.
Persist case sensitivity of user names
boolean
false
No
Specifies whether Amazon Redshift supports case sensitivity for users. Because case sensitivity in Amazon Redshift is global, enabling this enables case sensitivity for users, groups, roles, and resources.
Persist case sensitivity of group names
boolean
false
No
Specifies whether Amazon Redshift supports case sensitivity for groups. Because case sensitivity in Amazon Redshift is global, enabling this enables case sensitivity for users, groups, roles, and resources.
Persist case sensitivity of role names
boolean
false
No
Specifies whether Amazon Redshift supports case sensitivity for roles. Because case sensitivity in Amazon Redshift is global, enabling this enables case sensitivity for users, groups, roles, and resources.
Enable Case Sensitive Identifier for Resources
boolean
false
No
Specifies whether Amazon Redshift preserves case for user, group, role, and resource names. By default, Amazon Redshift converts all user, group, role, and resource names to lowercase. If set to true, PolicySync enables case sensitivity on a per-connection basis.
Enable Case Sensitive Identifier for Resources Query
string
SET enable_case_sensitive_identifier=true;
No
Specifies a query for Amazon Redshift that enables case sensitivity per connection. If you enable Enable Case Sensitive Identifier for Resources, then this setting defines the query that PolicySync runs.
Create users in redshift by policysync
boolean
true
No
Specifies whether PolicySync creates local users for each user in Privacera.
Create user roles in redshift by policysync
boolean
true
No
Specifies whether PolicySync creates local roles for each user in Privacera.
Manage users from portal
boolean
true
No
Specifies whether PolicySync maintains user membership in roles in the Amazon Redshift data source.
Manage groups from portal
boolean
true
No
Specifies whether PolicySync creates groups from Privacera in the Amazon Redshift data source.
Manage roles from portal
boolean
true
No
Specifies whether PolicySync creates roles from Privacera in the Amazon Redshift data source.
Users to set access control policies
string
No
Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
If not specified, PolicySync manages access control for all users.
If specified, Users to be ignored by access control policies takes precedence over this setting.
An example user list might resemble the following:
user1,user2,dev_user*.
Groups to set access control policies
string
No
Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of groups might resemble the following: group1,group2,dev_group*.
If specified, Groups be ignored by access control policies takes precedence over this setting.
Roles to set access control policies
string
No
Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of roles might resemble the following: role1,role2,dev_role*.
If specified, Roles be ignored by access control policies takes precedence over this setting.
Users to be ignored by access control policies
string
No
Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.
This setting supersedes any values specified by Users to set access control policies.
Groups be ignored by access control policies
string
No
Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.
This setting supersedes any values specified by Groups to set access control policies.
Roles be ignored by access control policies
string
No
Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.
This setting supersedes any values specified by Roles to set access control policies.
Prefix of redshift roles for portal users
string
priv_user_
No
Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.
Prefix of redshift roles for portal groups
string
priv_group_
No
Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.
Prefix of redshift roles for portal roles
string
priv_role_
No
Specifies the prefix that PolicySync uses when creating roles from Privacera in the Amazon Redshift data source.
For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.
Use redshift native public group for public group access policies
boolean
true
No
Specifies whether PolicySync uses the Amazon Redshift native public group for access grants whenever a policy refers to a public group. The default value is true.
Set access control policies only on the users from managed groups
boolean
false
No
Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false.
Set access control policies only on the users/groups from managed roles
boolean
false
No
Specifies whether to manage only users that are members of the roles specified by Roles to set access control policies. The default value is false.
Enforce masking policies using secure views
boolean
true
No
Specifies whether to use secure view based masking. The default value is true.
Enforce row filter policies using secure views
boolean
true
No
Specifies whether to use secure view based row filtering. The default value is true.
While Amazon Redshift supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.
Create secure view for all tables/views
boolean
true
No
Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.
Default masked value for numeric datatype columns
integer
0
No
Specifies the default masking value for numeric column types.
Default masked value for text/varchar datatype columns
string
<MASKED>
No
Specifies the default masking value for text and string column types.
Secure view name prefix
string
No
Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is
dev_, then the secure view name for a table named example1 is dev_example1.
Secure view name postfix
string
_secure
No
Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is
_dev, then the secure view name for a table named example1 is example1_dev.
Secure view schema name prefix
string
No
Specifies a prefix string to apply to a secure schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is
dev_, then the secure view schema name for a schema named example1 is dev_example1.
Secure view schema name postfix
string
No
Specifies a postfix string to apply to a secure view schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is
_dev, then the secure view name for a schema named example1 is example1_dev.
Enable dataadmin
boolean
true
No
This property is used to enable the data admin feature. With this feature enabled you can create all the policies on native tables/views, and respective grants will be made on the secure views of those native tables/views. These secure views will have row filter and masking capability. In case you need to grant permission on the native tables/views then you can select the permission you want plus data admin in the policy. Then those permissions will be granted on both the native table/view as well as its secure view.
Users to exclude when fetching access audits
string
REDSHIFT_JDBC_USERNAME
No
Specifies a comma separated list of users to exclude when fetching access audits. For example:
"user1,user2,user3"
.Initial delay for access audit
integer
30
No
Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Amazon Redshift.
Custom fields
Table 22. Custom fields
Canonical name
Type
Default
Description
load.resources
string
load_from_database_columns
Specifies how PolicySync loads resources from Amazon Redshift. The following values are allowed:
load_md: Load resources from Amazon Redshift with a top-down approach, that is, it first loads the databases, then the schemas, followed by the tables and their columns.
load_from_database_columns: Load resources one by one for each resource type, that is, it loads all databases first, then all schemas in all databases, followed by all tables in all schemas and their columns. This mode is recommended since it is faster than the load_md mode.
sync.interval.sec
integer
60
Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.
sync.serviceuser.interval.sec
integer
420
Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.
sync.servicepolicy.interval.sec
integer
540
Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.
audit.interval.sec
integer
30
Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.
ignore.table.list
string
Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
This setting supersedes any values specified by Tables to set access control policies.
user.name.case.conversion
string
lower
Specifies how user name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of user names is set to true.
group.name.case.conversion
string
lower
Specifies how group name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of group names is set to true.
role.name.case.conversion
string
lower
Specifies how role name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of role names is set to true.
secure.view.name.remove.suffix.list
string
Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
secure.view.schema.name.remove.suffix.list
string
Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
perform.grant.updates.max.retry.attempts
integer
2
Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the toggle button to enable the Data Discovery for your application.
On the BASIC tab, enter values in the following fields.
JDBC URL
JDBC Username
JDBC Password
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Add Data Source
To add resources using this connection as Discovery targets, see Privacera Discovery scan targets.
Redshift Spectrum
This topic describes how to configure access control for Redshift Spectrum with PolicySync using PrivaceraCloud.
Privacera supports access control for Redshift Spectrum only on the following:
Create Database
Usage Schema
Prerequisites
The following prerequisites must be met to use Redshift Spectrum:
You will require an Amazon Redshift cluster and a SQL client connected to the cluster.
The AWS Region in which the Amazon Redshift cluster and Amazon S3 bucket are located must be the same.
The Redshift application must be connected with PrivaceraCloud.
Getting started
Redshift Spectrum supports the creation of external tables within the Redshift cluster.
Major Security Concern
Redshift does not support access control lists (ACLs) on EXTERNAL TABLES; to gain access to the data (EXTERNAL TABLES), you must provide USAGE schema permission on the EXTERNAL SCHEMA.
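For context, the following sketch shows why access is effectively schema-level: a single USAGE grant on an external schema exposes every external table in it, and there is no per-table grant or revoke. The endpoint, database, schema, and user names are placeholders, and psql is used here only as an example SQL client:

psql "host=<REDSHIFT_ENDPOINT> port=5439 dbname=<DB_NAME> user=<ADMIN_USER>" \
  -c "GRANT USAGE ON SCHEMA <EXTERNAL_SCHEMA> TO <USER>;"
# After this grant, <USER> can query every external table in <EXTERNAL_SCHEMA>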
Limitations
The following are the limitations with Redshift Spectrum:
If the USAGE permission is granted on an EXTERNAL SCHEMA, the user gains access to all of its tables.
Access to any of the external tables cannot be explicitly granted or revoked.
The creation of Redshift managed tables (not EXTERNAL TABLES) is not permitted within an EXTERNAL SCHEMA.
The creation of secure views is not permitted within an EXTERNAL SCHEMA.
Because of the limitations listed above, Privacera does not manage external tables. By default, permissions for external schemas are managed at the schema level.
Support for Row Level Filter and Column Masking on the basis of secure views on an EXTERNAL SCHEMA is possible, but only with the user's consent, because the user also has direct access to the EXTERNAL TABLE. If users query the table's data directly, neither the Row Level Filter nor the Column Masking will be applied.
Note
We do not recommend this solution, but if you agree that users will not query the data directly (via external tables), it can be enabled by adding the REDSHIFT_ENABLE_EXTERNAL_SCHEMA_SUPPORT property (the default behavior is set to false).
Proposed Solution
On an EXTERNAL TABLE, Row Level Filter and Column Masking are supported to a limited extent.
Instead of creating a table, we create a secure view with the _secure postfix added to the schema name (as we cannot create Redshift views inside external schemas).
To GRANT access to the secure view, we must grant USAGE permission on the source schema, because the secure view schema is separate from the EXTERNAL SCHEMA. As a result, permission is granted on the source (actual) table.
Only Select permission on the EXTERNAL TABLE is supported.
DataAdmin permission is ineffective because USAGE permission on the EXTERNAL SCHEMA allows direct access to the EXTERNAL TABLE.
Configuration
Note
Due to limitations, EXTERNAL SCHEMA support for Row Level Filter and Column Masking is not recommended.
Enable external schema
To enable the external schema, perform the following steps:
Note
The Enable external schema toggle button should not be enabled without consent and without first reading this documentation.
Go to Settings > Applications.
Select the Redshift application, which is already linked to PrivaceraCloud.
Click the Account Name or the edit button for the account on which you want to enable Redshift Spectrum.
In the Access Management section, click the toggle button.
In the ADVANCED tab, click the Enable external schema toggle button.
In the Confirmation window, click YES, and then click SAVE.
Property Configuration
The values in the following fields must be left blank:
Secure view name prefix
Secure view name postfix
The value of one of the following fields must be set:
Secure view schema name prefix
Secure view schema name postfix
Kinesis
This topic describes how to connect the Kinesis application to PrivaceraCloud.
Connecting to an AWS-hosted data source requires authentication or a trust relationship with those resources. You will provide this information as one step in the AWS data resource connection. You will also need to specify your AWS Account Region.
Prerequisites in AWS console
The following prerequisites must be met:
Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy in the AWS Console.
Configure a Trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM Role.
Connect application
Go to Settings > Applications.
On the Applications screen, select Kinesis.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management and Data Discovery with toggle buttons.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.
Enable Privacera Access Management
Click the toggle button to enable Privacera Access Management for your application.
On the BASIC tab, enter values in the following fields.
With Use IAM Role disabled:
AWS Access Key: AWS data repository host account Access Key.
AWS Secret Key: AWS data repository host account Secret Key
AWS Region: AWS S3 bucket region.
With Use IAM Role enabled:
AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.
AWS IAM Role External Id: For additional security, an external ID can be attached to your IAM role configured. This assures that your IAM role can be assumed by PrivaceraCloud only when the configured external ID is passed.
Note
The external ID is stored encrypted. It is never displayed in the UI or otherwise made visible.
AWS Region: AWS S3 bucket region.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Note
You can only use one S3 setup per account for Privacera Access Management
Recommended: Install the AWS CLI.
Open Launch Pad and follow the steps to install and configure AWS CLI to your workstation so that it uses the PrivaceraCloud S3 Data Server proxy.
Recommended: Validate connectivity by running AWS CLI for S3 such as:
aws s3 ls
Note
Dataserver also supports logging the requested user's name in AWS CloudWatch Logs. For more information see Add UserInfo in S3 Requests sent via Dataserver.
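As an additional check of Kinesis access through Privacera, you can run read-only Kinesis commands with the same CLI configuration. This is a sketch that assumes the AWS CLI has been configured per the Launch Pad instructions above; the stream name is a placeholder:

aws kinesis list-streams
aws kinesis describe-stream-summary --stream-name <STREAM_NAME>
# Requests denied by an access policy should fail with an authorization error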
Enable Data Discovery
Click the toggle button to enable Data Discovery for your application.
On the BASIC tab, enter values in the following fields.
With Use IAM Role disabled:
AWS Access Key: AWS data repository host account Access Key.
AWS Secret Key: AWS data repository host account Secret Key
AWS Region: AWS S3 bucket region.
With Use IAM Role enabled:
AWS IAM Role: Enter the actual IAM Role using a full AWS ARN.
AWS Region: AWS S3 bucket region.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Go to PrivaceraCloud > Privacera Discovery > Data Source to add resources using this connection as Discovery targets. See Privacera Discovery scan targets for quick start steps.
Snowflake
This topic describes how to connect the Snowflake application to PrivaceraCloud using the AWS and Azure platforms.
Prerequisites
Before connecting the Snowflake application to PrivaceraCloud, you must first manually create the Snowflake warehouse, database, users, and roles required by PolicySync.
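The exact objects depend on your environment, but a minimal sketch of this prerequisite setup, run by a Snowflake administrator (for example with snowsql), might look like the following. All names and the password are placeholders, and the role must additionally be granted the administrative privileges that PolicySync needs; confirm the exact grants with Privacera:

snowsql -a <ACCOUNT> -u <ADMIN_USER> -q "
  CREATE WAREHOUSE IF NOT EXISTS PRIVACERA_POLICYSYNC_WH WITH WAREHOUSE_SIZE = 'XSMALL';
  CREATE DATABASE IF NOT EXISTS PRIVACERA_DB;
  CREATE ROLE IF NOT EXISTS PRIVACERA_POLICYSYNC_ROLE;
  CREATE USER IF NOT EXISTS PRIVACERA_POLICYSYNC_USER PASSWORD = '<PASSWORD>' DEFAULT_ROLE = PRIVACERA_POLICYSYNC_ROLE;
  GRANT ROLE PRIVACERA_POLICYSYNC_ROLE TO USER PRIVACERA_POLICYSYNC_USER;
  GRANT USAGE ON WAREHOUSE PRIVACERA_POLICYSYNC_WH TO ROLE PRIVACERA_POLICYSYNC_ROLE;
"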
Connect application
Go to Settings > Applications.
On the Applications screen, select Snowflake.
Select the platform type (AWS or Azure) on which you want to configure the Snowflake application.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management and Data Discovery with toggle buttons.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.
Enable Privacera Access Management
Click the toggle button to enable Privacera Access Management for your application.
On the BASIC tab, enter the values in the given fields and click Save. For property details and descriptions, see the table below:
Note
The other properties are advanced and should be modified only in consultation with Privacera.
Basic fields
Table 23. Basic fields
Field name
Type
Default
Required
Description
Snowflake JDBC Url
string
Yes
Specifies the JDBC URL for the Snowflake connector.
Snowflake JDBC Username
string
Yes
Specifies the JDBC username to use.
Snowflake JDBC Password
string
Yes
Specifies the JDBC password to use.
Enable Use Key Pair Authentication
boolean
false
Yes
Specifies whether PolicySync uses key-pair authentication.
Set this to true to enable key-pair authentication.
Snowflake JDBC private key
string
No
Specifies the contents of the private key file to use with Snowflake. For example:
-----BEGIN ENCRYPTED PRIVATE KEY----- MIIE6TAbBgkqhkiG9w0BBQMwDgQILYPyCppzOwECAggABIIEyLiGSpeeGSe3xHP1wHLjfCYycUPennlX2bd8yX8xOxGSGfvB+99+PmSlex0FmY9ov1J8H1H9Y3lMWXbL... -----END ENCRYPTED PRIVATE KEY-----
Snowflake JDBC private key password
string
No
Specifies the password for the private key. If the private key does not have a password, do not specify this setting.
Snowflake Warehouse To Use
string
Yes
Specifies the JDBC warehouse that PolicySync establishes a connection to, which is used to run SQL queries.
Snowflake Role To Use
string
Yes
Specifies the role that PolicySync uses when it runs SQL queries.
Snowflake Resource Owner
string
No
Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user.
If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.
If a value is specified, the owner of the resource will be changed to the specified value.
The following resource types are supported:
Database
Schemas
Tables
Views
Warehouses to set access control policies
string
No
Specifies a comma-separated list of warehouse names for which PolicySync manages access control. If unset, access control is managed for all warehouses. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of warehouses might resemble the following:
testdb1warehouse,testdb2warehouse, sales_dbwarehouse*
Databases to set access control policies
string
No
Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.
An example list of databases might resemble the following:
testdb1,testdb2,sales db*.
If specified, Databases to be ignored by access policy takes precedence over this setting.
Default password for new snowflake user
string
Yes
Specifies the password to use when PolicySync creates new users.
Enable policy enforcements and user/group/role management
boolean
true
No
Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.
Database name where masking function for column access control will be created
string
No
Specifies the name of the database where PolicySync creates custom masking functions.
Enable access audits
boolean
true
Yes
Specifies whether Privacera fetches access audit data from the data source.
Enable simple audits
boolean
true
No
Specifies whether to enable simple auditing. When enabled, PolicySync gathers the following audit information from the database:
RequestData (query text)
AccessResult (execute status)
AccessType (query type)
User (username)
ResourcePath (database_name.schema_name)
EventTime (query time)
AclEnforcer (connector name)
If you enabled this setting, do not enable Enable advance audits.
Enable advance audits
boolean
false
No
Specifies whether to enable advanced auditing. When enabled, PolicySync gathers the following audit information from the database:
AccessResult (execute status)
AccessType (query type)
User (username)
ResourcePath (database_name.schema_name.column_names)
EventTime (query time)
AclEnforcer (connector name)
If you enabled this setting, do not enable Enable simple audits.
Advanced fields
Table 24. Advanced fields
Field name
Type
Default
Required
Description
Schemas to set access control policies
string
No
Specifies a comma-separated list of schema names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
Use the following format when specifying a schema:
<DATABASE_NAME>.<SCHEMA_NAME>
If specified, Schemas to be ignored by access policy takes precedence over this setting.
If you specify a wildcard, such as in the following example, all schemas are managed:
<DATABASE_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all schemas.
If set to none, no schemas are managed.
Tables to set access control policies
string
No
Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
Use the following format when specifying a table:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
If specified, ignore.table.list takes precedence over this setting.
If you specify a wildcard, such as in the following example, all matched tables are managed:
<DATABASE_NAME>.<SCHEMA_NAME>.*
The specified value, if any, is interpreted in the following ways:
If unset, access control is managed for all tables.
If set to none, no tables are managed.
Stream to set access control policies
string
No
Specifies a comma-separated list of stream names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of streams might resemble the following:
testdb1.schema1.stream1,testdb2.schema2.stream*
If unset, access control is managed for all streams.
Functions to set access control policies
string
No
Specifies a comma-separated list of function names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of functions might resemble the following:
testdb1.schema1.fn1,testdb2.schema2.fn*
If unset, access control is managed for all functions.
Procedures to set access control policies
string
No
Specifies a comma-separated list of procedure names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of procedures might resemble the following:
testdb1.schema1.procedureA,testdb2.schema2.procedure*
If unset, access control is managed for all procedures.
Sequences to set access control policies
string
No
Specifies a comma-separated list of sequence names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of sequences might resemble the following:
testdb1.schema1.seq1,testdb2.schema2.seq*
If unset, access control is managed for all sequences.
FileFormat to set access control policies
string
No
Specifies a comma-separated list of file format names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of file formats might resemble the following:
testdb1.schema1.fileFmtA,testdb2.schema2.fileFmt*
If unset, access control is managed for all file formats.
Pipes to set access control policies
string
No
Specifies a comma-separated list of pipe names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of pipes might resemble the following:
testdb1.schema1.pipeA,testdb2.schema2.pipe*
If unset, access control is managed for all pipes.
ExternalStage to set access control policies
string
No
Specifies a comma-separated list of external stage names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of external stages might resemble the following:
testdb1.schema1.externalStage1,testdb2.schema2.extStage*
If unset, access control is managed for all external stages.
InternalStage to set access control policies
string
No
Specifies a comma-separated list of internal stages names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
An example list of internal stages might resemble the following:
testdb1.schema1.internalStage1,testdb2.schema2.intStage*
If unset, access control is managed for all internal stages.
Warehouses to be ignored by access policy
string
No
Specifies a comma-separated list of warehouse names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all warehouses are subject to access control.
This setting supersedes any values specified by Warehouses to set access control policies.
Databases to be ignored by access policy
string
DEMO_DB,SNOWFLAKE,UTIL_DB,SNOWFLAKE_SAMPLE_DATA
No
Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.
For example:
testdb1,testdb2,sales_db*
This setting supersedes any values specified by Databases to set access control policies.
Schemas to be ignored by access policy
string
*.INFORMATION_SCHEMA
No
Specifies a comma-separated list of schema names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all schemas are subject to access control.
For example:
testdb1.schema1,testdb2.schema2,sales_db*.sales*
This setting supersedes any values specified by Schemas to set access control policies.
Create user in snowflake by policysync
boolean
true
No
Specifies whether PolicySync creates local users for each user in Privacera.
Create user role in snowflake by policysync
boolean
true
No
Specifies whether PolicySync creates local roles for each user in Privacera.
Enable use of email as login for snowflake
boolean
false
No
Specifies whether PolicySync uses the user email address as the login name when creating a new user in Snowflake.
Prefix of snowflake roles for portal users
string
No
Specifies the prefix that PolicySync uses when creating local users. For example, if you have a user named <USER> defined in Privacera and the role prefix is priv_user_, the local role is named priv_user_<USER>.
Prefix of snowflake roles for portal groups
string
No
Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.
Prefix of snowflake roles for portal roles
string
No
Specifies the prefix that PolicySync uses when creating roles from Privacera in the Snowflake data source.
For example, if you have a role named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.
Manage users from portal
boolean
No
Specifies whether PolicySync maintains user membership in roles in the Snowflake data source.
Manage group from portal
boolean
No
Specifies whether PolicySync creates groups from Privacera in the Snowflake data source.
Manage role from portal
boolean
No
Specifies whether PolicySync creates roles from Privacera in the Snowflake data source.
Users to set access control policy
string
No
Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.
If not specified, PolicySync manages access control for all users.
If specified, Users to be ignored by access control policy takes precedence over this setting.
An example user list might resemble the following:
user1,user2,dev_user*
Groups to set access control policy
string
No
Specifies a comma-separated list of group names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive. If unset, access control is managed for all groups.
An example list of groups might resemble the following:
group1,group2,dev_group*
If specified, Groups to be ignored by access control policy takes precedence over this setting.
Roles to set access control policy
string
No
Specifies a comma-separated list of role names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive. If unset, access control is managed for all roles.
An example list of roles might resemble the following:
role1,role2,dev_role*
If specified, Roles to be ignored by access control policy takes precedence over this setting.
Users to be ignored by access control policy
string
No
Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.
This setting supersedes any values specified by Users to set access control policy.
Groups to be ignored by access control policy
string
No
Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.
This setting supersedes any values specified by Groups to set access control policy.
Roles to be ignored by access control policy
string
No
Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.
This setting supersedes any values specified by Roles to set access control policy.
Regex to find special characters in user names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the String to replace with the special characters found in user names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in user names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in user names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in group names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the String to replace with the special characters found in group names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in group names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in group names setting.
If not specified, no find and replace operation is performed.
Regex to find special characters in role names
string
[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]
No
Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the String to replace with the special characters found in role names setting.
If not specified, no find and replace operation is performed.
String to replace with the special characters found in role names
string
_
No
Specifies a string to replace the characters matched by the regex specified by the Regex to find special characters in role names setting.
If not specified, no find and replace operation is performed.
Persist case sensitivity of user names
boolean
false
No
Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.
Persist case sensitivity of group names
boolean
false
No
Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.
Persist case sensitivity of role names
boolean
false
No
Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.
Set access control policies only on the users from managed groups
boolean
false
No
Specifies whether to manage only the users that are members of groups specified by Groups to set access control policy. The default value is false.
Set access control policies only on the users/groups from managed roles
boolean
false
No
Specifies whether to manage only users that are members of the roles specified by Roles to set access control policy. The default value is false.
Enable Column Access Exception
boolean
true
No
Specifies whether an access denied exception is displayed if a user does not have access to a table column and attempts to access that column.
If enabled, you must set Enforce Snowflake Native Masking to true.
Enforce Snowflake Native Masking
boolean
true
No
Specifies whether PolicySync enables native masking policy creation functionality.
Enforce Snowflake Native row filter
boolean
true
No
Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.
Enforce row filter policies using secure views
boolean
false
No
Specifies whether to use secure view based row filtering. The default value is false.
While Snowflake supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.
Enforce masking policies using secure views
boolean
false
No
Specifies whether to use secure view based masking. The default value is false.
Secure view schema name prefix
string
No
Specifies a prefix string to apply to a secure schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view schema name for a schema named example1 is dev_example1.
Secure view schema name postfix
string
No
Specifies a postfix string to apply to a secure view schema name. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.
If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view schema name for a schema named example1 is example1_dev.
Secure view name prefix
string
No
Specifies a prefix string for secure views. By default, view-based row filter and masking-related secure views have the same name as the table name.
If you want to change the secure view name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.
Secure view name postfix
string
_SECURE
No
Specifies a postfix string for secure views. By default, view-based row filter and masking-related secure views have the same name as the table name.
If you want to change the secure view name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.
Create secure view for all tables/views
boolean
false
No
Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.
Default masked value for numeric datatype columns
integer
0
No
Specifies the default masking value for numeric column types.
Default masked value for text/varchar datatype columns
string
<MASKED>
No
Specifies the default masking value for text and string column types.
Custom fields
Table 25. Custom fields
Canonical name
Type
Default
Description
jdbc.maximum.pool.size
integer
15
Specifies the maximum size for the JDBC connection pool.
jdbc.min.idle.connection
integer
3
Specifies the minimum size of the JDBC connection pool.
jdbc.leak.detection.threshold
string
900000L
Specifies the duration in milliseconds that a connection is not part of the connection pool before PolicySync logs a possible connection leak message. If set to 0, leak detection is disabled.
handle.pipe.ownership
boolean
false
Specifies whether PolicySync changes the ownership of a pipe to the role specified by Snowflake Resource Owner.
ignore.table.list
string
Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:
<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
This setting supersedes any values specified by Tables to set access control policies.
ignore.stream.list
string
Specifies a comma-separated list of stream names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all streams are subject to access control.
This setting supersedes any values specified by Stream to set access control policies.
ignore.function.list
string
Specifies a comma-separated list of function names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all functions are subject to access control.
This setting supersedes any values specified by Functions to set access control policies.
ignore.procedure.list
string
Specifies a comma-separated list of procedure names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all procedures are subject to access control.
This setting supersedes any values specified by Procedures to set access control policies.
ignore.sequence.list
string
Specifies a comma-separated list of sequence names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all sequences are subject to access control.
This setting supersedes any values specified by Sequences to set access control policies.
ignore.file_format.list
string
Specifies a comma-separated list of file format names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all file formats are subject to access control.
This setting supersedes any values specified by FileFormat to set access control policies.
ignore.pipe.list
string
Specifies a comma-separated list of pipe names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all pipes are subject to access control.
This setting supersedes any values specified by Pipes to set access control policies.
ignore.external_stage.list
string
Specifies a comma-separated list of external stage names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all external stages are subject to access control.
This setting supersedes any values specified by ExternalStage to set access control policies.
ignore.internal_stage.list
string
Specifies a comma-separated list of internal stage names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all internal stages are subject to access control.
This setting supersedes any values specified by InternalStage to set access control policies.
user.name.case.conversion
string
lower
Specifies how user name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of user names is set to true.
group.name.case.conversion
string
lower
Specifies how group name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of group names is set to true.
role.name.case.conversion
string
lower
Specifies how role name conversions are performed. The following options are valid:
lower: Convert to lowercase
upper: Convert to uppercase
none: Preserve case
This setting applies only if Persist case sensitivity of role names is set to true.
user.filter.with.email
boolean
false
Set this property to true if you only want to manage users who have an email address associated with them in the portal.
User.role.use.upper.case
boolean
false
Specifies whether PolicySync converts a user role name to uppercase when performing operations.
Group.role.use.upper.case
boolean
false
Specifies whether PolicySync converts a group name to uppercase when performing operations.
Role.role.use.upper.case
boolean
false
Specifies whether PolicySync converts a role name to uppercase when performing operations.
perform.grant.updates.batch
string
Specifies whether PolicySync applies grants and revokes in batches. If enabled, this behavior improves overall performance of applying permission changes.
perform.grant.updates.max.retry.attempts
integer
2
Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.
enable.privileges.batching
boolean
false
Specifies whether PolicySync applies privileges described in Access Manager policies.
masking.policy.db.name
string
Specifies the name of the database where PolicySync creates custom masking policies.
masking.policy.schema.name
string
PUBLIC
Specifies the name of the schema where PolicySync creates all native masking policies. If not specified, the resource schema is used as the masking policy schema.
masking.policy.name.template
string
{database}{separator}{schema}{separator}{table}
Specifies a naming template that PolicySync uses when creating native masking policies. For example, given the following values:
{database}: customer_db
{schema}: customer_schema
{table}: customer_data
{separator}: _priv_
With the default naming template, the following name is used when creating a native masking policy. The {column} field is replaced by the column name:
customer_db_priv_customer_schema_priv_customer_data_{column}
row.filter.policy.db.name
string
Specifies the name of the database where PolicySync creates native row-filter policies. If not specified, the resource database is considered the same as the row-filter policy database.
row.filter.policy.schema.name
string
PUBLIC
Specifies the name of the schema where PolicySync creates all native row-filter policies. If not specified, the resource schema is considered the same as the row-filter policy schema.
row.filter.policy.name.template
string
{database}{separator}{schema}{separator}{table}
Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table data from the schema schema that resides in the db database, the row filter policy name might resemble the following:
db_priv_schema_priv_data_<ROW_FILTER_ITEM_NUMBER>
secure.view.schema.name.remove.suffix.list
string
Specifies a suffix to remove from a schema name. For example, if a schema is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
secure.view.name.remove.suffix.list
string
Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
secure.view.database.name.prefix
string
Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.
For example, if the prefix is priv_, then the secure view name for a database named example1 is priv_example1.
secure.view.database.name.postfix
string
Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.
For example, if the postfix is _sec, then the secure view name for a database named example1 is example1_sec.
secure.view.database.name.remove.suffix.list
string
Specifies a suffix to remove from a database name. For example, if the database is named example_suffix, you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied. You can specify a single suffix or a comma-separated list of suffixes.
policy.name.separator
string
_PRIV_
Specifies a string to use as part of the name of native row filter and masking policies.
row.filter.alias.token
string
obj
Specifies an identifier that PolicySync uses to identify columns from the main table and parse each correctly.
masked.double.value
integer
0
Specifies the default masking value for DOUBLE column types.
masked.date.value
string
Specifies the default masking value for date column types.
peg.functions.db.name
string
Specifies the name of the database where the PEG encryption functions reside.
peg.functions.schema.name
string
public
Specifies the schema name where the PEG encryption functions reside.
load.roles
string
load_md
Specifies the method that PolicySync uses to load roles from Snowflake. The following methods are supported:
load_md: Use metadata queries
load.users
string
load_md
Specifies how PolicySync loads users from Snowflake. The following values are valid:
load
load_db
load.resources
string
load_md_from_account_columns
Specifies how PolicySync loads resources from Snowflake. The following values are allowed:
load_md: Load the resources using metadata queries.
load_md_from_account_columns: Load resources by directly running SHOW QUERIES on the account. This mode is preferred when you want to manage an entire Snowflake account.
load_md_from_database_columns: Load the resources by directly running SHOW QUERIES only on managed databases. This mode is preferred when you want to manage only a few databases.
load.policies
string
Specifies the method that PolicySync uses to load existing grants from Snowflake. The following methods are supported:
load_md: Use metadata queries
load.audits
string
Specifies the method that PolicySync uses to load access audit information.
The following values are valid:
load: Use SQL queries
audit.enable.resource.filter
boolean
Specifies whether PolicySync filters access audit information by managed resources, such as databases, schemas, and so forth.
audit.initial.pull.min
string
30
Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Snowflake.
custom.audit.db.name
string
PRIVACERA_ACCESS_LOGS_DB
Specifies the database that PolicySync retrieves access audits from. This setting applies only if you set Enable advance audits to true.
sync.interval.sec
integer
60
Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.
sync.serviceuser.interval.sec
integer
420
Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.
sync.servicepolicy.interval.sec
integer
60
Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on the data source accordingly.
audit.interval.sec
integer
30
Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.
jdbc.application
string
Specifies the name of a partner application to connect to through JDBC. This setting is for Snowflake partner use only.
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
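For illustration, custom properties are typically entered as key=value pairs matching the canonical names in the table above (this format and the values shown are assumptions for illustration, not recommended settings):
# Illustrative only; adjust values for your environment
jdbc.maximum.pool.size=25
jdbc.leak.detection.threshold=900000L
sync.interval.sec=120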
Object permission mapping
For more information about object permission mapping, see the Snowflake Documentation.
Object | Supported Permissions | Description |
---|---|---|
Global | CreateWarehouse CreateDatabase | Enables creating a new virtual warehouse. Enables creating a new database in the system. |
Warehouse | UseWarehouse Operate Monitor Modify | Enables using a virtual warehouse and, as a result, executing queries on the warehouse. Enables changing the state of a warehouse (stop, start, suspend, resume). Enables viewing current and past queries executed on a warehouse as well as usage statistics on that warehouse. Enables altering any properties of a warehouse, including changing its size |
Database | UseDB CreateSchema | Enables using a database, including returning the database details in the SHOW DATABASES command output. Enables creating a new schema in a database, including cloning a schema. |
Schema | UseSchema CreateTable CreateProcedure CreateFunction CreateStream CreateSequence CreateFileFormat CreateStage CreatePipe CreateExternalTable | Enables using a schema, including returning the schema details in the SHOW SCHEMAS command output. Enables creating a new table in a schema, including cloning a table. Enables creating a new stored procedure in a schema. Enables creating a new UDF or external function in a schema. Enables creating a new stream in a schema, including cloning a stream. Enables creating a new sequence in a schema, including cloning a sequence. Enables creating a new file format in a schema, including cloning a file format. Enables creating a new stage in a schema, including cloning a stage. Enables creating a new pipe in a schema. Enables creating a new external table in a schema. |
Table | Select Insert Update Delete Truncate References | Enables executing a SELECT statement on a table. Enables executing an INSERT command on a table. Enables executing an UPDATE command on a table. Enables executing a DELETE command on a table. Enables executing a TRUNCATE TABLE command on a table. Enables referencing a table as the unique/primary key table for a foreign key constraint. |
View | Select | Enables executing a SELECT statement on a view. |
Procedure | Usage | Enables calling a stored procedure. |
Function | Usage | Enables calling a function. |
Stream | Select | Enables executing a SELECT statement on a stream. |
File_format | Usage | Enables using a file format in a SQL statement. |
Sequence | Usage | Enables using a sequence in a SQL statement. |
Internal_stage | Read Write | Enables performing any operations that require reading from an internal stage (GET, LIST, COPY INTO <table>); Enables performing any operations that require writing to an internal stage (PUT, REMOVE, COPY INTO <location>); |
External_stage | Usage | Enables using an external stage object in a SQL statement; |
Pipe | Operate Monitor | Enables viewing details for the pipe (using DESCRIBE PIPE or SHOW PIPES), pausing or resuming the pipe, and refreshing the pipe. Enables viewing details for the pipe (using DESCRIBE PIPE or SHOW PIPES). |
Enable Data Discovery
Click the toggle button to enable Data Discovery for your application.
On the BASIC tab, enter values in the following fields.
JDBC URL
JDBC Username
JDBC Password
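For illustration, a hypothetical set of values for these fields; the account identifier, warehouse, database, and user shown here are placeholders, not values from your environment:
JDBC URL: jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?warehouse=COMPUTE_WH&db=SALES_DB
JDBC Username: discovery_svc
JDBC Password: <the service user's password>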
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
Add Data Source
To add resources using this connection as Privacera Discovery targets, see Discovery Scan Targets.
Starburst Enterprise with PrivaceraCloud
PrivaceraCloud can provide system-wide access control across all data exposed in Starburst Enterprise.
Both privacera_hive and privacera_starburstenterprise resource policies can be used with Starburst-managed sources as well as third-party sources (such as Databricks) to maintain policy consistency.
Note
A common implementation pattern for data sources not directly supported in Privacera is to use Starburst Enterprise as a point of access policy enforcement. Create a layer of views in Starburst Enterprise on top of the unsupported source, apply Privacera access control policies to those views, and then limit most access to the source outside of Starburst Enterprise.
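A minimal SQL sketch of this pattern, assuming a hypothetical mongodb catalog for the unsupported source and a hive catalog for the secure views; all catalog, schema, table, and column names here are placeholders:
-- Create a view in Starburst Enterprise over the unsupported source
CREATE VIEW hive.secure_views.customers AS
SELECT customer_id, customer_name, region
FROM mongodb.raw.customers;
-- Apply Privacera access, masking, and row-filter policies to hive.secure_views.customers,
-- and restrict direct access to mongodb.raw.customers outside of Starburst Enterprise.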
Starburst Enterprise is often deployed using a pre-built Docker image provided by Starburst. Using a Docker image for testing and single-node deployment can be significantly faster than working with either RPM or tarball deployments. The instructions here describe the container-based deployment, but other environments are similar. The following information explains how to configure Starburst Enterprise with port 8443 for TLS/HTTPS so that username/password authentication can be used.
Note
PrivaceraCloud is a managed service, so there is currently no option for connecting to a secured Starburst Enterprise instance that uses self-signed SSL certificates, because self-signed certificates are not chained to a publicly trusted root certificate authority ("ca-root").
Prerequisites
The following items need to be enabled or shared prior to deploying a Starburst Docker image:
A licensed version of Starburst.
Docker-ce 18+ must be installed.
JDK 11 to generate the Java keystore.
JDBC URL to connect to the Starburst Enterprise instance, including catalog and schema. Unless you specify a catalog name, the JDBC connection is validated only to the host level.
CA-signed SSL certificate for production deployment.
Your PrivaceraCloud API Key.
Configure Privacera plug-in with Starburst Enterprise
Note
The Docker image already includes the Privacera plug-in needed for policy enforcement. Your tasks will be to create or update a number of configuration files in the container.
Summary of steps:
Generate access-control file(s) for Starburst (required) and for Hive catalogs (optional).
Generate a Ranger Audit XML file.
Generate a Ranger SSL XML file.
Generate a PrivaceraCloud JCEKS file.
To enable Privacera for authorization, you need to update the etc/config.properties file with one of the following entries:
# privacera auth for hive and system access control
access-control.config-files=/etc/starburst/access-control-privacera.properties,/etc/starburst/access-control-priv-hive.properties
Or
# privacera auth for only system access control
access-control.config-files=/etc/starburst/access-control-privacera.properties
The example below depends on your individual PrivaceraCloud API Key, which you must insert in three places.
etc/ranger-hive-audit.xml
<?xml version="1.0" encoding="UTF-8"?> <configuration> <property> <name>ranger.plugin.hive.service.name</name> <value>privacera_hive</value> </property> <property> <name>ranger.plugin.hive.policy.pollIntervalMs</name> <value>5000</value> </property> <property> <name>ranger.service.store.rest.url</name> <value> https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY> </value> </property> <property> <name>ranger.plugin.hive.policy.rest.url</name> <value> https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY> </value> </property> <property> <name>ranger.plugin.hive.policy.source.impl</name> <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value> <description> Class to retrieve policies from the source </description> </property> <property> <name>ranger.plugin.hive.policy.rest.ssl.config.file</name> <value>/etc/starburst/ranger-policymgr-ssl.xml</value> <description> Path to the file containing SSL details to contact Ranger Admin </description> </property> <property> <name>ranger.service.store.rest.ssl.config.file</name> <value>/etc/starburst/ranger-policymgr-ssl.xml</value> </property> <property> <name>ranger.plugin.hive.policy.cache.dir</name> <value>/etc/starburst/tmp/ranger</value> <description> Directory where Ranger policies are cached after successful retrieval from the source </description> </property> <property> <name>ranger.plugin.starburst-enterprise-presto.policy.cache.dir</name> <value>/etc/starburst/tmp/ranger</value> <description> Directory where Ranger policies are cached after successful retrieval from the source </description> </property> <property> <name>xasecure.audit.destination.solr</name> <value>true</value> </property> <property> <name>xasecure.audit.destination.solr.batch.filespool.dir</name> <value>/etc/starburst/tmp/solr</value> </property> <property> <name>xasecure.audit.destination.solr.urls</name> <value> https://<YOUR_PRIVACERACLOUD_API_URL>/<API_KEY>/solr/ranger_audits </value> </property> <property> <name>xasecure.audit.is.enabled</name> <value>true</value> </property> <property> <name>xasecure.audit.solr.is.enabled</name> <value>true</value> </property> <property> <name>xasecure.audit.solr.async.max.queue.size</name> <value>1</value> </property> <property> <name>xasecure.audit.solr.async.max.flush.interval.ms</name> <value>1000</value> </property> </configuration>
To install this file into the Docker container, you can add an option to your container creation script:
-v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-hive-audit.xml:$STARBURST_TGT/ranger-hive-audit.xml
The Ranger SSL XML file is needed when using PrivaceraCloud, because PrivaceraCloud uses API keys and the location of the JDK inside the Starburst containers might be different from other installations.
After the change from starburst-presto to starburst-trino releases, the JDK installation was updated by the Starburst engineering team.
Note
The <value> tags that follow should be verified periodically or based on best practices from Starburst engineering or partner teams.
etc/ranger-policymgr-ssl.xml
<?xml version="1.0" encoding="UTF-8"?> <configuration> <property> <name>xasecure.policymgr.clientssl.truststore</name> <value>/etc/pki/java/cacerts</value> </property> <property> <name>xasecure.policymgr.clientssl.truststore.password</name> <value>crypted</value> </property> <property> <name>xasecure.policymgr.clientssl.truststore.credential.file</name> <value>jceks://file/etc/starburst/privaceracloud.jceks</value> </property> </configuration>
To install this file into the Docker container, you can add an option to your container creation script:
-v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-policymgr-ssl.xml:$STARBURST_TGT/ranger-policymgr-ssl.xml
Edit etc/privaceracloud.jceks.
This file is for an encrypted password for reading or accessing the Java CACerts inside the Starburst containers.
If you are generating a JCEKS with the default Java truststore password ("changeit"), you can use an existing Hive Metastore environment or a Hadoop distribution that is running Java 8 or newer.
Example CLI:
hadoop credential create sslTrustStore -value changeit -provider localjceks://file/var/tmp/privaceracloud.jceks
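As an optional sanity check, you can list the provider contents to confirm the credential was written (assuming the same provider path as above):
hadoop credential list -provider localjceks://file/var/tmp/privaceracloud.jceks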
To install this file into the Docker container, you can add an option to your container creation script:
-v $DOCKER_HOME/$STARBURST_VERSION/etc/privaceracloud.jceks:$STARBURST_TGT/privaceracloud.jceks
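Putting the mounts together, a hedged sketch of a container creation command; the image name, container name, and port mapping are placeholders that depend on your Starburst version and licensing, and are not prescribed by Privacera:
docker run -d --name starburst-enterprise -p 8443:8443 \
  -v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-hive-audit.xml:$STARBURST_TGT/ranger-hive-audit.xml \
  -v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-policymgr-ssl.xml:$STARBURST_TGT/ranger-policymgr-ssl.xml \
  -v $DOCKER_HOME/$STARBURST_VERSION/etc/privaceracloud.jceks:$STARBURST_TGT/privaceracloud.jceks \
  <starburst-enterprise-image>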
Connect Starburst Enterprise application
Use the following steps to connect the Starburst Enterprise application to PrivaceraCloud for Privacera Access Management.
Log in to PrivaceraCloud.
Go to Settings > Applications.
On the Applications screen, select Starburst Enterprise.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable the Privacera Access Management for Starburst Enterprise.
You will see this message, Save the setting to start controlling access on Starburst Enterprise.
Click Save.
Starburst Enterprise Presto
Starburst Enterprise Presto
Starburst Enterprise platform (SEP) is a commercial distribution of PrestoSQL. It includes additional security features, more connectors, and a cost-based query optimizer not available in the open source version.
As with open source PrestoSQL, SEP is designed to support an external Apache Ranger. This can be configured in the following independent ways:
System-level: Configure SEP so that resource policies defined in PrivaceraCloud under the privacera_starburstenterprisepresto resource service control access to Starburst resources.
System-Plus-Hive: Configure SEP so that resource policies defined in PrivaceraCloud under both the privacera_starburstenterprisepresto and privacera_hive resource services control access to Starburst resources. This configuration requires two additional configuration files.
Create a SEP service user
Create a service-user identity that will be used to authenticate to your PrivaceraCloud account from SEP.
Go to Access Manager > Users/Groups/Roles, and then create a user. Record the user name. This will be referred to as "${RANGER_API_USERNAME}" in the SEP configuration steps.
Set the Role to Admin and record the password. This will be referred to as "${RANGER_API_PSWD}" in the SEP configuration steps.
Get the account specific API URL
Go to Settings > API Key, and then click GENERATE API KEY.
In the Generate Api Key dialog, set the purpose to REST API Access or similar, and then click the Never Expires check box.
Click the GENERATE API KEY button.
Click the Copy Url button, and then click Close. Paste and store the URL value. This will be referred to as variable "${RANGER_URL}" in the steps that follow.
The API Key page will display the added API key.
The Ranger Admin URL (${RANGER_URL}) will look similar to:
https://api.privaceracloud.com/api/13afxxxxxx6b981fxxxxxx2dc7cdd7xxxxxxa921636xxxxxx2d189d425b5f01
A full Ranger API service URI is:
<RangerAdminURL>/service/<Ranger API Resource Path>
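As an optional connectivity check before configuring SEP, you can call a standard Ranger REST path through the account URL; the path below is the stock Apache Ranger service-listing endpoint and is shown only as an illustration, so behavior may differ through the PrivaceraCloud proxy:
curl -u "${RANGER_API_USERNAME}:${RANGER_API_PSWD}" "${RANGER_URL}/service/public/v2/api/service"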
Connect application
Use the following steps to connect the Starburst Enterprise Presto application to PrivaceraCloud for Privacera Access Management.
Go to Settings > Applications.
On the Applications screen, select Starburst Enterprise Presto.
Enter the application Name and Description, and then click Save.
Click the toggle button to enable the Privacera Access Management for Starburst Enterprise Presto.
You will see this message: Save the setting to start controlling access on Starburst Enterprise Presto.
Click Save.
The starburst-enterprise-presto service will be available in the Access Manager > Resource Policies section.
Configure Starburst Enterprise (SEP) to use your Account PrivaceraCloud Ranger
SSH to the Hadoop cluster.
Use the following sequence of commands to download and extract the Starburst Presto v350 server tarball with wget:
mkdir downloads
cd downloads
wget https://s3.us-east-2.amazonaws.com/software.starburstdata.net/350e/350-e.3/starburst-presto-server-350-e.3.tar.gz -O presto-server.tar.gz
tar zxvf presto-server.tar.gz
mv presto-server-350-e.3 presto-server
cd presto-server
Create a folder etc in which you need to create files and edit them to add the necessary properties.
mkdir etc
cd etc/
Create an SSL truststore to communicate with Apache Ranger. The chmod command is used to change the permissions of the ranger.jceks file.
hadoop credential create sslTrustStore -value changeit -provider localjceks://file/home/hadoop/downloads/presto-server/etc/ranger.jceks
chmod a+r /home/hadoop/downloads/presto-server/etc/ranger.jceks
Create a catalog directory in which you will create a hive.properties file, so that you can use Hive as a catalog for queries.
mkdir catalog
Change the default Java runtime on your cluster. By default it is set to Java 8; change it to Java 11.
sudo update-alternatives --config java   # Select the option with Java 11
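Verify the active runtime:
java -version   # The output should report version 11.x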
Now, in the folder etc, you can start configuring properties.
All the following files must be configured:
File | Standard location | Use |
---|---|---|
hive.properties | etc/catalog | Global Hive properties |
config.properties | etc | Points to plugin configuration files |
access-control-privacera.properties | etc | Values for Privacera access control |
ranger-policymgr-ssl.xml | etc | Values for Ranger Policy Manager |
ranger-hive-audit.xml | etc | Values for Ranger Hive and Audit |
access-control-priv-hive.properties | etc | Values for Hive Policies (used only for "System-Plus-Hive" configuration) |
Create the hive.properties file.
a. Use the following command to create the hive.properties file in the ${PRESTO_CONFIG_PATH}/etc/catalog/ folder:
vi hive.properties
b. Add the following content and save this file:
hive.metastore=glue
hive.security=allow-all
Create an access-control-privacera.properties file.
a. Use the following command to create access-control-privacera.properties in the ${PRESTO_CONFIG_PATH}/etc/ folder:
vi access-control-privacera.properties
b. Add the following content, substituting the values for ${RANGER_URL}, ${RANGER_API_USERNAME}, and ${RANGER_API_PSWD} as they are referenced in the text below. Also substitute values for ${PRESTO_CONFIG_PATH} and ${PRESTO_TEMP_DIRECTORY} that are correct for your environment.
access-control.name=privacera-starburst
ranger.policy-rest-url=https://${RANGER_URL}
ranger.service-name=privacera_starburstenterprisepresto
ranger.presto-plugin-username=${RANGER_API_USERNAME}
ranger.presto-plugin-password=${RANGER_API_PSWD}
ranger.policy-refresh-interval=3s
# Example: ranger.config-resources=/usr/presto-server-341-e/etc/ranger-hive-audit.xml
ranger.config-resources=${PRESTO_CONFIG_PATH}/etc/ranger-hive-audit.xml
# Example: ranger.policy-cache-dir=/tmp/ranger
ranger.policy-cache-dir=${PRESTO_TEMP_DIRECTORY}
ranger.plugin-policy-ssl-config-file=${PRESTO_CONFIG_PATH}/etc/ranger-policymgr-ssl.xml
c. Save this file.
Create a ranger-policymgr-ssl.xml file.
a. Use the following command to create the ranger-policymgr-ssl.xml file in the ${PRESTO_CONFIG_PATH}/etc/ folder:
vi ranger-policymgr-ssl.xml
b. Add the following XML tags:
<?xml version="1.0" encoding="UTF-8"?> <configuration> <property> <name>xasecure.policymgr.clientssl.truststore</name> <value>${JAVA_PATH}/lib/security/cacerts</value> </property> <property> <name>xasecure.policymgr.clientssl.truststore.password</name> <value>crypted</value> </property> <property> <name>xasecure.policymgr.clientssl.truststore.credential.file</name> <value>jceks://file/home/hadoop/downloads/presto-server/etc/ranger.jceks</value> </property> </configuration>
Create a ranger-hive-audit.xml file.
a. Use the following command to create the ranger-hive-audit.xml file in the ${PRESTO_CONFIG_PATH}/etc/ folder:
vi ranger-hive-audit.xml
b. Add the following XML tags and substitute ${RANGER_URL} where used.
<?xml version="1.0" encoding="UTF-8"?> <configuration> <property> <name>ranger.plugin.hive.service.name</name> <value>privacera_hive</value> </property> <property> <name>ranger.plugin.hive.policy.pollIntervalMs</name> <value>5000</value> </property> <property> <name>ranger.service.store.rest.url</name> <value> https://${RANGER_URL} </value> </property> <property> <name>ranger.plugin.hive.policy.rest.url</name> <value> https://${RANGER_URL} </value> </property> <property> <name>ranger.plugin.hive.policy.source.impl</name> <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value> <description> Class to retrieve policies from the source </description> </property> <property> <name>ranger.plugin.hive.policy.rest.ssl.config.file</name> <value>/home/hadoop/downloads/presto-server/etc/ranger-policymgr-ssl.xml</value> <description> Path to the file containing SSL details to contact Ranger Admin </description> </property> <property> <name>ranger.service.store.rest.ssl.config.file</name> <value>/home/hadoop/downloads/presto-server/etc/ranger-policymgr-ssl.xml</value> </property> <property> <name>ranger.plugin.hive.policy.cache.dir</name> <value>/tmp/ranger</value> <description> Directory where Ranger policies are cached after successful retrieval from the source </description> </property> <property> <name>ranger.plugin.starburst-enterprise-presto.policy.cache.dir</name> <value>/tmp/ranger</value> <description> Directory where Ranger policies are cached after successful retrieval from the source </description> </property> <property> <name>xasecure.audit.destination.solr</name> <value>true</value> </property> <property> <name>xasecure.audit.destination.solr.batch.filespool.dir</name> <value>presto temp file location</value> </property> <property> <name>xasecure.audit.destination.solr.urls</name> <value> https://${RANGER_AUDIT_URL} </value> </property> <property> <name>xasecure.audit.is.enabled</name> <value>true</value> </property> <property> <name>xasecure.audit.solr.is.enabled</name> <value>true</value> </property> <property> <name>xasecure.audit.solr.async.max.queue.size</name> <value>1</value> </property> <property> <name>xasecure.audit.solr.async.max.flush.interval.ms</name> <value>1000</value> </property> </configuration>
Create an access-control-priv-hive.properties file.
a. Use the following command to create access-control-priv-hive.properties in the ${PRESTO_CONFIG_PATH}/etc/ folder:
vi access-control-priv-hive.properties
b. Add the following content, substituting the values for ${RANGER_URL}, ${RANGER_API_USERNAME}, and ${RANGER_API_PSWD} as they are referenced in the text below.
access-control.name=privacera
ranger.policy-rest-url=https://${RANGER_URL}
ranger.service-name=privacera_hive
privacera.catalogs=hive
ranger.presto-plugin-username=${RANGER_API_USERNAME}
ranger.presto-plugin-password=${RANGER_API_PSWD}
ranger.policy-refresh-interval=3s
# Example: ranger.config-resources=/usr/presto-server-341-e/etc/ranger-hive-audit.xml
ranger.config-resources=${PRESTO_CONFIG_PATH}/etc/ranger-hive-audit.xml
# Example: ranger.policy-cache-dir=/tmp/ranger
ranger.policy-cache-dir=${PRESTO_TEMP_DIRECTORY}
# Fallback allow-all allows privacera_starburst catalog-level permissions as fallback
privacera.fallback-access-control=allow-all
ranger.plugin-policy-ssl-config-file=${PRESTO_CONFIG_PATH}/etc/ranger-policymgr-ssl.xml
ranger.enable-row-filtering=true
If configuring for System-Level only, do not create this file, because you have already done the "System-Level" configuration in the access-control-privacera.properties file.
Create a config.properties file.
a. Use the following command to create the config.properties file in the ${PRESTO_CONFIG_PATH}/etc/ folder:
vi config.properties
If configuring for System-Level, add the following to this file:
access-control.config-files=etc/access-control-privacera.properties
If configuring for System-Plus-Hive, add the following to this file (note that this is a single line):
access-control.config-files=etc/access-control-privacera.properties,etc/access-control-priv-hive.properties
Restart Starburst.
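For a tarball installation like the one created above, the restart is typically done with the bundled launcher; the path below assumes the presto-server directory used in the earlier steps:
cd /home/hadoop/downloads/presto-server
bin/launcher restart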
Trino
This topic describes how to connect the Trino application, obtain account-specific scripts from your PrivaceraCloud account, and configure the Trino plug-in.
Connect application
Go to Settings > Applications.
On the Applications screen, select Trino.
Enter the application Name and Description, and then click Save.
You can see Privacera Access Management and Data Discovery with toggle buttons.
Note
If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery. For more information, see About Account.
You only need to enable Privacera Access Management to start controlling access on Trino.
Click the toggle button to enable the Privacera Access Management for your application.
You will see this message, Save the setting to start controlling access on Trino.
Click Save.
Click the toggle button to enable Data Discovery for your application.
On the BASIC tab, enter values in the following fields.
JDBC URL - jdbc:trino://<host>:<port>/<catalog> (see the sample values after this list)
The following three databases can be added as catalogs on the Trino server:
MySQL
Oracle
PostgreSQL
JDBC Username
JDBC Password
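For illustration, a hypothetical set of values with PostgreSQL registered as the catalog; the host, port, catalog, and user shown here are placeholders, not values from your environment:
JDBC URL: jdbc:trino://trino.example.internal:8443/postgresql
JDBC Username: discovery_svc
JDBC Password: <the service user's password>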
On the ADVANCED tab, you can add custom properties.
Using the IMPORT PROPERTIES button, you can browse and import application properties.
Click the TEST CONNECTION button to check if the connection is successful, and then click Save.
To add resources using this connection as Discovery targets, see Privacera Discovery scan targets.
Deploy Privacera plug-In in Trino
Obtain the account-unique <privacera-plugin-script-download-url>. This script and other commands run in your Trino command shell to complete the PrivaceraCloud installation.
Steps:
Go to Settings > API Key.
Use an existing Active API Key or generate a new one.
Click the info icon (i). The Api Key Info page appears.
On the Plugins Setup Script, click the COPY URL button. Save this value on your Trino server. It is needed as the
<privacera-plugin-script-download-url>
in the next step.
In the command shell on your Trino server, run the following commands:
export PLUGIN_TYPE="trino"
Configure the Trino home folder, then download the plug-in setup script.
export TRINO_HOME_FOLDER="/opt/privacera/trino-server"
# Save privacera_plugin.sh
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
Change directory to where you saved privacera_plugin.sh, and then run:
chmod +x privacera_plugin.sh
./privacera_plugin.sh
This completes the installation.
Validate Installation
In PrivaceraCloud, open Access Manager > Audit, and click the PLUGIN tab. Look for audit items reporting a Plugin Id for Trino and the status "Policies synced to plugin." This indicates that your Trino resource is connected.
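In addition to the portal check, a quick local check is possible on the Trino server; this sketch assumes the standard Trino server layout and that the setup script places its plug-in under the server's plugin directory, which may differ in your installation:
ls ${TRINO_HOME_FOLDER}/plugin | grep -i privacera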