- PrivaceraCloud Release 7.4
- Enhancements and updates in PrivaceraCloud release 7.4
- Known Issues in PrivaceraCloud 7.4
- PrivaceraCloud User Guide
- Overview of PrivaceraCloud
- Connect applications with the setup wizard
- Connect applications
- About applications
- Connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud
- Connect Amazon Textract to PrivaceraCloud
- Athena
- Privacera Discovery with Cassandra
- Connect Databricks to PrivaceraCloud
- Databricks SQL
- Databricks SQL Overview and Configuration
- Planning and general process
- Prerequisites
- Databricks SQL with Privacera Hive
- Connect Databricks SQL application
- Grant Databricks SQL permissions to PrivaceraCloud users
- Define a resource policy
- Test the policy
- Databricks SQL PolicySync fields
- Configuring column-level access control
- View-based masking functions and row-level filtering
- Create an endpoint in Databricks SQL
- Databricks SQL Fields
- Databricks SQL Hive Service Definition
- Databricks SQL Masking Functions
- Databricks SQL Encryption
- Use a custom policy repository with Databricks
- Connect Databricks SQL to Hive policy repository on PrivaceraCloud
- Databricks SQL Overview and Configuration
- Connect Databricks Unity Catalog to PrivaceraCloud
- Connect S3 to PrivaceraCloud
- Prerequisites in AWS console
- Connect S3 application to PrivaceraCloud
- Enable Privacera Access Management for S3
- Enable Data Discovery for S3
- S3 AWS Commands - Ranger Permission Mapping
- S3
- AWS Access with IAM
- Access AWS S3 buckets from multiple AWS accounts
- Add UserInfo in S3 Requests sent via Dataserver
- Control access to S3 buckets with AWS Lambda function on PrivaceraCloud
- Dremio Plugin
- DynamoDB
- Connect Elastic MapReduce from Amazon application to PrivaceraCloud
- Connect EMR application
- EMR Spark access control types
- PrivaceraCloud configuration
- AWS IAM roles using CloudFormation setup
- Create a security configuration
- Create EMR cluster
- How to configure multiple JSON Web Tokens (JWTs) for EMR
- EMR Native Ranger Integration with PrivaceraCloud
- Connect EMRFS S3 to PrivaceraCloud
- Files
- GBQ
- Google Cloud Storage
- Connect Glue to PrivaceraCloud
- Google BigQuery for PolicySync
- Connect Kinesis to PrivaceraCloud
- Connect Lambda to PrivaceraCloud
- Microsoft SQL Server
- MySQL for Discovery
- Open Source Apache Spark
- Oracle for Discovery
- PostgreSQL
- Connect Power BI to PrivaceraCloud
- Presto
- Redshift
- Snowflake
- Starburst Enterprise with PrivaceraCloud
- Starburst Enterprise Presto
- Trino
- Connect users
- Data access Users, Groups, and Roles
- UserSync
- Portal user LDAP/AD
- Datasource
- Okta Setup for SAML-SSO
- Azure AD setup
- SCIM Server User-Provisioning
- User Management
- Identity
- Access Manager
- Access Manager
- Resource Policies
- Tag Policies
- Scheme Policies
- Service Explorer
- Reports
- Audit
- About data access users, groups, and roles resource policies
- Security zones
- Discovery
- Classifications via random sampling
- Privacera Discovery scan targets
- Propagate Privacera Discovery Tags to Ranger
- Enable offline scanning on Azure Data Lake Storage Gen 2 (ADLS)
- Enable Real-time Scanning of S3 Buckets
- Enable Real-time Scanning on Azure Data Lake Storage Gen 2 (ADLS)
- Enable Discovery Realtime Scanning Using IAM Role
- Encryption
- Overview of Privacera Encryption
- Encryption schemes
- Presentation schemes
- Masking schemes
- Create scheme policies
- Privacera-supplied encryption schemes for the Privacera API
- Privacera-supplied encryption schemes for the Bouncy Castle API
- API date input formats
- Deprecated encryption formats, algorithms, and scopes
- Privacera Encryption REST API
- PEG API endpoint
- PEG REST API encryption endpoints
- Prerequisites
- Common PEG REST API fields
- Construct the datalist for the /protect endpoint
- Deconstruct the response from the /unprotect endpoint
- Example data transformation with the /unprotect endpoint and presentation scheme
- Example PEG API endpoints
- Make encryption API calls on behalf of another user
- Privacera Encryption UDF for masking in Databricks on PrivaceraCloud
- Privacera Encryption UDFs for Trino on PrivaceraCloud
- Syntax of Privacera Encryption UDFs for Trino
- Prerequisites for installing Privacera Crypto plug-in for Trino
- Download and install Privacera Crypto jar
- Set variables in Trino etc/crypto.properties
- Restart Trino to register the Privacera encryption and masking UDFs for Trino
- Example queries to verify Privacera-supplied UDFs
- Privacera Encryption UDF for masking in Trino on PrivaceraCloud
- Encryption UDFs for Apache Spark on PrivaceraCloud
- Launch Pad
- Settings
- Dashboard
- Usage statistics
- Operational status of PrivaceraCloud and RSS feed
- How to Get Support
- Coordinated Vulnerability Disclosure (CVD) Program of Privacera
- Shared Security Model
- PrivaceraCloud Previews
- Preview: File Explorer for S3
- Preview: File Explorer for Azure
- Preview: File Explorer for GCS
- Preview: Scan Generic Records with NER Model
- Preview: Scan Electronic Health Records with NER Model
- Preview: OneLogin setup for SAML-SSO
- Preview: Azure Active Directory SCIM Server UserSync
- Preview: OneLogin UserSync
- Preview: PingFederate UserSync
- Quickstart for Databricks Unity Catalog on PrivaceraCloud
- What do I need to do in my Databricks Workspace?
- Where is the sample dataset in my Databricks Workspace?
- What should I do in the PrivaceraCloud web portal?
- Access use-case - How do I give a user access to a table or restrict from running a SQL select query?
- Access use-case - How do I restrict a user from seeing contents of a column in the result of a SQL select query?
- Column masking use-case - How do I restrict a user from seeing contents of a column by masking the values in the result of a SQL select query?
- Access use-case - How do I disallow a user from seeing certain rows of a table?
- PrivaceraCloud documentation changelog
Databricks SQL Overview and Configuration
One purpose of PolicySync for Databricks SQL is to limit users' access to your Databricks data source, either in its entirety or at a finer granularity: Delta external tables, views, entire tables, or only certain columns or rows.
Planning and general process
The general process for connecting with JDBC to a Databricks SQL data source, creating policies, and limiting user access is as follows. Plan to have the necessary information at hand before you begin the specific steps described here.
1. Add the privacera_tag service.
2. Create an endpoint in Databricks SQL for PrivaceraCloud to connect to, with a JDBC username, password, and URL.
3. Add Databricks SQL as a service in PrivaceraCloud.
4. Define a data source for the Databricks SQL endpoint in PrivaceraCloud, using the values from the endpoint you created and other required fields.
5. Define the Databricks SQL service.
6. Determine the users, groups, or roles that need access from PrivaceraCloud to your Databricks SQL.
7. Ensure that all users in PrivaceraCloud who will access Databricks SQL have an email address in their PrivaceraCloud account.
8. Define those users with appropriate permissions in Databricks.
9. Create a resource policy that assigns users, groups, or roles the permissions needed to access the Databricks SQL data source at the appropriate depth.
10. Decide the depth of the data access you will give to users: views, source tables, columns, or rows. See Allowable privileges.
Prerequisites
Make sure the Privacera Tag Service and Databricks SQL Endpoint configuration are updated before you configure Databricks SQL PolicySync.
Enable PrivaceraCloud tag service
In PrivaceraCloud, the administrator must add the privacera_tag service to enable PolicySync with Databricks SQL.
See the steps in Add the privacera_tag Service.
Create endpoint in Databricks SQL
In Databricks SQL, an administrator must create a Databricks SQL endpoint for connecting from PrivaceraCloud. This process is described in Create an Endpoint in Databricks SQL.
Make note of the following values for entering into the fields in PrivaceraCloud as detailed in Connect Application and Databricks SQL PolicySync Fields:
- The email address of the user defined in the endpoint. This is the value of the JDBC username (Service jdbc username) in PrivaceraCloud.
- The Databricks-generated access token. This is the value of the JDBC password (Service jdbc password) for the defined JDBC username in PrivaceraCloud.
- The JDBC URL (Service jdbc url) defined for the endpoint.
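The JDBC URL bundles the workspace host, port, and HTTP path into one string. The following is a minimal sketch of pulling those pieces out, using a hypothetical endpoint URL (the real URL comes from your Databricks workspace):

```python
# Split a Databricks-style JDBC URL into its parts. The URL below is a
# hypothetical example of the common "jdbc:<subprotocol>://host:port/db;key=value;..."
# shape; substitute the URL shown for your own endpoint.
def parse_jdbc_url(url):
    body, _, params = url.partition(";")
    # body looks like jdbc:databricks://host:port/database
    hostport = body.split("://", 1)[1].split("/", 1)[0]
    host, _, port = hostport.partition(":")
    props = dict(p.split("=", 1) for p in params.split(";") if p)
    return {"host": host, "port": int(port), **props}

url = ("jdbc:databricks://adb-123.4.azuredatabricks.net:443/default;"
       "transportMode=http;ssl=1;httpPath=/sql/1.0/endpoints/abc123")
info = parse_jdbc_url(url)
print(info["host"], info["port"], info["httpPath"])
```

Keeping the host, port, and httpPath separated this way can help when cross-checking the URL you paste into PrivaceraCloud against the endpoint's connection details page.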
Databricks SQL with Privacera Hive
To use Databricks SQL with Privacera Hive, see Databricks SQL Hive Service Definition.
Connect Databricks SQL application
With the values for the JDBC username, JDBC password, and JDBC URL that you noted in Create endpoint in Databricks SQL, define the data source connection in PrivaceraCloud to the Databricks SQL endpoint.
Follow these steps to connect the Databricks SQL application to PrivaceraCloud:

1. Go to Settings > Applications.
2. On the Applications screen, select Databricks SQL.
3. Select the platform type (AWS or Azure) on which you want to configure the Databricks application.
4. Enter the application Name and Description, and then click Save.
5. Click the toggle button to enable Access Management or Data Discovery for Databricks SQL.

   Note: If you don't see Data Discovery in your application, enable it in Settings > Account > Discovery.

6. On the BASIC tab, enter values in the fields. For more information on the fields and their values, see Databricks SQL PolicySync Fields.
7. Click Save.
8. On the ADVANCED tab, you can add custom properties. Use the IMPORT PROPERTIES button to browse and import application properties.
Grant Databricks SQL permissions to PrivaceraCloud users
For each PrivaceraCloud user that needs access to Databricks SQL, the administrator needs to define that user with appropriate access permissions in Databricks.
Ensure all PrivaceraCloud users have an email address
All PrivaceraCloud users who will access Databricks SQL must have an email address in their user account on PrivaceraCloud. This email address is required to log in to Databricks SQL.
Grant Databricks SQL access
1. In your Databricks account, navigate to Data science and engineering.
2. Click Workspace at the top right.
3. To open the Admin Console, go to the top right of the Workspace, click the user account icon, and select Admin Console.
4. In the Databricks SQL access column, select the checkbox for the user.
Grant Databricks SQL endpoint access
1. In the Databricks SQL dashboard, navigate to SQL > Endpoints.
2. Click the name of the endpoint for which you want to add user permissions.
3. At the top right, click Permissions.
4. In the SQL Endpoint Permissions dialog, select the intended user from the drop-down list.
5. Give the user the Can Use permission.
6. Click Add.
7. Click Save.
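The same Can Use grant can also be scripted against Databricks' REST Permissions API. The sketch below only builds the JSON request body; the field names are assumptions based on Databricks' public API documentation, so verify them against your workspace's API version before use:

```python
import json

# Hypothetical sketch: express the "Can Use" grant from the steps above as
# the JSON body Databricks' Permissions API expects (field names assumed
# from the public API docs; verify against your workspace's API version).
def can_use_grant(user_email):
    return {
        "access_control_list": [
            {"user_name": user_email, "permission_level": "CAN_USE"}
        ]
    }

payload = can_use_grant("analyst@example.com")
print(json.dumps(payload, indent=2))
```

Building the body separately from the HTTP call makes it easy to review exactly which users and permission levels will be applied before sending the request.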
Define a resource policy
In PrivaceraCloud, define a resource policy to grant access to the Databricks SQL data source to users, groups, or roles.
Follow the steps in Resource Policies and the details about allowed privileges described here.
Allowable privileges
The following privileges can be specified for a Databricks SQL resource policy:
- SELECT: Allows read access to an object.
- CREATE: Provides the ability to create an object (for example, a table in a database).
- MODIFY: Provides the ability to add, delete, and modify data to or from an object.
- USAGE: An additional requirement to perform any action on a database object.
- READ_METADATA: Provides the ability to view an object and its metadata.
- CREATE_NAMED_FUNCTION: Provides the ability to create a named UDF in an existing catalog or database.
- ALL PRIVILEGES: Gives all privileges, equivalent to all of the above privileges.
Data_Admin privilege for secure views: With the Data_Admin privilege, access policies are applied to source tables. If you want to restrict the access policies to views only, and not to the source tables, enable the following property in the PolicySync configuration, as detailed in Connect Application and Databricks SQL PolicySync Fields:

Secure view Access by Table policies: true
Test the policy
Assign privileges to users, groups, or roles by following the steps in Resource Policies, and then test the policy by running queries as a non-administrator user.
Databricks SQL PolicySync fields
For a description of all fields that must or can be set for resource policy, see Databricks SQL PolicySync Fields.
Configuring column-level access control
To enable column-level access control, set the following field when you define the PolicySync fields:

Column Level Access Control: true

In custom fields, add the following, where # REDACTED # is any string of your choice:

ranger.policysync.connector.4.access.control.number.value=0
ranger.policysync.connector.4.access.control.double.value=0
ranger.policysync.connector.4.access.control.text.value='# REDACTED #'
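Conceptually, the three properties above supply per-type stand-in values for columns a user is not allowed to see. The sketch below is purely illustrative of that substitution, not how PolicySync implements it; the row data and column names are hypothetical:

```python
# Illustrative only: disallowed columns are replaced with the configured
# per-type values (number, double, text) rather than being returned as-is.
REDACTIONS = {"number": 0, "double": 0.0, "text": "# REDACTED #"}

def redact_row(row, column_types, allowed_columns):
    """Replace the values of columns the user cannot access."""
    return {
        col: (val if col in allowed_columns else REDACTIONS[column_types[col]])
        for col, val in row.items()
    }

row = {"id": 7, "salary": 95000.0, "ssn": "123-45-6789"}
types = {"id": "number", "salary": "double", "ssn": "text"}
masked = redact_row(row, types, allowed_columns={"id", "salary"})
print(masked)
```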
View-based masking functions and row-level filtering
For supported masking functions and supported row-level filtering, see Databricks SQL Masking Functions.