Skip to content

Databricks SQL

This topic shows how to configure access control in Databricks SQL.

Prerequisites

Ensure the following prerequisite is met:

  • Create an endpoint in Databricks SQL with a user having admin privileges. For more information, refer to Create an endpoint in Databricks SQL.

  • As you configure the endpoint using the link provided above, get the following values:

    • Host URL
    • JDBC URL
    • SQL endpoint token
    • Database List

CLI Configuration

  1. SSH to the instance where Privacera is installed.

  2. Run the following commands.

    cd ~/privacera/privacera-manager/config
    cp sample-vars/vars.policysync.databricks.sql.analytics.yml custom-vars/
    vi config/custom-vars/vars.policysync.databricks.sql.analytics.yml
    
  3. Set the properties for your specific installation. For property details and description, see the Configuration Properties section that follows.

    Note

    Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, see Databricks SQL Connector.

  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    

Configuration Properties

JDBC configuration properties

Name Type Default Required Description
DATABRICKS_SQL_ANALYTICS_JDBC_URL string Yes

Specifies the JDBC URL for the Databricks SQL connector.

Use the following format for the JDBC URL:

jdbc:spark://<WORKSPACE_URL>:443/<DATABASE>;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/endpoints/1234567890

The workspace URL and the database name are derived from your Databricks SQL configuration.

DATABRICKS_SQL_ANALYTICS_JDBC_USERNAME string Yes

Specifies the JDBC username to use.

DATABRICKS_SQL_ANALYTICS_JDBC_PASSWORD string Yes

Specifies the access token of the SQL endpoint to use.

DATABRICKS_SQL_ANALYTICS_JDBC_DB string Yes

Specifies the name of the JDBC database to use.

DATABRICKS_SQL_ANALYTICS_OWNER_ROLE string No

Specifies the role that owns the resources managed by PolicySync.

  • If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.

  • If a value is specified, the owner of the resource will be changed to the specified value.

The following resource types are supported:

  • Database
  • Schemas
  • Tables
  • Views
DATABRICKS_SQL_ANALYTICS_HOST_URL string Yes

Specifies the base URL for the Databricks SQL instance.

Load keys and intervals

Name Type Default Required Description
DATABRICKS_SQL_ANALYTICS_LOAD_RESOURCES_KEY string load_like No

Specifies how PolicySync loads resources from Databricks SQL. The following values are allowed:

  • load_like: Default value for loading resources, to be used in production.

  • load: Load resources from Databricks SQL with a top-down resources approach, that is, it first loads the database and then the schemas followed by tables and its columns. This mode is only for development purposes.

DATABRICKS_SQL_ANALYTICS_RESOURCE_SYNC_INTERVAL integer 60 No

Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

DATABRICKS_SQL_ANALYTICS_PRINCIPAL_SYNC_INTERVAL integer 420 No

Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

DATABRICKS_SQL_ANALYTICS_PERMISSION_SYNC_INTERVAL integer 540 No

Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

DATABRICKS_SQL_ANALYTICS_AUDIT_SYNC_INTERVAL integer 30 No

Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.

Resources management

Name Type Default Required Description
DATABRICKS_SQL_ANALYTICS_MANAGE_DATABASE_LIST string No

Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards.

An example list of databases might resemble the following: testdb1,testdb2,sales db*.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_DATABASE_LIST takes precedence over this setting.

DATABRICKS_SQL_ANALYTICS_MANAGE_TABLE_LIST string No

Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards.

Use the following format when specifying a table:

<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_TABLE_LIST takes precedence over this setting.

If you specify a wildcard, such as in the following example, all matched tables are managed:

<DATABASE_NAME>.<SCHEMA_NAME>.*

The specified value, if any, is interpreted in the following ways:

  • If unset, access control is managed for all tables.
  • If set to none no tables are managed.
DATABRICKS_SQL_ANALYTICS_IGNORE_DATABASE_LIST string No

Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all databases are subject to access control.

For example:

testdb1,testdb2,sales_db*

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_DATABASE_LIST.

DATABRICKS_SQL_ANALYTICS_IGNORE_TABLE_LIST string No

Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Specify tables using the following format:

<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_TABLE_LIST.

Users/Groups/Roles management

Name Type Default Required Description
DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_FROM_REGEX string [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}] No

Specifies a regular expression to apply to a user name and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_TO_STRING string _ No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_FROM_REGEX string [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}] No

Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_TO_STRING string _ No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_FROM_REGEX string [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}] No

Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_TO_STRING string _ No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_FROM_REGEX string [~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}] No

Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_TO_STRING string _ No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_USER_NAME_PERSIST_CASE_SENSITIVITY boolean false No

Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_PERSIST_CASE_SENSITIVITY boolean false No

Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_PERSIST_CASE_SENSITIVITY boolean false No

Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.

DATABRICKS_SQL_ANALYTICS_USER_NAME_CASE_CONVERSION string lower No

Specifies how user name conversions are performed. The following options are valid:

  • lower: Convert to lowercase
  • upper: Convert to uppercase
  • none: Preserve case

This setting applies only if DATABRICKS_SQL_ANALYTICS_USER_NAME_PERSIST_CASE_SENSITIVITY is set to true.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_CASE_CONVERSION string lower No

Specifies how group name conversions are performed. The following options are valid:

  • lower: Convert to lowercase
  • upper: Convert to uppercase
  • none: Preserve case

This setting applies only if DATABRICKS_SQL_ANALYTICS_GROUP_NAME_PERSIST_CASE_SENSITIVITY is set to true.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_CASE_CONVERSION string lower No

Specifies how role name conversions are performed. The following options are valid:

  • lower: Convert to lowercase
  • upper: Convert to uppercase
  • none: Preserve case

This setting applies only if DATABRICKS_SQL_ANALYTICS_ROLE_NAME_PERSIST_CASE_SENSITIVITY is set to true.

DATABRICKS_SQL_ANALYTICS_CREATE_USER boolean true No

Specifies whether PolicySync creates local users for each user in Privacera.

DATABRICKS_SQL_ANALYTICS_MANAGE_USERS boolean true No

Specifies whether PolicySync maintains user membership in roles in the Databricks SQL data source.

DATABRICKS_SQL_ANALYTICS_MANAGE_GROUPS boolean true No

Specifies whether PolicySync creates groups from Privacera in the Databricks SQL data source.

DATABRICKS_SQL_ANALYTICS_MANAGE_ROLES boolean true No

Specifies whether PolicySync creates roles from Privacera in the Databricks SQL data source.

DATABRICKS_SQL_ANALYTICS_MANAGE_USER_LIST string No

Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards.

If not specified, PolicySync manages access control for all users.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_USER_LIST takes precedence over this setting.

An example user list might resemble the following: user1,user2,dev_user*.

DATABRICKS_SQL_ANALYTICS_MANAGE_GROUP_LIST string No

Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards.

An example list of projects might resemble the following: group1,group2,dev_group*.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_GROUP_LIST takes precedence over this setting.

DATABRICKS_SQL_ANALYTICS_MANAGE_ROLE_LIST string No

Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards.

An example list of projects might resemble the following: role1,role2,dev_role*.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_ROLE_LIST takes precedence over this setting.

DATABRICKS_SQL_ANALYTICS_IGNORE_USER_LIST string No

Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all users are subject to access control.

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_USER_LIST.

DATABRICKS_SQL_ANALYTICS_IGNORE_GROUP_LIST string No

Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all groups are subject to access control.

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_GROUP_LIST.

DATABRICKS_SQL_ANALYTICS_IGNORE_ROLE_LIST string No

Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all roles are subject to access control.

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_ROLE_LIST.

DATABRICKS_SQL_ANALYTICS_GROUP_ROLE_PREFIX string priv_group_ No

Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.

DATABRICKS_SQL_ANALYTICS_ROLE_ROLE_PREFIX string priv_role_ No

Specifies the prefix that PolicySync uses when creating roles from Privacera in the Databricks SQL data source.

For example, if you have a role in Privacera named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.

DATABRICKS_SQL_ANALYTICS_USE_NATIVE_PUBLIC_GROUP boolean true No

Specifies whether PolicySync uses the Databricks SQL native public group for access grants whenever a policy refers to a public group. The default value is true.

DATABRICKS_SQL_ANALYTICS_MANAGE_USER_FILTERBY_GROUP boolean false No

Specifies whether to manage only the users that are members of groups specified by DATABRICKS_SQL_ANALYTICS_MANAGE_GROUP_LIST. The default value is false.

DATABRICKS_SQL_ANALYTICS_MANAGE_USER_FILTERBY_ROLE boolean false No

Specifies whether to manage only users that are members of the roles specified by DATABRICKS_SQL_ANALYTICS_MANAGE_ROLE_LIST. The default value is false.

DATABRICKS_SQL_ANALYTICS_USER_USE_EMAIL_AS_SERVICE_NAME boolean true No

This Property is used to map the username to the email address when granting/revoking access.

Access control management

Name Type Default Required Description
DATABRICKS_SQL_ANALYTICS_ENABLE_VIEW_BASED_MASKING boolean true No

Specifies whether to use secure view based masking. The default value is true.

DATABRICKS_SQL_ANALYTICS_ENABLE_VIEW_BASED_ROW_FILTER boolean true No

Specifies whether to use secure view based row filtering. The default value is true.

While Databricks SQL supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_CREATE_FOR_ALL boolean true No

Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

DATABRICKS_SQL_ANALYTICS_MASKED_NUMBER_VALUE integer 0 No

Specifies the default masking value for numeric column types.

DATABRICKS_SQL_ANALYTICS_MASKED_TEXT_VALUE string <MASKED> No

Specifies the default masking value for text and string column types.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_NAME_PREFIX string No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_NAME_POSTFIX string No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_DATABASE_NAME_PREFIX string No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.

For example, if the prefix is priv_, then the secure view name for a database named example1 is priv_example1.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_DATABASE_NAME_POSTFIX string _secure No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.

For example, if the postfix is _sec, then the secure view name for a database named example1 is example1_sec.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_NAME_REMOVE_SUFFIX_LIST string No

Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_DATABASE_NAME_REMOVE_SUFFIX_LIST string No

Specifies a suffix to remove from a database name. For example, if the database is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes.

DATABRICKS_SQL_ANALYTICS_GRANT_UPDATES boolean true Yes

Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.

DATABRICKS_SQL_ANALYTICS_ENABLE_DATA_ADMIN boolean true No

This property is used to enable the data admin feature. With this feature enabled you can create all the policies on native tables/views, and respective grants will be made on the secure views of those native tables/views. These secure views will have row filter and masking capability. In case you need to grant permission on the native tables/views then you can select the permission you want plus data admin in the policy. Then those permissions will be granted on both the native table/view as well as its secure view.

DATABRICKS_SQL_ANALYTICS_USE_HIVE_ACCESS_POLICIES boolean false No

Uncomment and set this property to true to use the privacera_hive service instead of privacera_databricks_sql_analytics. After setting the Privacera Manager property, PolicySync starts syncing the policies from the privacera_hive service. Be aware that deny and exclude polices from privacera_hive will not work, as only allow policies from Hive are supported this way.

Access audits management

Name Type Default Required Description
DATABRICKS_SQL_ANALYTICS_AUDIT_ENABLE boolean true Yes

Specifies whether Privacera fetches access audit data from the data source.

DATABRICKS_SQL_ANALYTICS_AUDIT_INITIAL_PULL_MINUTES integer 30 No

Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Databricks SQL.

DATABRICKS_SQL_ANALYTICS_AUDIT_EXCLUDED_USERS string {{DATABRICKS_SQL_ANALYTICS_JDBC_USERNAME}} No

Specifies a comma separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".