Privacera Documentation

Databricks Unity Catalog Tutorial

Follow this short tutorial to try out some Access Control features.

1: Table Level Access Control

The access policies for Table level access control are enabled by default. You can review the policy by going to PrivaceraCloud and checking the following policy:

Sales Data All Access - <your catalog>-privacera_sales_schema_<date_timestamp>

In the policy, you will see that the Catalog is set to your catalog, the schema is set to privacera_sales_schema_<date_timestamp>, the table is set to sales_data, and the column is set to * (asterisk). The wildcard means that the policy applies to all the columns of that table.

In the Allow Conditions section of the policy, you will see your email address in the Select User section.

The Enabled toggle beside the Policy Name should be turned on.

Now run the following query in your Databricks workspace (note that you have to run the query on the secure view):

select * from <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table;

In the Results, you will see all the rows in the table and all the columns such as id, country, region, city, name and sales_amount.

Now, go back to the same policy in PrivaceraCloud, turn off the Enabled toggle beside the Policy Name, and save the policy.

Re-run the same SQL query (against the secure view) in the Databricks workspace. You should get the following error:

User does not have USE SCHEMA on Schema `<catalog>privacera_sales_schema_<date_timestamp>_secure`

By modifying a policy in the PrivaceraCloud portal, we are able to restrict users from accessing a table. You no longer need to know the intricacies of SQL Grant/Revoke statements. Notice that the PrivaceraCloud policy uses a wildcard (*) to grant permissions on all the columns. Similarly, you can use a wildcard in the table name. For example, you can specify that all tables whose names start with the word sales are part of this policy. This allows you to create a policy even before a table is created in Databricks Unity Catalog. As soon as such a table is created, PrivaceraCloud will detect it and apply the policy to it. Since all these policies are applied on Unity Catalog, the same policies apply to all the Databricks workspaces.
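For comparison, the following is a rough sketch of the kind of statements you would otherwise manage by hand in Unity Catalog. It is not what PrivaceraCloud is confirmed to run internally. The principal `user@example.com` is a hypothetical placeholder for the email address listed in the policy, and the angle-bracket placeholders follow the same convention as the queries in this tutorial.

```sql
-- Hypothetical principal: replace with the user named in the policy.
-- Privileges a user typically needs before the secure view can be queried.
GRANT USE CATALOG ON CATALOG <catalog> TO `user@example.com`;
GRANT USE SCHEMA ON SCHEMA <catalog>.privacera_sales_schema_<date_timestamp>_secure TO `user@example.com`;
GRANT SELECT ON VIEW <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table TO `user@example.com`;

-- Inspect the privileges currently in place on the secure schema.
-- Viewing grants may itself require ownership or admin rights.
SHOW GRANTS ON SCHEMA <catalog>.privacera_sales_schema_<date_timestamp>_secure;
```

Running the SHOW GRANTS statement before and after toggling the policy is one way to see why the USE SCHEMA error appears once the policy is disabled.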

If you have additional test users in your Databricks workspace, or a colleague with an account, you can add them to the policy and check whether they are able to access the sales_data secure view.

In a production deployment, you should use User Groups or Roles instead of assigning policies for an individual user.

2: Dynamic Column Level Access Control

We have already pre-created a policy to provide access to a limited number of columns. This policy is disabled by default. Enable the policy and rerun the query.

Sales Data Specific Columns - <your catalog>-privacera_sales_schema_<date_timestamp>

In the policy, you will see that the Catalog is set to your catalog, the schema is set to privacera_sales_schema_<date_timestamp>, the table is set to sales_data, and Columns has a list of columns including id, city, region, country and sales_amount. The name column is not in the Columns list.

In the Allow Conditions section of the policy, you will see your email address in the Select User section and Select in the Permissions section. This means you have Select permission on all columns except for the name column.

Turn on the Enabled toggle beside the Policy Name and save the policy.

Now run the following query in your Databricks workspace (note that you have to run the query on the secure view):

select * from <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table;

In the Results, you will see all the rows in the table as well as all the columns, except that the name column will show NULL as values.
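To confirm this without scanning the full result set, a small check like the one below can help. It is a sketch that relies only on standard SQL behavior: count(*) counts every row, while count(name) counts only rows where name is not NULL. The column aliases are illustrative names.

```sql
-- Compare the total row count with the number of non-NULL name values.
SELECT count(*)    AS total_rows,
       count(name) AS non_null_names
FROM <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table;
```

If the column-level policy is in effect and every name value is returned as NULL, non_null_names would be 0 while total_rows is unchanged.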

Keep the Sales Data All Access policy disabled.

3: Dynamic Column Masking

We have already pre-created a policy that masks out the columns based on the user who is running the query. The policy is disabled by default. You need to enable the policy and run the same query again.

Navigate to the Masking tab in Resource Policies to see the following policy:

Anonymize city - <your catalog>-privacera_sales_schema_<date_timestamp>

In the policy, Catalog is set to your catalog, the schema is set to privacera_sales_schema_<date_timestamp>, the table is set to sales_data and the column is set to city. You will apply the dynamic column masking policy for this column.

Next, we're going to show the hashed (MD5) value of the city column. Navigate to the Resource Policies > Masking tab.

Turn on the Enabled toggle beside the policy name and save the policy. Now run the following query on the secure view in your Databricks workspace:

select * from <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table

In the Results, you will see that the city column values are dynamically masked as MD5 instead of showing the actual value.
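As a quick sanity check, the sketch below shows what an MD5 digest looks like and then inspects the masked column. It assumes the masked values are hex-encoded MD5 digests, which are 32 characters long. The sample string 'Seattle' is a hypothetical value and is not taken from the tutorial data.

```sql
-- md5() returns the digest as a 32-character hexadecimal string.
SELECT md5('Seattle') AS sample_hash;  -- 'Seattle' is a made-up example value

-- Look at a few masked values and their lengths in the secure view.
SELECT city, length(city) AS city_length
FROM <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table
LIMIT 5;
```

Under that assumption, every city_length would be 32 once the masking policy is active, regardless of the length of the original city name.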

Note

Keep one of the Resource Policies (Sales Data All Access or Sales Data Specific Columns) enabled for this table because the masking policy is applied only if you have access to the table.

4: Dynamic Row Level Access Control

We created a policy which filters out rows the user is not allowed to see. The policy is disabled by default. Next, enable the policy and run the same query again.

Navigate to the Row Level Filter tab in Resource policies to see the following policy:

Sales by Country - <your catalog>-privacera_sales_schema_<date_timestamp>

In the policy, you can see that the Catalog is set to your catalog, the schema is set to privacera_sales_schema_<date_timestamp>, and the table is set to sales_data.

We want to see only those rows in the table for which the country column has a value of US. This is set in Row Level Conditions, where you will see your email address in the Select User section, Select under Permissions, and the Row Level Filter set to country = 'US'.

Turn on the Enabled toggle beside the policy name and save the policy. Then run the following query on the secure view in your Databricks workspace:

select * from <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table

In the Results, you will see only those rows for which the country column has the value US.
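A grouped count is a compact way to verify the filter. The query below is a sketch that uses only the secure view and the country column from this tutorial. The alias row_count is an illustrative name.

```sql
-- List each country visible to the current user with its row count.
SELECT country, count(*) AS row_count
FROM <catalog>.privacera_sales_schema_<date_timestamp>_secure.sales_table
GROUP BY country;
```

With the row-level filter country = 'US' in effect, the result should contain a single group for US. Seeing other countries suggests that the policy is not enabled or has not yet been applied.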

Note

Keep one of the Resource Policies (Sales Data All Access or Sales Data Specific Columns) enabled for this table because the Row Level Filter policy is applied only if you have access to the table.