Privacera Platform master publication

Google BigQuery
:

Google BigQuery provides fine-grained access control on BigQuery Datasets. This includes

  • Table-level Access Control

  • Column-level Access Control

  • Native/Dynamic secure view-based Row Filter

  • Masking With Dynamic secure views created using PolicySync

Privacera access control for BigQuery relies on the privacera_bigquery connector service.

Prerequisites
Create PrivaceraPolicySyncRole IAM Role

You need to give Privacera PolicySync basic access to GCP. To grant that access, create PrivaceraPolicySyncRole IAM role in your GCP project or GCP organization using the following commands on Google Cloud's shell (gcloud). The shell can be installed and accessed locally or through Google Console.

Run the following command to create the file containing the permissions required for the PrivaceraPolicySyncRole role:

ROLE_NAME="PrivaceraPolicySyncRole"

cat << EOF > ${ROLE_NAME}.yaml
title: "${ROLE_NAME}"
description: "${ROLE_NAME}"
stage: "ALPHA"
includedPermissions:
- resourcemanager.projects.get
- resourcemanager.projects.getIamPolicy
- resourcemanager.projects.setIamPolicy
- iam.roles.list
- iam.roles.get
- iam.roles.create
- iam.roles.update
- bigquery.jobs.create
- bigquery.datasets.get
- bigquery.datasets.create
- bigquery.datasets.update
- bigquery.datasets.delete
- bigquery.datasets.getIamPolicy
- bigquery.datasets.setIamPolicy
- bigquery.tables.list
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.create
- bigquery.tables.update
- bigquery.tables.delete
- bigquery.tables.getIamPolicy
- bigquery.tables.setIamPolicy
- bigquery.rowAccessPolicies.list
- bigquery.rowAccessPolicies.create
- bigquery.rowAccessPolicies.update
- bigquery.rowAccessPolicies.delete
- bigquery.rowAccessPolicies.getIamPolicy
- bigquery.rowAccessPolicies.setIamPolicy

EOF
GCP Project-level access

Note

If you have multiple projects in your GCP organization and want them to be managed by a single BigQuery connector, then repeat the steps below for each project. Assign the role to the same service account which will be used across multiple projects.

  1. Run the following command. Replace <GCP_PROJECT_ID> with your GCP project ID.

    PROJECT_ID="<GCP_PROJECT_ID>"
              
  2. To create PrivaceraPolicySyncRole role in your GCP project, run the following command.

    gcloud iam roles create ${ROLE_NAME} --project=${PROJECT_ID} --file=${ROLE_NAME}.yaml         
GCP Organization-level access
  1. Run the following command. Replace <GCP_ORGANIZATION_ID> with your GCP organization ID.

    ORGANIZATION_ID="<GCP_ORGANIZATION_ID>"                                                                     
  2. To create PrivaceraPolicySyncRole role in your GCP organization, run the following command.

    gcloud iam roles create ${ROLE_NAME} --organization=${ORGANIZATION_ID} --file=${ROLE_NAME}.yaml
Attach IAM Role to Service Account

To attach the PrivaceraPolicySyncRole IAM role created above, do the following steps:

  1. Log in to your GCP console.

  2. Select IAM & admin > Service accounts and click + CREATE SERVICE ACCOUNT.

  3. Enter values in the fields and click CREATE.

  4. In Grant this service account access to project, select the role as PrivaceraPolicySyncRole.

  5. On the Services Account Page, find the newly created service account and copy the email address of the new service account for use in a later step.

    Note

    This email will be the Service Account Email for configuring PolicySync in Privacera Manager.

  6. If you are using a Google VM machine to configure GBQ for PolicySync, then you can attach the service account created above to your VM machine and skip below steps.

  7. On the Services Account Page, go to the Keys tab and click Add Key and select Create New Key.

  8. Select the JSON key type, and click CREATE. A JSON key file downloads to your system. Store the file at an accessible location. It will be used for configuring PolicySync in Privacera Manager.

Refer the Google documentation for a detailed information on creating a service account.

Configure Logs for Auditing

A sink is required to collect all the logs from GBQ. To create a sink, do the following steps:

  1. In the search bar, search for Logging, and then click Logs Router, and click Create Sink.

  2. Enter the sink name as PolicySyncBigQueryAuditSink, and then click Next.

  3. Enter the sink destination.

    1. In the Select sink service, select BigQuery.

    2. In SelectBigQuerydataset, click Create newBigQuerydataset.

    3. Enter the Dataset ID as bigquery_audits and click Create Dataset.

    4. Click Next.

  4. Add theBigQuerylogs in the sink:

    In the Build an inclusion filter, add the following line:

    resource.type="bigquery_resource"
  5. Click Create Sink.

Refer the Google documentation for a detailed information on creating a sink.

CLI Configuration
  1. SSH to the instance where Privacera is installed.

  2. Do the following, if you’re not using VM attached credentials:

    1. Access the JSON file of the service account you downloaded using the steps above.

    2. Copy the JSON to the config/custom-vars folder.

  3. Run the following commands.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.policysync.bigquery.yml config/custom-vars/
    vi config/custom-vars/vars.policysync.bigquery.yml
  4. Set the properties for your specific installation. For property details and descriptions, see the Configuration Properties section that follows.

    Notice

    Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, see PolicySync properties.

  5. Run the update.

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
Configuration Properties

JDBC configuration properties

Table 42. JDBC configuration properties

Name

Type

Default

Required

Description

BIGQUERY_PROJECT_LOCATION

string

us

Yes

Specifies the geographical region where the taxonomy for the PolicySync should be created.

BIGQUERY_PROJECT_ID

string

Yes

Specifies the Google project ID where your Google BigQuery data source resides. For example: privacera-demo-project.

BIGQUERY_JDBC_URL

string

jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443

No

Specifies the JDBC URL for the Google BigQuery connector.

BIGQUERY_USE_VM_CREDENTIALS

boolean

false

No

Specifies whether the PolicySync uses the service account attached to your virtual machine for the credentials to connect to the data source.

When set to true you do not need to specify a value for BIGQUERY_OAUTH_PRIVATE_KEY_PATH.

BIGQUERY_OAUTH_SERVICE_ACCOUNT_EMAIL

string

Yes

Specifies the service account email address that PolicySync uses. You must specify this value if you are not using a Google Cloud Platform (GCP) virtual machine attached service account.

BIGQUERY_OAUTH_PRIVATE_KEY_PATH

string

/workdir/policysync/cust_conf/policysync-gbq-service-account.json

Yes

Specifies the path of the service account credentials JSON file that you downloaded from your Google Cloud Platform (GCP) account. You must specify this property if BIGQUERY_USE_VM_CREDENTIALS is set to false.

BIGQUERY_OAUTH_PRIVATE_KEY_FILE_NAME

string

Yes

Specifies the name of the JSON file that contains your Google Cloud Platform service account credentials. If specified, this value is combined with BIGQUERY_OAUTH_PRIVATE_KEY_PATH to provide a full path to the credentials JSON file. If you specify BIGQUERY_OAUTH_PRIVATE_KEY_PATH then you must specify this value as well.



Custom IAM roles

Table 43. Custom IAM roles

Name

Type

Default

Required

Description

BIGQUERY_CREATE_CUSTOM_IAM_ROLES

boolean

true

No

Specifies whether PolicySync automatically creates custom IAM roles in your Google Cloud Platform project or organization for fine-grained access control (FGAC). If set to false, you must create all required custom IAM roles manually in your GCP project or organization. The default value is true.

BIGQUERY_CUSTOM_IAM_ROLES_SCOPE

string

project

No

Specifies whether PolicySync creates and uses custom IAM roles at the project or organizational level in Google Cloud Platform (GCP). The following values are allowed:

  • project: Create and use custom IAM roles from each individual project level.

  • org: Create and use custom IAM roles at the organizational level.

BIGQUERY_ORGANIZATION_ID

string

No

Specifies the Google Cloud Platform (GCP) organizational ID. Specify this only if you configured PolicySync to use custom IAM roles at the organizational level.

BIGQUERY_CUSTOM_IAM_ROLES_NAME_MAPPING

string

No

Specifies a list of mappings between PolicySync custom IAM role names and your custom role names. Use the following format when specifying your custom role names:

<PRIVACERA_DEFAULT_ROLE_NAME_1>:<CUSTOM_ROLE_NAME_1>
<PRIVACERA_DEFAULT_ROLE_NAME_2>:<CUSTOM_ROLE_NAME_2>

The following is a list of the default custom role names:

  • PrivaceraGBQProjectListRole

  • PrivaceraGBQJobListRole

  • PrivaceraGBQJobListAllRole

  • PrivaceraGBQJobCreateRole

  • PrivaceraGBQJobGetRole

  • PrivaceraGBQJobUpdateRole

  • PrivaceraGBQJobDeleteRole

  • PrivaceraGBQDatasetCreateRole

  • PrivaceraGBQDatasetGetMetadataRole

  • PrivaceraGBQDatasetUpdateRole

  • PrivaceraGBQDatasetDeleteRole

  • PrivaceraGBQTableListRole

  • PrivaceraGBQTableCreateRole

  • PrivaceraGBQTableGetMetadataRole

  • PrivaceraGBQTableQueryRole

  • PrivaceraGBQTableExportRole

  • PrivaceraGBQTableUpdateMetadataRole

  • PrivaceraGBQTableUpdateRole

  • PrivaceraGBQTableSetCategoryRole

  • PrivaceraGBQTableDeleteRole

  • PrivaceraGBQTransferUpdateRole

  • PrivaceraGBQTransferGetRole



Load keys and intervals

Table 44. Load keys and intervals

Name

Type

Default

Required

Description

BIGQUERY_LOAD_RESOURCES_KEY

string

load_from_dataset_columns

No

Specifies how PolicySync loads resources from Google BigQuery. The following values are allowed:

  • load_md: Load resources from Google BigQuery with a top-down resources approach, that is, it first loads the project and then the dataset followed by tables and its columns.

  • load_from_dataset_columns: Load resources one by one for each resource type that is, it loads all projects first, then it loads all datasets in all projects, followed by all tables in all datasets and its columns. This mode is recommended since it is faster than the load mode.

BIGQUERY_RESOURCE_SYNC_INTERVAL

integer

60

No

Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

BIGQUERY_PRINCIPAL_SYNC_INTERVAL

integer

420

No

Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

BIGQUERY_PERMISSION_SYNC_INTERVAL

integer

540

No

Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

BIGQUERY_AUDIT_SYNC_INTERVAL

integer

30

No

Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.



Resources management

Table 45. Resources management

Name

Type

Default

Required

Description

BIGQUERY_MANAGE_PROJECT_LIST

string

Yes

Specifies a comma-separated list of project names to which access control is managed by PolicySync. If unset, PolicySync manages all projects. If specified, use the following format. You can use wildcards. Names are case-sensitive.

The list of projects to ignore takes precedence over any projects specified by this setting.

An example list of projects might resemble the following: testproject1,testproject2,sales_project*.

BIGQUERY_MANAGE_DATASET_LIST

string

Yes

Specifies a list of comma-separated datasets that PolicySync manages access control to. You can use wildcards in the value. Names are case-sensitive. If you want to manage all datasets, do not set a value. For example:

testproject1.dataset1,testproject2.dataset2,sales_project*.sales*

You can configure the postfix by specifying BIGQUERY_SECURE_VIEW_DATASET_NAME_POSTFIX.

If specified, the BIGQUERY_IGNORE_DATASET_LIST setting takes precedence over this setting.

BIGQUERY_MANAGE_TABLE_LIST

string

No

Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards.

Use the following format when specifying a table:

<PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME>

If specified, BIGQUERY_IGNORE_TABLE_LIST takes precedence over this setting.

If you specify a wildcard, such as in the following example, all matched tables are managed:

<PROJECT_NAME>.<DATASET_NAME>.*

The specified value, if any, is interpreted in the following ways:

  • If unset, access control is managed for all datasets.

  • If set to none no datasets are managed.

BIGQUERY_IGNORE_PROJECT_LIST

string

No

Specifies a comma-separated list of project names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all projects are subject to access control.

For example: testproject1,testproject2,sales_project*.

This setting supersedes any values specified by BIGQUERY_MANAGE_PROJECT_LIST.

BIGQUERY_IGNORE_DATASET_LIST

string

No

Specifies a comma-separated list of dataset names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all datasets are subject to access control.

For example: testproject1.dataset1,testproject2.dataset2,sales_project*.sales*.

This setting supersedes any values specified by BIGQUERY_MANAGE_DATASET_LIST.

BIGQUERY_IGNORE_TABLE_LIST

string

No

Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all tables are subject to access control. Specify tables using the following format:

<PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME>

This setting supersedes any values specified by BIGQUERY_MANAGE_TABLE_LIST.



Users, groups, and roles management

Table 46. Users, groups, and roles management

Name

Type

Default

Required

Description

BIGQUERY_USER_NAME_REPLACE_FROM_REGEX

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the BIGQUERY_USER_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

BIGQUERY_USER_NAME_REPLACE_TO_STRING

string

_

No

Specifies a string to replace the characters matched by the regex specified by the BIGQUERY_USER_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

BIGQUERY_GROUP_NAME_REPLACE_FROM_REGEX

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the BIGQUERY_GROUP_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

BIGQUERY_GROUP_NAME_REPLACE_TO_STRING

string

_

No

Specifies a string to replace the characters matched by the regex specified by the BIGQUERY_GROUP_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

BIGQUERY_MANAGE_USER_LIST

string

No

Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

If not specified, PolicySync manages access control for all users.

If specified, BIGQUERY_IGNORE_USER_LIST takes precedence over this setting.

An example user list might resemble the following: user1,user2,dev_user*.

BIGQUERY_MANAGE_GROUP_LIST

string

No

Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of projects might resemble the following: group1,group2,dev_group*.

If specified, BIGQUERY_IGNORE_GROUP_LIST takes precedence over this setting.

BIGQUERY_IGNORE_USER_LIST

string

No

Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

This setting supersedes any values specified by BIGQUERY_MANAGE_USER_LIST.

BIGQUERY_IGNORE_GROUP_LIST

string

No

Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

This setting supersedes any values specified by BIGQUERY_MANAGE_GROUP_LIST.

BIGQUERY_NATIVE_PUBLIC_GROUP_IDENTITY_NAME

string

Yes

Set this property to your preferred value, policysync uses this native public group for access grants whenever there is policy created referring to public group inside it. The following values are allowed:

  • ALL_AUTHENTICATED_USERS: All gcp project authenticated users.

  • ALL_USERS: All google authenticated users.

BIGQUERY_MANAGE_USER_FILTERBY_GROUP

boolean

false

No

Specifies whether to manage only the users that are members of groups specified by BIGQUERY_MANAGE_GROUP_LIST. The default value is false.



Access control management

Table 47. Access control management

Name

Type

Default

Required

Description

BIGQUERY_COLUMN_ACCESS_CONTROL_TYPE

string

view

No

Specifies how PolicySync manages column-level access control. The following values are allowed:

  • view: Use view-based column level access control. Any columns that a user cannot access appears as null in the secure view of the table or the secure view of the native view.

BIGQUERY_POLICY_NAME_SEPARATOR

string

_

No

Specifies a string to use as part of the name of native row filter and masking policies.

BIGQUERY_ROW_FILTER_POLICY_NAME_TEMPLATE

string

row_filter_item_

No

Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table data from the ds dataset that resides in the proj project, the row filter policy name might resemble the following:

proj_priv_ds_priv_data_<ROW_FILTER_ITEM_NUMBER>

BIGQUERY_ENABLE_ROW_FILTER

boolean

false

No

Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views.

BIGQUERY_ENABLE_VIEW_BASED_MASKING

boolean

true

No

Specifies whether to use secure view based masking. The default value is true.

BIGQUERY_ENABLE_VIEW_BASED_ROW_FILTER

boolean

true

No

Specifies whether to use secure view based row filtering. The default value is true.

While Google BigQuery supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

BIGQUERY_SECURE_VIEW_CREATE_FOR_ALL

boolean

true

No

Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.



Access audits management

Table 48. Access audits management

Name

Type

Default

Required

Description

BIGQUERY_MASKING_FUNCTIONS_DATASET

string

privacera_dataset

No

Specifies the name of the dataset where PolicySync creates custom masking functions.

BIGQUERY_MASKED_NUMBER_VALUE

integer

0

No

Specifies the masking value used for numeric data types.

BIGQUERY_MASKED_TEXT_VALUE

string

<MASKED>

No

Specifies the masking value used for text or string data types.

BIGQUERY_SECURE_VIEW_NAME_PREFIX

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

BIGQUERY_SECURE_VIEW_NAME_POSTFIX

string

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

BIGQUERY_SECURE_VIEW_DATASET_NAME_PREFIX

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a dataset named example1 is dev_example1.

BIGQUERY_SECURE_VIEW_DATASET_NAME_POSTFIX

string

_secure

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same dataset name as the table dataset name.

If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a dataset named example1 is example1_dev.

BIGQUERY_SECURE_VIEW_NAME_REMOVE_SUFFIX_LIST

string

No

Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes.

BIGQUERY_SECURE_VIEW_DATASET_NAME_REMOVE_SUFFIX_LIST

string

No

Specifies a suffix to remove from a secure view dataset name. For example, if the dataset is named some_name_ds you can remove the _ds string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes, such as _raw,_qa,_prod.

BIGQUERY_AUTHORIZED_VIEW_ACL_UPDATER_INTERVAL

integer

10

No

Specifies the interval at which the authorized view ACLs updater thread updates the permissions in the dataset if any permission updates are pending.

BIGQUERY_GRANT_UPDATES

boolean

true

Yes

Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.

BIGQUERY_GRANT_UPDATES_MAX_RETRY_ATTEMPTS

integer

2

No

Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is 2.

BIGQUERY_GRANT_UPDATES_BATCH

boolean

true

No

Specifies whether PolicySync applies grants and revokes in batches. If enabled, this behavior improves overall performance of applying permission changes.

BIGQUERY_ENABLE_DATA_ADMIN

boolean

true

No

This property is used to enable the data admin feature. With this feature enabled you can create all the policies on native tables/views, and respective grants will be made on the secure views of those native tables/views. These secure views will have row filter and masking capability. In case you need to grant permission on the native tables/views then you can select the permission you want plus data admin in the policy. Then those permissions will be granted on both the native table/view as well as its secure view.

BIGQUERY_AUDIT_ENABLE

boolean

false

Yes

Specifies whether Privacera fetches access audit data from the data source.

BIGQUERY_AUDIT_EXCLUDED_USERS

string

No

Specifies a comma separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".

BIGQUERY_AUDIT_PROJECT_ID

string

No

Specifies the project ID where Google BigQuery stores audit log data.

BIGQUERY_AUDIT_DATASET_NAME

string

No

Specifies the name of the dataset where Google BigQuery logs audit data. Privacera uses this data for running audit queries.

BIGQUERY_AUDIT_LOAD_MAX_INTERVAL_MINUTES

integer

30

No

Specifies the maximum interval, in minutes, of the time window that SQL queries use to retrieve access audit information. If there are a large number of audits records, narrowing the window interval improves performance.

For example, if the interval is set to 30, SQL queries similar to the following are executed:

SELECT * FROM audits where time_from=00:01 and time_to=00:30;
SELECT * FROM audits where time_from=00:31 and time_to=01:00;
SELECT * FROM audits where time_from=01:01 and time_to=01:30;



Validation

Let's test the access control usingBigQueryby defining some test policies for a test user (emily).

  1. Prepare test data in GCP. Refer Google documentation for a detailed information on running queries inBigQuery- Click here.

    1. Log in to GCP console, navigate to BigQuery and then click Compose New Query.

    2. In the Query editor text area, copy the following query:

      -- Create Dataset
      CREATE SCHEMA customer_dataset;
      
      -- Create Table
      CREATE TABLE IF NOT EXISTS customer_dataset.customer_data (
      id INT64, 
      person_name string,
      domain string,
      ssn string,
      country string,
      us_phone string, 
      address string, 
      account_id string, 
      zipcode string);
      
      -- Insert Data into table
      insert into customer_dataset.customer_data values (1, 'Nancy','nancy@yahoo.com','201-99-5532','US','856-232-9702','939 Park Avenue','159635478','33317');
      insert into customer_dataset.customer_data values (2,'Gene','gene@google.us','202-99-5532','UK','954-583-0575','303 Johnston Blvd','236854569','95202');
      insert into customer_dataset.customer_data values (3,'Edward','edward@facebook.com','203-99-5532','US','209-626-9041','130 Hollister','365412985','60173');
      insert into customer_dataset.customer_data values (4,'Pearlene','pearlene@gmail.com','204-99-5532','US','708-471-6810','17 Warren Rd','452189732','90017');
      insert into customer_dataset.customer_data values (5,'James','james@cuvox.de','205-99-5532','US','661-338-6787','898 Newport Gray Rd','517836427','94041');
      insert into customer_dataset.customer_data values (6,'Pamela','pamela@cuvox.de','206-99-5532','UK','650-526-5259','861 Strick Rd','685231473','80214');
      insert into customer_dataset.customer_data values (7,'Donna','donna@fleckens.hu','207-99-5532','US','303-239-4282','1784 S Shore Dr','789563258','1730');
      insert into customer_dataset.customer_data values (8,'Amy','amy@gustr.com','208-99-5532','US','774-553-4736','9522 Apple Valley Dr','854126945','55102');
      insert into customer_dataset.customer_data values (9,'Adam','adam@teleworm.us','209-99-5532','UK','651-297-1448','745 Old Springville Rd','965412381','43201');
      insert into customer_dataset.customer_data values (10,'Lucille','lucille@armyspy.com','210-99-5532','US','740-320-1270','4223  Midway Road','785651236','89102');
      insert into customer_dataset.customer_data values (11,'Edard','edu@gustr.com','211-99-5532','UK','702-257-8796','3659  Dye Street','965121354','53207');
      insert into customer_dataset.customer_data values (12,'Nick','nick@jourrapide.com','212-99-5532','US','414-483-8638','2966  Nutters Barn Lane','563515264','72764');
      insert into customer_dataset.customer_data values (13,'Brian','brian@einrot.com','213-99-5532','US','479-872-9783','3300  Worthington Drive','654621233','91303');
      insert into customer_dataset.customer_data values (14,'Stella','stella@jourrapide.com','214-99-5532','US','818-596-6681','1893  Ingram Road','261613654','35816');
      insert into customer_dataset.customer_data values (15,'Leona','leona@dayrep.com','215-99-5532','UK','256-250-5413','4244  Burnside Court','986513211','75069');
      
      -- Verify table by running select query
      SELECT * FROM customer_dataset.customer_data;
    3. Click Run.

  2. Create test user in Privacera Portal as emily. For more information see User Management.

    bigquery_portal_user.jpg
  3. In GCP console, the user emily gets added after you create the user in step 2.

    1. Check the user emily can list the projects inside your organization.

    2. Check user emily does not have Full Admin or Full Reader access on BigQuery.

  4. Create a policy for emily to run queries and list dataset and tables.

    bigquery_portal_policy_list_datasets_tables.jpg
  5. Check the access control on the test data in GCP.

    A) Table-level Access Control

    1. In Privacera Portal, create a policy Customer Data Full Access for accessing table.

      bigquery_portal_policy_table_access.jpg
    2. Log in to GCP console with credentials of the test user emily.

    3. Navigate to BigQuery.

    4. Run the following query.

      SELECT * FROM customer_dataset.customer_data;
      SELECT * FROM customer_dataset_secure.customer_data;

      User emily can access and view the data.

    5. In Privacera Portal, disable the policy.

    6. In GCP, run the following query.

      SELECT * FROM customer_dataset.customer_data;
      SELECT * FROM customer_dataset_secure.customer_data;

      User emily can not access and view the data.

    B) View-based Column-level Access Control

    In the view-based column-level access control, you have to create a column-level policy on the table. The columns which are not permitted from that policy will be shown as NULL in the secure view of the table.

    1. In Privacera Portal, do the following:

      Create a policy Customer Data Column Level Access granting access to a few columns.

      bigquery_portal_policy_column_secure_view_access.jpg
    2. Log in to GCP console with credentials of the test user emily.

    3. Navigate to BigQuery.

    4. Run the following queries.

      User emily will not see person_name column in the secure view of customer_data table..

      SELECT * FROM customer_dataset_secure.customer_data;
    5. In Privacera Portal, disable the Customer Data Column Level Access policy.

    C) View-based Row-level Filter

    1. In Privacera Portal, enable Customer Data Full Access policy created above.

      emily can view customer_data for US and UK from table and in secure view.

    2. Log in to GCP console with credentials of the test user emily.

    3. Navigate to BigQuery.

    4. Use the below query to view data from customer data secure view. It will show data from countries US and UK.

      SELECT * FROM customer_dataset_secure.customer_data;

      User emily can access and view the data.

    5. In Privacera Portal, create a Customer Access by Country policy to access data only from UK.

      bigquery_portal_policy_row_level_access.jpg
    6. Use the below query to view data from customer data secure view. The row filter policy gets applied showing data from country UK.

      SELECT * FROM customer_dataset_secure.customer_data;

    D) View-based Masking

    1. Log in to GCP console with credentials of the test user emily.

    2. Navigate to BigQuery.

    3. Run the following query.

      SELECT * FROM customer_dataset_secure.customer_data;

      User emily can view the SSN values.

    4. In Privacera Portal, create a Mask SSN policy to mask SSN values for emily.

      bigquery_portal_policy_mask_ssn.jpg
    5. Run the following query.

      SELECT * FROM customer_dataset_secure.customer_data;

      User emily cannot view the the SSN values, since it is masked using md5 encryption.

Create custom IAM roles

By default, PolicySync creates all the IAM roles listed in the table below automatically to perform access control in GBQ. If you want to create the custom IAM roles manually, then disable the BIGQUERY_CREATE_CUSTOM_IAM_ROLES by setting its value to false.

In the GCP console, you need the map the roles with its appropriate permissions. The role and its permissions mapping are given in the table below. For creating a custom role in GCP and adding its corresponding permissions, see Google documentation.

Role Name

GCP Permissions

PrivaceraGBQProjectListRole

resourcemanager.projects.get

PrivaceraGBQJobListRole

bigquery.jobs.list

PrivaceraGBQJobListAllRole

bigquery.jobs.listAll

PrivaceraGBQJobCreateRole

bigquery.jobs.create

PrivaceraGBQJobGetRole

bigquery.jobs.get

PrivaceraGBQJobUpdateRole

bigquery.jobs.update

PrivaceraGBQJobDeleteRole

bigquery.jobs.delete

PrivaceraGBQDatasetCreateRole

bigquery.datasets.create

PrivaceraGBQDatasetGetMetadataRole

bigquery.datasets.get

PrivaceraGBQDatasetUpdateRole

bigquery.datasets.update

PrivaceraGBQDatasetDeleteRole

bigquery.datasets.delete

PrivaceraGBQTableListRole

bigquery.tables.list

PrivaceraGBQTableCreateRole

bigquery.tables.create

PrivaceraGBQTableGetMetadataRole

bigquery.tables.get

PrivaceraGBQTableQueryRole

bigquery.tables.getData

PrivaceraGBQTableExportRole

bigquery.tables.export

PrivaceraGBQTableUpdateMetadataRole

bigquery.tables.update

PrivaceraGBQTableUpdateRole

bigquery.tables.updateData

PrivaceraGBQTableSetCategoryRole

bigquery.tables.setCategory

PrivaceraGBQTableDeleteRole

bigquery.tables.delete

PrivaceraGBQTransferUpdateRole

bigquery.transfers.update

PrivaceraGBQTransferGetRole

bigquery.transfers.get

Based on the GCP Resource hierarchy, you can create the roles above at the organization or project level. If you have multiple projects in your GCP organization, it is recommended to create all the roles at the organization level instead at the project level. Once the roles are defined at the organization level, all the projects under that organization will inherit all the roles.