Component services configurations

Access Management

Data Server
AWS
AWS Data Server
Configure Privacera Data Access Server

This section covers how you can configure Privacera Data Access Server.

CLI Configuration Steps
  1. SSH to the instance where Privacera Manager is installed.

  2. Run the following command.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.dataserver.aws.yml config/custom-vars/
    
  3. Edit the properties. For property details and description, refer to the Configuration properties below.

    vi config/custom-vars/vars.dataserver.aws.yml
    

    Note

    Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, click here.

  4. Run Privacera Manager update.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

| Property | Description | Example |
| --- | --- | --- |
| DATASERVER_RANGER_AUTH_ENABLED | Enable/disable Ranger authorization in Dataserver. | |
| DATASERVER_V2_WORKDER_THREADS | Number of worker threads to process inbound connections. | 20 |
| DATASERVER_V2_CHANNEL_CONNECTION_BACKLOG | Maximum queue size for inbound connections. | 128 |
| DATASERVER_V2_CHANNEL_CONNECTION_POOL | Enable the connection pool for outbound requests. The property is disabled by default. | |
| DATASERVER_V2_FRONT_CHANNEL_IDLE_TIMEOUT | Idle timeout for inbound connections. | 60 |
| DATASERVER_V2_BACK_CHANNEL_IDLE_TIMEOUT | Idle timeout for outbound connections; takes effect only if the connection pool is enabled. | 60 |
| DATASERVER_HEAP_MIN_MEMORY_MB | Minimum Java heap memory (in MB) used by Dataserver. | 1024 |
| DATASERVER_HEAP_MAX_MEMORY_MB | Maximum Java heap memory (in MB) used by Dataserver. | 1024 |
| DATASERVER_USE_REGIONAL_ENDPOINT | Set this property to enforce a default region for all S3 buckets. | true |
| DATASERVER_AWS_REGION | Default AWS region for S3 buckets. | us-east-1 |
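
For orientation, the following is a minimal sketch of what the edited config/custom-vars/vars.dataserver.aws.yml might contain for the properties above. The values are illustrative placeholders, not recommendations:

    # Illustrative values only; adjust for your deployment.
    DATASERVER_RANGER_AUTH_ENABLED: "true"
    DATASERVER_HEAP_MIN_MEMORY_MB: "1024"
    DATASERVER_HEAP_MAX_MEMORY_MB: "1024"
    DATASERVER_USE_REGIONAL_ENDPOINT: "true"
    DATASERVER_AWS_REGION: "us-east-1"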

AWS S3 data server

This section covers how you can configure access control for AWS S3 through Privacera Data Access Server.

Prerequisites

Ensure that the following prerequisites are met:

  • Create and add an AWS IAM Policy defined to allow access to S3 resources.

    Follow AWS IAM Create and Attach Policy instructions, using either "Full S3 Access" or "Limited S3 Access" policy templates, depending on your enterprise requirements.

    Return to this section once the Policy is attached to the Privacera Manager Host VM.

CLI configuration
  1. SSH to the instance where Privacera Manager is installed.

  2. Configure Privacera Data Server.

  3. Edit the properties. For property details and description, refer to the Configuration Properties below.

    vi config/custom-vars/vars.dataserver.aws.yml
    

    Note

    • In a Kubernetes environment, enable DATASERVER_USE_POD_IAM_ROLE and DATASERVER_IAM_POLICY_ARN to use a specific IAM role for the Dataserver pod. For property details and descriptions, see S3 properties.

    • You can also add custom properties that are not included by default. See Dataserver.

  4. Run Privacera Manager update.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

| Property | Description | Example |
| --- | --- | --- |
| DATASERVER_USE_POD_IAM_ROLE | Property to enable the creation of an IAM role that will be used for the Dataserver pod. | true |
| DATASERVER_IAM_POLICY_ARN | Full IAM policy ARN that needs to be attached to the IAM role associated with the Dataserver pod. | arn:aws:iam::aws:policy/AmazonS3FullAccess |
| DATASERVER_USE_IAM_ROLE | If you've given an IAM role permission to access the bucket, enable **Use IAM Roles**. | |
| DATASERVER_S3_AWS_API_KEY | If you've used an access key to access the bucket, disable **Use IAM Role** and set the AWS API key. | AKIAIOSFODNN7EXAMPLE |
| DATASERVER_S3_AWS_SECRET_KEY | If you've used a secret key to access the bucket, disable **Use IAM Role** and set the AWS secret key. | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| DATASERVER_V2_S3_ENDPOINT_ENABLE | Enable to use a custom S3 endpoint. | |
| DATASERVER_V2_S3_ENDPOINT_SSL | Enable or disable this property depending on whether SSL is enabled on the MinIO server. | |
| DATASERVER_V2_S3_ENDPOINT_HOST | Add the endpoint server host. | 192.168.12.142 |
| DATASERVER_V2_S3_ENDPOINT_PORT | Add the endpoint server port. | 9000 |
| DATASERVER_AWS_REQUEST_INCLUDE_USERINFO | Property to enable adding the session role in CloudWatch logs for requests going via Dataserver. The value is available under the **privacera-user** key in the Request Params of CloudWatch logs. Set to true if you want to see **privacera-user** in CloudWatch. | true |
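
As a reference, here is a minimal sketch of the S3-related entries in vars.dataserver.aws.yml for a Kubernetes deployment that uses a pod IAM role. The policy ARN and flags are illustrative and should match your own environment:

    # Illustrative values; use your own IAM policy ARN.
    DATASERVER_USE_POD_IAM_ROLE: "true"
    DATASERVER_IAM_POLICY_ARN: "arn:aws:iam::aws:policy/AmazonS3FullAccess"
    # Optional: include the session role in CloudWatch request logs.
    DATASERVER_AWS_REQUEST_INCLUDE_USERINFO: "true"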

AWS Athena data server

This section covers how you can configure access control for AWS Athena through Privacera Data Access Server.

Prerequisites

Ensure the following:

  • Create and add an AWS IAM Policy defined to allow rights to use Athena and Glue resources and databases.

    Follow AWS IAM Create and Attach Policy instructions, using the "Athena Access" policy modified as necessary for your enterprise. Return to this section once the Policy is attached to the Privacera Manager Host VM.

CLI configuration
  1. SSH to the instance where Privacera Manager is installed.

  2. Configure Privacera Data Server.

  3. Edit the properties. For property details and description, refer to the Configuration Properties below.

    vi config/custom-vars/vars.dataserver.aws.yml
    

    Note

    Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, click here.

  4. Run Privacera Manager update.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

Identify an existing S3 bucket or create one to store the Athena query results.

AWS_ATHENA_RESULT_STORAGE_URL: "s3://${S3_BUCKET_FOR_QUERY_RESULTS}/athena-query-results/"
Azure
Azure ADLS Data Server

This topic covers integration of Azure Data Lake Storage (ADLS) with the Privacera Platform using Privacera Data Access Server.

Prerequisites

Ensure that the following prerequisites are met:

  • You have access to an Azure Storage account along with required credentials.

    For more information on how to set up an Azure storage account, refer to Azure Storage Account Creation.

  • Get the values for the following Azure properties: Application (client) ID, Client secrets

CLI Configuration
  1. Go to the privacera-manager folder in your virtual machine. Open the config folder, copy the sample vars.dataserver.azure.yml file to the custom-vars/ folder, and edit it.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.dataserver.azure.yml config/custom-vars/
    vi config/custom-vars/vars.dataserver.azure.yml
    
  2. Edit the Azure-related information. For property details and description, click here.

    1. If you want to use Azure CLI, use the following properties:

      ENABLE_AZURE_CLI: "true"
      AZURE_GEN2_SHARED_KEY_AUTH: "true"
      AZURE_ACCOUNT_NAME: "<PLEASE_CHANGE>"
      AZURE_SHARED_KEY: "<PLEASE_CHANGE>"
      
    2. If you want to access multiple Azure storage accounts with shared key authentication, use the following properties:

      AZURE_GEN2_SHARED_KEY_AUTH: "true"
      AZURE_ACCT_SHARED_KEY_PAIRS: "<PLEASE_CHANGE>"
      

      Note

      Configuring AZURE_GEN2_SHARED_KEY_AUTH property allows you to access the resources in the Azure accounts only through the File Explorer in Privacera Portal.

    3. If you want to access multiple Azure storage accounts with OAuth application-based authentication, use the following properties:

      AZURE_GEN2_SHARED_KEY_AUTH: "false"
      AZURE_TENANTID: "<PLEASE_CHANGE>"
      AZURE_SUBSCRIPTION_ID: "<PLEASE_CHANGE>"
      AZURE_RESOURCE_GROUP: "<PLEASE_CHANGE>"
      DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST:
       - index: 0
         clientId: "<PLEASE_CHANGE>"
         clientSecret: "<PLEASE_CHANGE>"
         storageAccName: "<PLEASE_CHANGE>"

      Note

      Configuring AZURE_GEN2_SHARED_KEY_AUTH property allows you to access the resources in the Azure accounts only through the File Explorer in Privacera Portal.

      Note

      You can also add custom properties that are not included by default. See Dataserver.

  3. Run the following command.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration Properties

| Property Name | Description | Example |
| --- | --- | --- |
| ENABLE_AZURE_CLI | Uncomment to use Azure CLI. The AZURE_ACCT_SHARED_KEY_PAIRS property does not work with this property, so you have to set the AZURE_ACCOUNT_NAME and AZURE_SHARED_KEY properties. | true |
| AZURE_GEN2_SHARED_KEY_AUTH | Set to true to use shared key authentication. To use multiple Azure storage accounts with shared key authentication, set this property to true along with AZURE_ACCT_SHARED_KEY_PAIRS. To use multiple Azure storage accounts with OAuth authentication, set this property to false along with DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST. | true |
| AZURE_ACCOUNT_NAME | Azure ADLS storage account name. | company-qa-dept |
| AZURE_SHARED_KEY | Azure ADLS storage account shared access key. | =0Ty4br:2BIasz>rXm{cqtP8hA;7\|TgZZZuTHJTg40z8E5z4UJ':roeJy=d7*/W" |
| AZURE_ACCT_SHARED_KEY_PAIRS | Comma-separated list of storage account names and their shared keys. The format must be ${storage_account_name_1}:${secret_key_1},${storage_account_name_2}:${secret_key_2} | accA:sharedKeyA, accB:sharedKeyB |
| AZURE_TENANTID | To get the value, go to Azure portal > Azure Active Directory > Properties > Tenant ID. | 5a5cxxx-xxxx-xxxx-xxxx-c3172b33xxxx |
| AZURE_APP_CLIENT_ID | Get the value by following the Prerequisites section above. | 8c08xxxx-xxxx-xxxx-xxxx-6w0c95v0xxxx |
| AZURE_SUBSCRIPTION_ID | To get the value, go to Azure portal > Subscriptions > select the required subscription > Overview > copy the Subscription ID. | 27e8xxxx-xxxx-xxxx-xxxx-c716258wxxxx |
| AZURE_RESOURCE_GROUP | To get the value, go to Azure portal > Storage accounts > select the storage account you want to configure > Overview > Resource group. | privacera |
| DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST | Configure multiple OAuth Azure applications and the storage accounts mapped to each configured client ID. Each list entry has index, clientId, clientSecret, and storageAccName fields. **Note**: The clientSecret value must be in BASE64 format in the YAML file. | See the example below. |

Example for DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST:

    DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST:
     - index: 0
       clientId: "8c08xxxx-xxxx-xxxx-xxxx-6w0c95v0xxxx"
       clientSecret: "WncwSaMpleRZ1ZoLThJYWpZd3YzMkFJNEljZGdVN0FfVAo="
       storageAccName: "storageAccA,storageAccB"
     - index: 1
       clientId: "5d37xxxx-xxxx-xxxx-xxxx-7z0cu7e0xxxx"
       clientSecret: "ZncwSaMpleRZ1ZoLThJYWpZd3YzMkFJNEljZGdVN0FfVAo="
       storageAccName: "storageAccC"
Validation

All access and attempted access (Allowed and Denied) for Azure ADLS resources will now be recorded to the audit stream. This audit stream can be reviewed on the Audit page of Privacera Access Manager. Default access for a data repository is 'Denied', so all data access will be denied.

To verify Privacera Data Management control, perform the following steps:

  1. Login to Privacera Portal, as a portal administrator, open Data Inventory: Data Explorer, and attempt to view the targeted ADLS files or folders. The data will be hidden and a Denied status will be registered in the Audit page.

  2. In Privacera Portal, open Access Management: Resource Policies. Open System 'ADLS' and 'application' (data repository) 'privacera_adls'. Create or modify an access policy to allow access to some or all of your ADLS storage.

  3. Return to Data Inventory: Data Explorer and re-attempt to view the data as allowed by your new policy or policy change. Repeat step 1.

    You should be able to view files or folders in the account, and an Allowed status will be registered in the Audit page.

To check the log in the Audit page in Privacera Portal, perform the following steps:

  1. On the Privacera Portal page, expand Access Management and click Audit in the left menu.

  2. The Audit page will be displayed with Ranger Audit details.

GCP Data Server

This topic covers integration of Google Cloud Storage (GCS) and Google BigQuery (GBQ) with the Privacera Platform using Privacera Data Access Server.

Prerequisites

Ensure that the following prerequisites are met:

  • If GCS is being configured, you need access to a Google Cloud Storage account along with the required credentials.

  • If GBQ is being configured, you need access to a Google Cloud BigQuery account along with the required credentials.

  • Download the credential file (JSON) associated with the service account.

CLI Configuration
  1. SSH to the instance where Privacera is installed.

  2. Copy the credential file you've downloaded from your machine to a location on your instance where Privacera Manager is configured. Get the file path of the JSON file and add it in the next step.

  3. Run the following commands.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.dataserver.gcp.yml config/custom-vars/
    vi config/custom-vars/vars.dataserver.gcp.yml
  4. Update the following credential file information.

    GCP_CREDENTIAL_FILE_PATH: "/tmp/my_google_credential.json"

    Note

    You can also add custom properties that are not included by default. See Dataserver.

  5. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update

    After the update is completed, Privacera gets installed and a default GCS data source is created.

  6. Add GCS Project ID in the GCS data source.

    1. Navigate to Portal UI > Settings > Data Source Registration and edit GOOGLE_CLOUD_STORAGE.

    2. Click Application Properties and add the following properties:

      • Credential Type: Select Google Credentials Local File Path from the dropdown list.

      • Google Credentials Local File Path: Set value to None.

      • Google Project Id: Enter your Google Project ID.

    3. To view the buckets, navigate to Data Inventory > File Explorer.

      If you cannot view the buckets, restart Dataserver.

      cd ~/privacera/privacera-manager
      ./privacera-manager.sh restart dataserver

Tip

You can use Google APIs to apply access control on GCS. For more information, click here.

UserSync
Privacera UserSync
Privacera Data Access User Synchronization

Learn how you can synchronize users and groups from different connectors.

LDAP
  1. Run the following command to enable Privacera UserSync:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
  2. Enable the LDAP connector:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.ldap.yml config/custom-vars/ 
    vi config/custom-vars/vars.privacera-usersync.ldap.yml

    Edit the following properties (a minimal example sketch appears after these steps):

    Property

    Description

    Example

    A) LDAP Connector Info

    LDAP_CONNECTOR

    Name of the connector.

    ad

    LDAP_ENABLED

    Enabled status of connector: true or false

    true

    LDAP_SERVICE_TYPE

    Set a service type: ldap or ad

    ad

    LDAP_DATASOURCE_NAME

    Name of the datasource: ldap or ad

    ad

    LDAP_URL

    URL of source LDAP.

    ldap://example.us:389

    LDAP_BIND_DN

    Property is used to connect to LDAP and then query for users and groups.

    CN=Example User,OU=sales,DC=ad,DC=sales,DC=us

    LDAP_BIND_PASSWORD

    LDAP bind password for the bind DN specified above.

    LDAP_AUTH_TYPE

    Authentication type, the default is simple

    simple

    LDAP_REFERRAL

    Set the LDAP context referral: ignore or follow.

    Default value is follow.

    follow

    LDAP_SYNC_INTERVAL

    Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.

    3600

    B) Enable SSL for LDAP Server

    Note

    Support Chain SSL - Preview Functionality

    Previously, Privacera services used only one SSL certificate of the LDAP server even if a chain of certificates was available. Now, as a preview functionality, all the certificates available in the certificate chain are imported into the truststore. This applies to the Privacera UserSync, Ranger UserSync, and Portal SSL certificates.

    PRIVACERA_USERSYNC_SYNC_LDAP_SSL_ENABLED

    Set this property to enable/disable SSL for Privacera Usersync.

    true

    PRIVACERA_USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS

    Set this property if you want Privacera Manager to generate a truststore for your SSL-enabled LDAP server.

    true

    PRIVACERA_USERSYNC_AUTH_SSL_ENABLED

    Set this property if the other Privacera services are not SSL enabled and you are using SSL-enabled LDAP server.

    true

    C) LDAP Search

    LDAP_SEARCH_GROUP_FIRST

    Property to enable to search for groups first, before searching for users.

    true

    LDAP_SEARCH_BASE

    Search base for users and groups.

    DC=ad,DC=sales,DC=us

    LDAP_SEARCH_USER_BASE

    Search base for users.

    ou=example,dc=ad,dc=sales,dc=us

    LDAP_SEARCH_USER_SCOPE

    Set the value for search scope for the users: base, one or sub.

    Default value is sub.

    sub

    LDAP_SEARCH_USER_FILTER

    Optional additional filter constraining the users selected for syncing.

    LDAP_SEARCH_USER_GROUPONLY

    Boolean to only load users in groups.

    false

    LDAP_ATTRIBUTE_ONLY

    Sync only the attributes of users already synced from other services.

    false

    LDAP_SEARCH_INCREMENTAL_ENABLED

    Enable incremental search. Syncing changes only since last search.

    false

    LDAP_PAGED_RESULTS_ENABLED

    Enable paged results control for LDAP Searches. Default is true.

    true

    LDAP_PAGED_CONTROL_CRITICAL

    Set paged results control criticality to CRITICAL. Default is true.

    true

    LDAP_SEARCH_GROUP_BASE

    Search base for groups.

    ou=example,dc=ad,dc=sales,dc=us

    LDAP_SEARCH_GROUP_SCOPE

    Set the value for search scope for the groups: base, one or sub.

    Default value is sub.

    sub

    LDAP_SEARCH_GROUP_FILTER

    Optional additional filter constraining the groups selected for syncing.

    LDAP_SEARCH_CYCLES_BETWEEN_DELETED_DETECTION

    Number of sync cycles between deleted-entity searches. Default value is 6.

    6

    LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS

    Enables both user and group deleted searches. Default is false.

    false

    LDAP_SEARCH_DETECT_DELETED_USERS

    Override setting for user deleted search. Default value is LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS.

    LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS

    LDAP_SEARCH_DETECT_DELETED_GROUPS

    Override setting for group deleted search. Default value is LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS.

    LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS

    D) LDAP Manage/Ignore List of Users/Groups

    LDAP_MANAGE_USER_LIST

    List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.

    LDAP_IGNORE_USER_LIST

    List of users to ignore from sync results.

    LDAP_MANAGE_GROUP_LIST

    List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.

    LDAP_IGNORE_GROUP_LIST

    List of groups to ignore from sync results.

    E) LDAP Object Users/Groups Class

    LDAP_OBJECT_USER_CLASS

    Objectclass to identify user entries.

    user

    LDAP_OBJECT_GROUP_CLASS

    Objectclass to identify group entries.

    group

    F) LDAP User/Group Attributes

    LDAP_ATTRIBUTE_USERNAME

    Attribute from user entry that would be treated as user name.

    SAMAccountName

    LDAP_ATTRIBUTE_FIRSTNAME

    Attribute of a user’s first name. The default is givenName.

    givenName

    LDAP_ATTRIBUTE_LASTNAME

    Attribute of a user’s last name.

    LDAP_ATTRIBUTE_EMAIL

    Attribute from user entry that would be treated as email address.

    mail

    LDAP_ATTRIBUTE_GROUPNAMES

    List of attributes from group entry that would be treated as group name.

    LDAP_ATTRIBUTE_GROUPNAME

    Attribute from group entry that would be treated as group name.

    name

    LDAP_ATTRIBUTE_GROUP_MEMBER

    Attribute from group entry that is list of members.

    member

    G) Username/Group name Attribute Modification

    LDAP_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL

    Extract username from an email address. (e.g. username@domain.com -> username) Default is false.

    false

    LDAP_ATTRIBUTE_USERNAME_VALUE_PREFIX

    Prefix to prepend to the username. Default is blank.

    LDAP_ATTRIBUTE_USERNAME_VALUE_POSTFIX

    Postfix to append to the username. Default is blank.

    LDAP_ATTRIBUTE_USERNAME_VALUE_TOLOWER

    Convert the username to lowercase. Default is false.

    false

    LDAP_ATTRIBUTE_USERNAME_VALUE_TOUPPER

    Convert the username to uppercase. Default is false.

    false

    LDAP_ATTRIBUTE_USERNAME_VALUE_REGEX

    Attribute to replace username to matching regex. Default is blank.

    LDAP_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL

    Extract the group name from an email address. Default is false.

    false

    LDAP_ATTRIBUTE_GROUPNAME_VALUE_PREFIX

    Prefix to prepend to the group's name. Default is blank.

    LDAP_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX

    Postfix to append to the group's name. Default is blank.

    LDAP_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER

    Convert the group's name to lowercase. Default is false.

    false

    LDAP_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER

    Convert the group's name to uppercase. Default is false.

    false

    LDAP_ATTRIBUTE_GROUPNAME_VALUE_REGEX

    Attribute to replace the group's name to matching regex. Default is blank.

    H) Group Attribute Configuration

    LDAP_GROUP_ATTRIBUTE_LIST

    The list of attribute keys to get from synced groups.

    LDAP_GROUP_ATTRIBUTE_VALUE_PREFIX

    Append prefix to values of group attributes such as group name.

    LDAP_GROUP_ATTRIBUTE_KEY_PREFIX

    Append prefix to key of group attributes such as group name.

    LDAP_GROUP_LEVELS

    Configure Privacera usersync with AD/LDAP nested group membership.

  3. Run the following command:

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
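
For reference, here is a minimal vars.privacera-usersync.ldap.yml sketch for an Active Directory source, built from the properties described above. Every value is a placeholder or is taken from the examples in the table and must be replaced with your own environment details:

    # Placeholder values for an AD source; replace with your own.
    LDAP_CONNECTOR: "ad"
    LDAP_ENABLED: "true"
    LDAP_SERVICE_TYPE: "ad"
    LDAP_URL: "ldap://example.us:389"
    LDAP_BIND_DN: "CN=Example User,OU=sales,DC=ad,DC=sales,DC=us"
    LDAP_BIND_PASSWORD: "<PLEASE_CHANGE>"
    LDAP_SEARCH_BASE: "DC=ad,DC=sales,DC=us"
    LDAP_SEARCH_USER_BASE: "ou=example,dc=ad,dc=sales,dc=us"
    LDAP_SEARCH_GROUP_BASE: "ou=example,dc=ad,dc=sales,dc=us"
    LDAP_SYNC_INTERVAL: "3600"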
LDAP/AD deleted entity detection

When enabled, LDAP/AD deleted entity detection will perform a soft delete of users or groups in Privacera Portal. A soft delete removes all memberships of the group/user and marks them as “hidden”. Hidden users will not appear in auto completion when modifying access policies. References to users/groups in policies will remain, until manually removed or the user/group is fully deleted from Privacera Portal. Hidden users can be fully deleted by using the Privacera Portal UI or REST APIs.

Properties:

  • Boolean: usersync.connector.0.search.deleted.group.enabled (default: false)

  • Boolean: usersync.connector.0.search.deleted.user.enabled (default: false)

  • Numeric: usersync.connector.#.search.deleted.cycles (default: 6)

Privacera Manager Variables:

In the LDAP connector properties table above, see under LDAP Search (section C).
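
A sketch of the corresponding Privacera Manager variables (section C of the table above) that enable deleted entity detection, shown with illustrative values based on the documented defaults:

    # Enable detection of deleted users and groups (default: false).
    LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS: "true"
    # Number of sync cycles between deleted-entity searches (default: 6).
    LDAP_SEARCH_CYCLES_BETWEEN_DELETED_DETECTION: "6"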

Azure Active Directory (AAD)
  1. Run the following command to enable Privacera UserSync:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
  2. Enable the AAD connector:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.azuread.yml config/custom-vars/ 
    vi config/custom-vars/vars.privacera-usersync.azuread.yml

    Edit the following properties (a minimal example sketch appears after these steps):

    Property

    Description

    Example

    A) AAD Basic Info

    AZURE_AD_CONNECTOR

    Name of the connector.

    AAD1

    AZURE_AD_ENABLED

    Enabled status of connector. (true/false)

    true

    AZURE_AD_SERVICE_TYPE

    Service Type

    AZURE_AD_DATASOURCE_NAME

    Name of the datasource.

    AZURE_AD_ATTRIBUTE_ONLY

    Sync only the attributes of users already synced from other services.

    false

    AZURE_AD_SYNC_INTERVAL

    Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.

    3600

    B) Azure AAD Info: (Get the following information from Azure Portal)

    AZURE_AD_TENANT_ID

    Azure Active Directory Id (Tenant ID)

    1a2b3c4d-azyd-4755-9638-e12xa34p56le

    AZURE_AD_CLIENT_ID

    Azure Active Directory application client ID which will be used for accessing Microsoft Graph API.

    11111111-1111-1111-1111-111111111111

    AZURE_AD_CLIENT_SECRET

    Azure Active Directory application client secret which will be used for accessing Microsoft Graph API.

    AZURE_AD_USERNAME

    Azure Account username which will be used for getting access token to be used on behalf of Azure AD application.

    AZURE_AD_PASSWORD

    Azure Account password which will be used for getting access token to be used on behalf of Azure AD application.

    C) AAD Manage/Ignore List of Users/Groups

    AZURE_AD_MANAGER_USER_LIST

    List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.

    AZURE_AD_IGNORE_USER_LIST

    List of users to ignore from sync results.

    AZURE_AD_MANAGE_GROUP_LIST

    List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.

    AZURE_AD_IGNORE_GROUP_LIST

    List of groups to ignore from sync results.

    D) AAD Search

    AZURE_AD_SEARCH_SCOPE

    Azure AD Application Access Scope

    AZURE_AD_SEARCH_USER_GROUPONLY

    Boolean to only load users in groups.

    false

    AZURE_AD_SEARCH_INCREMENTAL_ENABLED

    Enable incremental search. Syncing only changes since last search.

    false

    AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS

    Enables both user and group deleted searches. Default is false.

    false

    AZURE_AD_SEARCH_DETECT_DELETED_USERS

    Override setting for user deleted search. Default value is AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS.

    AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS

    AZURE_AD_SEARCH_DETECT_DELETED_GROUPS

    Override setting for group deleted search. Default value is AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS.

    AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS

    E) Azure Service Principal

    Note

    If Sync Service Principals as Users is enabled, AAD does not require that displayName of a Service Principal be a unique value. In this case a different attribute (such as appId) should be used as the Service Principal Username.

    AZURE_AD_SERVICEPRINCIPAL_ENABLED

    Sync Azure service principal to ranger user entity.

    false

    AZURE_AD_SERVICEPRINCIPAL_USERNAME

    Property that specifies which key to use as the username when a service principal is mapped to a Ranger user entity.

    displayName

    F) AAD User/Group Attributes

    AZURE_AD_ATTRIBUTE_USERNAME

    Attribute of a user’s name (default: userPrincipalName)

    AZURE_AD_ATTRIBUTE_FIRSTNAME

    Attribute of a user’s first name (default: givenName)

    AZURE_AD_ATTRIBUTE_LASTNAME

    Attribute of a user’s last name (default: surname)

    AZURE_AD_ATTRIBUTE_EMAIL

    Attribute from user entry that would be treated as email address.

    AZURE_AD_ATTRIBUTE_GROUPNAME

    Attribute from group entry that would be treated as group name.

    AZURE_AD_SERVICEPRINCIPAL_USERNAME

    Attribute of service principal name.

    G) Username/Group name Attribute Modification

    AZURE_AD_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL

    Extract username from an email address. (e.g. username@domain.com -> username) Default is false.

    false

    AZURE_AD_ATTRIBUTE_USERNAME_VALUE_PREFIX

    Prefix to prepend to the username. Default is blank.

    AZURE_AD_ATTRIBUTE_USERNAME_VALUE_POSTFIX

    Postfix to append to the username. Default is blank.

    AZURE_AD_ATTRIBUTE_USERNAME_VALUE_TOLOWER

    Convert the username to lowercase. Default is false.

    false

    AZURE_AD_ATTRIBUTE_USERNAME_VALUE_TOUPPER

    Convert the username to uppercase. Default is false.

    false

    AZURE_AD_ATTRIBUTE_USERNAME_VALUE_REGEX

    Attribute to replace username to matching regex. Default is blank.

    AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL

    Extract the group name from an email address. Default is false.

    false

    AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_PREFIX

    Prefix to prepend to the group's name. Default is blank.

    AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX

    Postfix to append to the group's name. Default is blank.

    AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER

    Convert the group's name to lowercase. Default is false.

    false

    AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER

    Convert the group's name to uppercase. Default is false.

    false

    AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_REGEX

    Attribute to replace the group's name to matching regex. Default is blank.

    H) Group Attribute Configuration

    AZURE_AD_GROUP_ATTRIBUTE_LIST

    The list of attribute keys to get from synced groups.

    AZURE_AD_GROUP_ATTRIBUTE_VALUE_PREFIX

    Append prefix to values of group attributes such as group name.

    AZURE_AD_GROUP_ATTRIBUTE_KEY_PREFIX

    Append prefix to key of group attributes such as group name.

    I) Filter Properties

    AZURE_AD_FILTER_USER_LIST

    Filter the AAD user list, supported for non-incremental search. When incremental search is enabled delta search does not support filter properties.

    abc.def@privacera.com

    AZURE_AD_FILTER_SERVICEPRINCIPAL_LIST

    Filter the AAD service principal list, supported for non-incremental search. When incremental search is enabled delta search does not support filter properties.

    abc-testapp

    AZURE_AD_FILTER_GROUP_LIST

    Filter the AAD group list, supported for non-incremental search. When incremental search is enabled delta search does not support filter properties.

    PRIVACERA-AB-GROUP-00

    J) Domain Properties

    AZURE_AD_MANAGE_DOMAIN_LIST

    Only users in manage domain list will be synced.

    Privacera.US

    AZURE_AD_IGNORE_DOMAIN_LIST

    Users in ignore domain list will not be synced.

    Privacera.US

    AZURE_AD_DOMAIN_ATTRIBUTE

    Specify the attribute against which the user domain is compared; email or username are supported. Default is email.

    username

  3. Run the following command:

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
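
For reference, a minimal vars.privacera-usersync.azuread.yml sketch using the properties described above. The tenant, client, and credential values are placeholders:

    # Placeholder values; replace with your Azure AD details.
    AZURE_AD_CONNECTOR: "AAD1"
    AZURE_AD_ENABLED: "true"
    AZURE_AD_TENANT_ID: "<PLEASE_CHANGE>"
    AZURE_AD_CLIENT_ID: "<PLEASE_CHANGE>"
    AZURE_AD_CLIENT_SECRET: "<PLEASE_CHANGE>"
    AZURE_AD_USERNAME: "<PLEASE_CHANGE>"
    AZURE_AD_PASSWORD: "<PLEASE_CHANGE>"
    AZURE_AD_SYNC_INTERVAL: "3600"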
Azure Active Directory (AAD) deleted entity detection

When enabled, AAD deleted entity detection will perform a soft delete of users or groups in Privacera Portal. A soft delete removes all memberships of the group/user and marks them as “hidden”. Hidden users will not appear in auto completion when modifying access policies. References to users/groups in policies will remain, until manually removed or the user/group is fully deleted from Privacera Portal. Hidden users can be fully deleted by using the Privacera Portal UI or REST APIs.

Properties:

  • Boolean: usersync.connector.3.search.deleted.group.enabled (default: false)

  • Boolean: usersync.connector.3.search.deleted.user.enabled (default: false)

Privacera Manager Variables:

In the AAD connector properties table above, see under AAD Search (section D).
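
A sketch of the corresponding Privacera Manager variable (AAD Search, section D of the table above) that enables AAD deleted entity detection; the value shown is illustrative:

    # Enable detection of deleted users and groups in AAD (default: false).
    AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS: "true"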

SCIM
  1. Run the following command to enable Privacera UserSync:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
  2. Enable the SCIM connector:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.scim.yml config/custom-vars/ 
    vi config/custom-vars/vars.privacera-usersync.scim.yml

    Edit the following properties (a minimal example sketch appears after these steps):

    Property

    Description

    Example

    A) SCIM Connector Info

    SCIM_CONNECTOR

    Name of connector.

    DB1

    SCIM_ENABLED

    Enabled status of connector. (true/false)

    true

    SCIM_SERVICETYPE

    Service Type

    scim

    SCIM_DATASOURCE_NAME

    Name of the datasource.

    databricks1

    SCIM_URL

    Connector URL

    ADMIN_USER_BEARER_TOKEN

    Bearer token

    SCIM_SYNC_INTERVAL

    Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.

    3600

    B) SCIM Manage/Ignore List of Users/Groups

    SCIM_MANAGE_USER_LIST

    List of users to manage from sync results. If this list is defined, all users not on this list will be ignored

    SCIM_IGNORE_USER_LIST

    List of users to ignore from sync results.

    SCIM_MANAGE_GROUP_LIST

    List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.

    SCIM_IGNORE_GROUP_LIST

    List of groups to ignore from sync results.

    C) SCIM User/Group Attributes

    SCIM_ATTRIBUTE_USERNAME

    Attribute from user entry that would be treated as user name.

    userName

    SCIM_ATTRIBUTE_FIRSTNAME

    Attribute from user entry that would be treated as firstname.

    name.givenName

    SCIM_ATTRIBUTE_LASTNAME

    Attribute from user entry that would be treated as lastname.

    name.familyName

    SCIM_ATTRIBUTE_EMAIL

    Attribute from user entry that would be treated as email address.

    emails[primary-true].value

    SCIM_ATTRIBUTE_ONLY

    Sync only the attributes of users already synced from other services. (true/false)

    false

    SCIM_ATTRIBUTE_GROUPS

    Attribute of user’s group list.

    groups

    SCIM_ATTRIBUTE_GROUPNAME

    Attribute from group entry that would be treated as group name.

    displayName

    SCIM_ATTRIBUTE_GROUP_MEMBER

    Attribute from group entry that is list of members.

    members

    D) SCIM Server Username Attribute Modifications

    SCIM_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL

    Extract the user’s username from an email address. (e.g. username@domain.com -> username) The default is false.

    false

    SCIM_ATTRIBUTE_USERNAME_VALUE_PREFIX

    Prefix to prepend to username. The default is blank.

    SCIM_ATTRIBUTE_USERNAME_VALUE_POSTFIX

    Postfix to append to the username. The default is blank.

    SCIM_ATTRIBUTE_USERNAME_VALUE_TOLOWER

    Convert the user’s username to lowercase. The default is false.

    false

    SCIM_ATTRIBUTE_USERNAME_VALUE_TOUPPER

    Convert the user’s username to uppercase. The default is false.

    false

    SCIM_ATTRIBUTE_USERNAME_VALUE_REGEX

    Attribute to replace username to matching regex. The default is blank.

    E) SCIM Server Group Name Attribute Modifications

    SCIM_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL

    Extract the group’s name from an email address (e.g. groupname@domain.com -> groupname). The default is false.

    false

    SCIM_ATTRIBUTE_GROUPNAME_VALUE_PREFIX

    Prefix to prepend to the group's name. The default is blank.

    SCIM_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX

    Postfix to append to the group's name. The default is blank.

    SCIM_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER

    Convert group's name to lowercase. The default is false.

    false

    SCIM_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER

    Convert the group's name to uppercase. The default is false.

    false

    SCIM_ATTRIBUTE_GROUPNAME_VALUE_REGEX

    Attribute to replace group's name to matching regex. The default is blank.

    F) Group Attribute Configuration

    SCIM_GROUP_ATTRIBUTE_LIST

    The list of attribute keys to get from synced groups.

    SCIM_GROUP_ATTRIBUTE_VALUE_PREFIX

    Append prefix to values of group attributes such as group name.

    SCIM_GROUP_ATTRIBUTE_KEY_PREFIX

    Append prefix to key of group attributes such as group name.

  3. Run the following command:

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
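
For reference, a minimal vars.privacera-usersync.scim.yml sketch for a Databricks SCIM source using the properties described above. The URL and token are placeholders:

    # Placeholder values; replace with your SCIM endpoint details.
    SCIM_CONNECTOR: "DB1"
    SCIM_ENABLED: "true"
    SCIM_SERVICETYPE: "scim"
    SCIM_DATASOURCE_NAME: "databricks1"
    SCIM_URL: "<PLEASE_CHANGE>"
    ADMIN_USER_BEARER_TOKEN: "<PLEASE_CHANGE>"
    SCIM_SYNC_INTERVAL: "3600"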
SCIM Server

Note

SCIM Server exposes privacera-usersync service externally on a Public/Internet-facing LB.

  1. Run the following command to enable Privacera UserSync:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
  2. Enable the SCIM Server connector:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.scimserver.yml config/custom-vars/ 
    vi config/custom-vars/vars.privacera-usersync.scimserver.yml

    Edit the following properties (a minimal example sketch appears after these steps):

    Property

    Description

    Example

    A) SCIM Server Connector Info

    SCIM_SERVER_CONNECTOR

    Identifying name of this connector.

    DB1

    SCIM_SERVER_ENABLED

    Enabled status of connector. (true/false)

    true

    SCIM_SERVER_SERVICETYPE

    Type of service/connector.

    scimserver

    SCIM_SERVER_DATASOURCE_NAME

    Unique datasource name. Used for identifying source of data and configuring priority list. (Optional)

    databricks1

    SCIM_SERVER_ATTRIBUTE_ONLY

    Sync only the attributes of users already synced from other services. (true/false)

    SCIM_SERVER_BEARER_TOKEN

    Bearer token for auth to SCIM API. When set, SCIM requests with this token will be allowed access.

    SCIM_SERVER_USERNAME

    Basic auth username, when set SCIM requests with this username will be allowed access. (Password also required)

    SCIM_SERVER_PASSWORD

    Basic auth password, when set SCIM requests with this password will be allowed access. (Username also required)

    SCIM_SERVER_SYNC_INTERVAL

    Frequency of usersync audit records in seconds. Default value is 3600, minimum value is 300.

    3600

    B) SCIM Server Manage/Ignore List of Users/Groups

    SCIM_SERVER_MANAGE_USER_LIST

    List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.

    SCIM_SERVER_IGNORE_USER_LIST

    List of users to ignore from sync results.

    SCIM_SERVER_MANAGE_GROUP_LIST

    List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.

    SCIM_SERVER_IGNORE_GROUP_LIST

    List of groups to ignore from sync results.

    C) SCIM Server Attributes

    SCIM_SERVER_ATTRIBUTE_USERNAME

    Attribute of a user's name.

    userName

    SCIM_SERVER_ATTRIBUTE_FIRSTNAME

    Attribute of a user's first name.

    name.givenName

    SCIM_SERVER_ATTRIBUTE_LASTNAME

    Attribute of a user's last/family name.

    name.familyName

    SCIM_SERVER_ATTRIBUTE_EMAIL

    Attribute of a user’s email.

    emails[primary-true].value

    SCIM_SERVER_ATTRIBUTE_GROUPS

    Attribute of a user’s group list.

    groups

    SCIM_SERVER_ATTRIBUTE_GROUPNAME

    Attribute of a group's name.

    displayName

    SCIM_SERVER_ATTRIBUTE_GROUP_MEMBER

    Attribute from group entry that is the list of members.

    members

    D) SCIM Server Username Attribute Modifications

    SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL

    Extract the user’s username from an email address. (e.g. username@domain.com -> username) The default is false.

    false

    SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_PREFIX

    Prefix to prepend to username. The default is blank.

    SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_POSTFIX

    Postfix to append to the username. The default is blank.

    SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_TOLOWER

    Convert the user’s username to lowercase. The default is false.

    false

    SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_TOUPPER

    Convert the user’s username to uppercase. The default is false.

    false

    SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_REGEX

    Attribute to replace username to matching regex. The default is blank.

    E) SCIM Server Group Name Attribute Modifications

    SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL

    Extract the group’s name from an email address (e.g. groupname@domain.com -> groupname). The default is false.

    false

    SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_PREFIX

    Prefix to prepend to the group's name. The default is blank.

    SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX

    Postfix to append to the group's name. The default is blank.

    SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER

    Convert group's name to lowercase. The default is false.

    false

    SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER

    Convert the group's name to uppercase. The default is false.

    false

    SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_REGEX

    Attribute to replace group's name to matching regex. The default is blank.

    F) Group Attribute Configuration

    SCIM_SERVER_GROUP_ATTRIBUTE_LIST

    The list of attribute keys to get from synced groups.

    SCIM_SERVER_GROUP_ATTRIBUTE_VALUE_PREFIX

    Append prefix to values of group attributes such as group name.

    SCIM_SERVER_GROUP_ATTRIBUTE_KEY_PREFIX

    Append prefix to key of group attributes such as group name.

  3. If NGINX Ingress is enabled and the NGINX controller is running on an internal LB, disable the ingress for UserSync so that it can use a public/Internet-facing LB by adding the variable below:

    vi config/custom-vars/vars.kubernetes.nginx-ingress.yml
    
    PRIVACERA_USERSYNC_K8S_NGINX_INGRESS_ENABLE: "false"
  4. Run the following command:

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
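
For reference, a minimal vars.privacera-usersync.scimserver.yml sketch using the properties described above. The bearer token is a placeholder:

    # Placeholder values; replace with your own.
    SCIM_SERVER_CONNECTOR: "DB1"
    SCIM_SERVER_ENABLED: "true"
    SCIM_SERVER_SERVICETYPE: "scimserver"
    SCIM_SERVER_DATASOURCE_NAME: "databricks1"
    SCIM_SERVER_BEARER_TOKEN: "<PLEASE_CHANGE>"
    SCIM_SERVER_SYNC_INTERVAL: "3600"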
OKTA
  1. Run the following command to enable Privacera UserSync:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
  2. Enable the OKTA connector:

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.privacera-usersync.okta.yml config/custom-vars/ 
    vi config/custom-vars/vars.privacera-usersync.okta.yml

    Edit the following properties (a minimal example sketch appears after these steps):

    Property

    Description

    Example

    A) OKTA Connector Info

    OKTA_CONNECTOR

    Name of the connector.

    OKTA

    OKTA_ENABLED

    Enabled status of connector. (true/false)

    true

    OKTA_SERVICETYPE

    Type of service/connector.

    okta

    OKTA_DATASOURCE_NAME

    Unique datasource name, used for identifying source of data and configuring priority list. (Optional)

    OKTA_SERVICE_URL

    Connector URL

    https://{myOktaDomain}.okta.com

    OKTA_API_TOKEN

    API token

    A8b2c84d-895a-4fea-82dc-401397b8e50c

    OKTA_SYNC_INTERVAL

    Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.

    3600

    B) OKTA Manage/Ignore List of Users/Groups

    OKTA_USER_LIST

    List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.

    OKTA_IGNORE_USER_LIST

    List of users to ignore from sync results.

    OKTA_USER_LIST_STATUS

    List of users to manage with status as equal to: STAGED, PROVISIONED,ACTIVE,RECOVERY,PASSWORD_EXPIRED,LOCKED_OUT or DEPROVISIONED. If this list is defined, all users not on this list will be ignored.

    ACTIVE,STAGED

    OKTA_USER_LIST_LOGIN

    List of users to manage with user login name (can contain ). If this list is defined, all users not on this list will be ignored.

    sw;mon,san

    OKTA_USER_LIST_PROFILE_FIRSTNAME

    List of users to manage with user first name (can contain ). If this list is defined, all users not on this list will be ignored.

    sw;mon,san

    OKTA_USER_LIST_PROFILE_LASTNAME

    List of users to manage with user last name (can contain ). If this list is defined, all users not on this list will be ignored.

    sw;mon,san

    OKTA_LIST_PROFILE_EMAIL

    List of users to manage with user email (can contain ). If this list is defined, all users not on this list will be ignored.

    sw;mon,san

    OKTA_LIST_TYPE

    List of groups to manage with group type. If this list is defined, all groups not on this list will be ignored.

    APP_GROUP,BUILT_IN,OKTA_GROUP

    OKTA_GROUP_LIST

    List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.

    OKTA_IGNORE_GROUP_LIST

    List of groups to ignore from sync results.

    OKTA_GROUP_LIST_SOURCE_ID

    List of groups to manage with group source id. If this list is defined, all groups not on this list will be ignored.

    0oa2v0el0gP90aqjJ0g7,0oa2v0el0gP90aqjJ0g8,0oa2v0el0gP90aqjJ0g0

    OKTA_GROUP_LIST_PROFILE_NAME

    List of groups to manage with group name. If this list is defined, all groups not on this list will be ignored.

    group1,testGroup,testGroup2

    C) OKTA Search

    OKTA_SEARCH_USER_GROUPONLY

    Boolean to only load users in groups.

    false

    OKTA_SEARCH_INCREMENTAL_ENABLED

    Boolean to enable incremental search, syncing only changes since last search.

    false

    D) OKTA User/Group Attributes

    OKTA_ATTRIBUTE_USERNAME

    Attribute from user entry that would be treated as user name.

    login

    OKTA_ATTRIBUTE_FIRSTNAME

    Attribute from user entry that would be treated as firstname.

    firstName

    OKTA_ATTRIBUTE_LASTNAME

    Attribute from user entry that would be treated as lastname.

    lastName

    OKTA_ATTRIBUTE_EMAIL

    Attribute from user entry that would be treated as email address.

    email

    OKTA_ATTRIBUTE_GROUPS

    Attribute of user’s group list.

    groups

    OKTA_ATTRIBUTE_GROUPNAME

    Attribute of a group’s name.

    name

    OKTA_ATTRIBUTE_ONLY

    Sync only the attributes of users already synced from other services. (true/false)

    false

    E) OKTA Username Attribute Modifications

    OKTA_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL

    Extract the user’s username from an email address. (e.g. username@domain.com -> username) The default is false.

    false

    OKTA_ATTRIBUTE_USERNAME_VALUE_PREFIX

    Prefix to prepend to username. The default is blank.

    OKTA_ATTRIBUTE_USERNAME_VALUE_POSTFIX

    Postfix to append to the username. The default is blank.

    OKTA_ATTRIBUTE_USERNAME_VALUE_TOLOWER

    Convert the user’s username to lowercase. The default is false.

    false

    OKTA_ATTRIBUTE_USERNAME_VALUE_TOUPPER

    Convert the user’s username to uppercase. The default is false.

    false

    OKTA_ATTRIBUTE_USERNAME_VALUE_REGEX

    Attribute to replace username to matching regex. The default is blank.

    F) OKTA Group Name Attribute Modifications

    OKTA_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL

    Extract the group’s name from an email address (e.g. groupname@domain.com -> groupname). The default is false.

    false

    OKTA_ATTRIBUTE_GROUPNAME_VALUE_PREFIX

    Prefix to prepend to the group's name. The default is blank.

    OKTA_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX

    Postfix to append to the group's name. The default is blank.

    OKTA_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER

    Convert group's name to lowercase. The default is false.

    false

    OKTA_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER

    Convert the group's name to uppercase. The default is false.

    false

    OKTA_ATTRIBUTE_GROUPNAME_VALUE_REGEX

    Attribute to replace group's name to matching regex. The default is blank.

  3. Run the following command:

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
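
For reference, a minimal vars.privacera-usersync.okta.yml sketch using the properties described above. The Okta domain and API token are placeholders:

    # Placeholder values; replace with your Okta org details.
    OKTA_CONNECTOR: "OKTA"
    OKTA_ENABLED: "true"
    OKTA_SERVICETYPE: "okta"
    OKTA_SERVICE_URL: "https://{myOktaDomain}.okta.com"
    OKTA_API_TOKEN: "<PLEASE_CHANGE>"
    OKTA_SYNC_INTERVAL: "3600"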
Privacera UserSync REST endpoints

When enabled, Privacera UserSync has REST API endpoints available to allow administrators to push users and groups that already exist in the UserSync cache to Privacera Portal.

Push users
POST - <UserSync_Host>:6086/api/pus/public/cache/load/users

The request body should contain a userList and/or connectorList. If no users and connectors are passed, all users will be pushed to Ranger.

Example request:

curl -X 'POST' \
  '<UserSync_Host>:6086/api/pus/public/cache/load/users' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "userList": ["User1", "User2"],
    "connectorList": ["AAD1","OKTA"]
}'

Parameter

Type

Description

userList

string array

List of users to be added to Privacera Portal.

connectorList

string array

All users associated with provided connector(s) will be pushed.

Responses:
  • 200 OK

  • 404 Not Found: If one or more Users or Connectors are not found, JSON response contains error message.

Push groups
POST - <UserSync_Host>:6086/api/pus/public/cache/load/groups

The request body should contain a groupList and/or connectorList. If no groups and connectors are passed, all groups will be pushed to Ranger.

Example request:

curl -X 'POST' \
  '<UserSync_Host>:6086/api/pus/public/cache/load/groups' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "groupList": ["Group1", "Group2"],
    "connectorList": ["AAD1","OKTA"]
}'

Parameter

Type

Description

groupList

string array

List of groups to be added to Privacera Portal.

connectorList

string array

All groups associated with provided connector(s) will be pushed.

Responses:
  • 200 OK

  • 404 Not Found: If one or more Groups or Connectors are not found, JSON response contains error message.

Migration from Apache Ranger UserSync to Privacera UserSync

Privacera generally recommends using its own version of UserSync (called Privacera UserSync) over the open-source Apache Ranger UserSync. Privacera has rewritten the Ranger UserSync to improve performance and features.

By default, all PrivaceraCloud customers are provisioned to use Privacera Usersync for improved performance capabilities and feature availability over Ranger UserSync. Below are the steps for platform customers to migrate.

All customers must migrate to use Privacera Usersync by March 31, 2024.

Migration steps

For Privacera Platform customers seeking to transition from Apache Ranger UserSync to Privacera UserSync, there are required manual steps to change the configuration.

  1. Navigate to the privacera-manager/config/custom-vars folder.

    cd privacera-manager/config/custom-vars 
  2. Rename the vars.usersync.ldaps.yml file to have a different extension (e.g. vars.usersync.ldaps.yml.bak).

  3. Ensure that the Ranger UserSync POD/Image has stopped.

    ./privacera-manager.sh stop usersync
  4. Copy the following files:

    • ../sample-vars/vars.privacera-usersync.yml

    • ../sample-vars/vars.privacera-usersync.ldap.yml

  5. Edit the vars.privacera-usersync.ldap.yml file with the desired configurations, using the variable mapping below (a worked sketch appears after these steps).

    | Ranger UserSync Variable | Privacera UserSync Variable |
    | --- | --- |
    | USERSYNC_SYNC_LDAP_URL | LDAP_URL |
    | USERSYNC_SYNC_LDAP_BIND_DN | LDAP_BIND_DN |
    | USERSYNC_SYNC_LDAP_BIND_PASSWORD | LDAP_BIND_PASSWORD |
    | USERSYNC_SYNC_LDAP_SEARCH_BASE | LDAP_SEARCH_BASE |
    | USERSYNC_SYNC_LDAP_USER_SEARCH_BASE | LDAP_SEARCH_USER_BASE |
    | USERSYNC_SYNC_LDAP_USER_SEARCH_FILTER | LDAP_SEARCH_USER_FILTER |
    | USERSYNC_SYNC_GROUP_SEARCH_BASE | LDAP_SEARCH_GROUP_BASE |
    | USERSYNC_SYNC_LDAP_GROUP_SEARCH_FILTER | LDAP_SEARCH_GROUP_FILTER |
    | USERSYNC_SYNC_LDAP_OBJECT_CLASS | LDAP_OBJECT_USER_CLASS |
    | USERSYNC_SYNC_GROUP_OBJECT_CLASS | LDAP_OBJECT_GROUP_CLASS |
    | USERSYNC_SYNC_LDAP_SSL_ENABLED | PRIVACERA_USERSYNC_SYNC_LDAP_SSL_ENABLED |
    | USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS | PRIVACERA_USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS |

  6. Run PM update to deploy Privacera-UserSync:

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
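
To make the mapping concrete, here is a sketch of how a few Ranger UserSync values might be carried over into vars.privacera-usersync.ldap.yml. The URL and DN values are placeholders borrowed from the LDAP examples earlier in this document:

    # Old names (vars.usersync.ldaps.yml):
    #   USERSYNC_SYNC_LDAP_URL, USERSYNC_SYNC_LDAP_BIND_DN, USERSYNC_SYNC_LDAP_SEARCH_BASE
    # New names (vars.privacera-usersync.ldap.yml):
    LDAP_URL: "ldap://dir.ldap.us:389"
    LDAP_BIND_DN: "CN=Bind User,OU=example,DC=ad,DC=example,DC=com"
    LDAP_BIND_PASSWORD: "<PLEASE_CHANGE>"
    LDAP_SEARCH_BASE: "OU=example,DC=ad,DC=example,DC=com"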

For more information, see Privacera UserSync.

LDAP/LDAP-S
LDAP / LDAP-S

This topic covers how you can configure the Privacera Platform to attach and import users and groups defined in an external Active Directory (AD), LDAP, or LDAPS (LDAP over SSL) directory as data access users and groups.

Prerequisites

Before starting these steps, prepare the following. You need to configure various Privacera properties with these values, as detailed in Configuration.

Determine the following LDAP values:

  • The FQDN and protocol (http or https) of your LDAP server

  • DN

  • Complete Bind DN

  • Bind DN password

  • Top-level search base

  • User search base

To configure an SSL-enabled LDAP-S server, Privacera requires an SSL certificate. Set the Privacera property USERSYNC_SYNC_LDAP_SSL_ENABLED: "true", and then choose one of these alternatives:

  • Allow Privacera Manager to download and create the certificate based on the LDAP-S server URL. Set the Privacera property USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS: "true".

  • Manually configure a truststore on the Privacera server that contains the certificate of the LDAP-S server. Set the Privacera property USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS: "false".

Configuration
  1. SSH to instance as ${USER}.

  2. Set the following properties. See Access Manager LDAP-related properties and descriptions.

    USERSYNC_SYNC_LDAP_URL: "<PLEASE_CHANGE>"
    USERSYNC_SYNC_LDAP_BIND_DN: "<PLEASE_CHANGE>"
    USERSYNC_SYNC_LDAP_BIND_PASSWORD: "<PLEASE_CHANGE>"
    USERSYNC_SYNC_LDAP_SEARCH_BASE: "<PLEASE_CHANGE>"
    USERSYNC_SYNC_LDAP_USER_SEARCH_BASE: "<PLEASE_CHANGE>"
    USERSYNC_SYNC_LDAP_SSL_ENABLED: "true"
    USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS: "true"
    
  3. Run Privacera Manager update.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration Properties

| Property | Description | Example |
| --- | --- | --- |
| USERSYNC_SYNC_LDAP_URL | URL of the source LDAP server. | "ldap://dir.ldap.us:389" (non-SSL) or "ldaps://dir.ldap.us:636" (SSL) |
| USERSYNC_SYNC_LDAP_BIND_DN | | CN=Bind User,OU=example,DC=ad,DC=example,DC=com |
| USERSYNC_SYNC_LDAP_BIND_PASSWORD | | |
| USERSYNC_SYNC_LDAP_SEARCH_BASE | | OU=example,DC=ad,DC=example,DC=com |
| USERSYNC_SYNC_LDAP_USER_SEARCH_BASE | | |
| USERSYNC_SYNC_LDAP_SSL_ENABLED | Set this to true if SSL is enabled on the LDAP server. | true |
| USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS | Set this to true if you want Privacera Manager to generate the truststore certificate. Set this to false if you want to manually provide the truststore certificate. To learn how to upload SSL certificates, [click here](../pm-ig/upload_custom_cert.md). | true |

Azure Active Directory (AAD)
Azure Active Directory - Data Access User Synchronization

This topic covers how you can synchronize users, groups, and service principals from your existing Azure Active Directory (AAD) domain.

Pre-requisites

Ensure the following pre-requisites are met:

  • Create an Azure AD application.

  • Get the values for the following Azure properties: Application (client) ID, Client secrets

CLI Configuration
  1. SSH to the instance as ${USER}.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.usersync.azuread.yml config/custom-vars/
    vi config/custom-vars/vars.usersync.azuread.yml
    
  3. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    USERSYNC_AZUREAD_TENANT_ID: "<PLEASE_CHANGE>"
    USERSYNC_AZUREAD_CLIENT_ID: "<PLEASE_CHANGE>"
    USERSYNC_AZUREAD_CLIENT_SECRET: "<PLEASE_CHANGE>"
    USERSYNC_AZUREAD_DOMAINS: "<PLEASE_CHANGE>"
    USERSYNC_AZUREAD_GROUPS: "<PLEASE_CHANGE>"
    USERSYNC_ENABLE: "true"
    USERSYNC_SOURCE: "azuread"
    USERSYNC_AZUREAD_USE_GROUP_LOOKUP_FIRST: "true"
    USERSYNC_SYNC_AZUREAD_USERNAME_RETRIVAL_FROM: "userPrincipalName"
    USERSYNC_SYNC_AZUREAD_EMAIL_RETRIVAL_FROM: "userPrincipalName"
    USERSYNC_SYNC_AZUREAD_GROUP_RETRIVAL_FROM: "displayName"
    SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_ENABLED: "false"
    SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_USERNAME_RETRIVAL_FROM: "appId"
    
  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration Properties

Property Name

Description

Example

USERSYNC_AZUREAD_TENANT_ID

To get the value for this property, Go to Azure portal > Azure Active Directory > Properties > Tenant ID

5a5cxxx-xxxx-xxxx-xxxx-c3172b33xxxx

USERSYNC_AZUREAD_CLIENT_ID

Get the value by following the Pre-requisites section above.

8a08xxxx-xxxx-xxxx-xxxx-6c0c95a0xxxx

USERSYNC_AZUREAD_CLIENT_SECRET

Get the value by following the Pre-requisites section above.

${CLIENT_SECRET}

USERSYNC_AZUREAD_DOMAINS

To get the value for this property, Go to Azure portal > Azure Active Directory > Domains

componydomain1.com,componydomain2.com

USERSYNC_AZUREAD_GROUPS

To get the value for this property, Go to Azure portal > Azure Active Directory > Groups

GROUP1,GROUP2",GROUP3

USERSYNC_ENABLE

Set to true to enable usersync.

true

USERSYNC_SOURCE

Source from which users/groups are synced.

Values: unix, ldap, azuread

azuread

USERSYNC_AZUREAD_USE_GROUP_LOOKUP_FIRST

Set to true if you want to first sync all groups and then all the users within those groups.

true

USERSYNC_SYNC_AZUREAD_USERNAME_RETRIVAL_FROM

Azure provides the user info in a JSON format.

Assign a JSON attribute that is unique. This would be the name of the user in Ranger.

userPrincipalName

USERSYNC_SYNC_AZUREAD_EMAIL_RETRIVAL_FROM

Azure provides the user info in a JSON format.

Set the email from the JSON attribute of the Azure user entity.

userPrincipalName

USERSYNC_SYNC_AZUREAD_GROUP_RETRIVAL_FROM

Azure provides the user info in a JSON format.

Use the JSON attribute to retrieve group information for the user.

displayName

SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_ENABLED

Set to true to sync Azure service principals to the Ranger user entity.

false

SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_USERNAME_RETRIVAL_FROM

Azure provides the service principal info in a JSON format.

Assign a JSON attribute that is unique. This would be the name of the user in Ranger.

appId
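For reference, a filled-in vars.usersync.azuread.yml using the example values above might look like the following (illustrative only; replace every value with your own):

    USERSYNC_AZUREAD_TENANT_ID: "5a5cxxx-xxxx-xxxx-xxxx-c3172b33xxxx"
    USERSYNC_AZUREAD_CLIENT_ID: "8a08xxxx-xxxx-xxxx-xxxx-6c0c95a0xxxx"
    USERSYNC_AZUREAD_CLIENT_SECRET: "${CLIENT_SECRET}"
    USERSYNC_AZUREAD_DOMAINS: "companydomain1.com,companydomain2.com"
    USERSYNC_AZUREAD_GROUPS: "GROUP1,GROUP2,GROUP3"
    USERSYNC_ENABLE: "true"
    USERSYNC_SOURCE: "azuread"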

Privacera Plugin
Databricks
Privacera Plugin in Databricks
Databricks

Privacera provides two types of plugin solutions for access control in Databricks clusters. Both plugins are mutually exclusive and cannot be enabled on the same cluster.

Databricks Spark Fine-Grained Access Control (FGAC) Plugin

  • Recommended for SQL, Python, R language notebooks.

  • Provides FGAC on databases with row filtering and column masking features.

  • Uses privacera_hive, privacera_s3, privacera_adls, privacera_files services for resource-based access control, and privacera_tag service for tag-based access control.

  • Uses the plugin implementation from Privacera.

Databricks Spark Object Level Access Control (OLAC) Plugin

The OLAC plugin was introduced to provide an alternative solution for Scala clusters, since using the Scala language on Databricks Spark has some security concerns.

  • Recommended for Scala language notebooks.

  • Provides OLAC on S3 locations which you are trying to access via Spark.

  • Uses privacera_s3 service for resource-based access control and privacera_tag service for tag-based access control.

  • Uses the signed-authorization implementation from Privacera.

Databricks cluster deployment matrix with Privacera plugin

Job/Workflow use-case for automated cluster:

Run-Now creates a new cluster based on the definition in the job description.

Table 1. 

Job Type  

Languages

FGAC/DBX version

OLAC/DBX Version

Notebook

Python/R/SQL

Supported [7.3, 9.1 , 10.4]

JAR

Java/Scala

Not supported

Supported[7.3, 9.1 , 10.4]

spark-submit

Java/Scala/Python

Not supported

Supported[7.3, 9.1 , 10.4]

Python

Python

Supported [7.3, 9.1 , 10.4]

Python wheel

Python

Supported [9.1 , 10.4]

Delta Live Tables pipeline

Not supported

Not supported



Job on existing cluster:

Run-Now uses the existing cluster that is specified in the job description.

Table 2. 

Job Type

Languages

FGAC/DBX version

OLAC

Notebook

Python/R/SQL

supported [7.3, 9.1 , 10.4]

Not supported

JAR

Java/Scala

Not supported

Not supported

spark-submit

Java/Scala/Python

Not supported

Not supported

Python

Python

Not supported

Not supported

Python wheel

Python

supported [9.1 , 10.4]

Not supported

Delta Live Tables pipeline

Not supported

Not supported



Interactive use-case

The interactive use-case is running a SQL/Python notebook on an interactive cluster.

Table 3. 

Cluster Type

Languages

FGAC

OLAC

Standard clusters

Scala/Python/R/SQL

Not supported

Supported [7.3,9.1,10.4]

High Concurrency clusters

Python/R/SQL

Supported [7.3,9.1,10.4]

Supported [7.3,9.1,10.4]

Single Node

Scala/Python/R/SQL

Not supported

Supported [7.3,9.1,10.4]



Databricks Spark Fine-Grained Access Control Plugin [FGAC] [Python, SQL]
Configuration
  1. Run the following commands:

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.databricks.plugin.yml config/custom-vars/
    vi config/custom-vars/vars.databricks.plugin.yml
    
  2. Edit the following properties to allow Privacera Platform to connect to your Databricks host. For property details and description, refer to the Configuration Properties below.

    DATABRICKS_HOST_URL: "<PLEASE_UPDATE>"
    DATABRICKS_TOKEN: "<PLEASE_UPDATE>"
    DATABRICKS_WORKSPACES_LIST:
    - alias: DEFAULT
      databricks_host_url: "{{DATABRICKS_HOST_URL}}"
      token: "{{DATABRICKS_TOKEN}}"
    DATABRICKS_MANAGE_INIT_SCRIPT: "true"
    DATABRICKS_ENABLE: "true"
    

    You can also add custom properties that are not included by default.

  3. Run the following commands:

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
  4. (Optional) By default, policies under the default service name, privacera_hive, are enforced. You can customize a different service name and enforce policies defined in the new name. See Configure Service Name for Databricks Spark Plugin.

Configuration properties

Property Name

Description

Example Values

DATABRICKS_HOST_URL

Enter the URL where the Databricks environment is hosted.

For Azure Databricks:

DATABRICKS_HOST_URL: "https://xdx-66506xxxxxxxx.2.azuredatabricks.net/?o=665066931xxxxxxx"

For AWS Databricks

DATABRICKS_HOST_URL: "https://xxx-7xxxfaxx-xxxx.cloud.databricks.com"

DATABRICKS_TOKEN

Enter the token.

To generate the token,

1. Log in to your Databricks account.

2. Click the user profile icon in the upper right corner of your Databricks workspace.

3. Click User Settings.

4. Click the Generate New Token button.

5. Optionally enter a description (comment) and expiration period.

6. Click the Generate button.

7. Copy the generated token.

DATABRICKS_TOKEN: "xapid40xxxf65xxxxxxe1470eayyyyycdc06"

DATABRICKS_WORKSPACES_LIST

Add multiple Databricks workspaces to connect to Ranger.

  1. To add a single workspace, add the following default JSON in the text area to define the host URL and token of the Databricks workspace. The text area should not be left empty and should at least contain the default JSON.

    Note

    Do not edit any of the values in the default JSON.

    [{"alias":"DEFAULT",
    "databricks_host_url":"{{DATABRICKS_HOST_URL}}",
    "token":"{{DATABRICKS_TOKEN}}"}]
    
  2. To add two workspaces, use the following JSON.

    Note

    {{var}} is an Ansible variable. Such a variable re-uses the value of a predefined variable. Hence, do not edit the databricks_host_url and token properties of the alias: DEFAULT entry, as they are set by DATABRICKS_HOST_URL and DATABRICKS_TOKEN respectively.

    [{"alias":"DEFAULT",
    "databricks_host_url":"{{DATABRICKS_HOST_URL}}",
    "token":"{{DATABRICKS_TOKEN}}"},
    {"alias":"<workspace-2-alias>","databricks_host_url":"<workspace-2-url>",
    "token":"<dbx-token-for-workspace-2>"}]
    

DATABRICKS_ENABLE

If set to 'true', Privacera Manager will create the Databricks cluster init script ranger_enable.sh at:

~/privacera/privacera-manager/output/databricks/ranger_enable.sh

"true"

"false"

DATABRICKS_MANAGE_INIT_SCRIPT

If set to 'true', Privacera Manager will upload the init script (ranger_enable.sh) to the identified Databricks host.

If set to 'false', upload the following two files to the DBFS location. The files are located at ~/privacera/privacera-manager/output/databricks:

  • privacera_spark_plugin_job.conf

  • privacera_spark_plugin.conf

"true"

"false"

DATABRICKS_SPARK_PLUGIN_AGENT_JAR

Use the Java agent to assign a string of extra JVM options to pass to the Spark driver.

-javaagent:/databricks/jars/privacera-agent.jar

DATABRICKS_SPARK_PRIVACERA_CUSTOM_CURRENT_USER_UDF_NAME

Property to map logged-in user to Ranger user for row-filter policy.

It is mapped with the Databricks cluster-level property spark.hadoop.privacera.custom.current_user.udf.names. See Spark Properties. Check if this property is set in your Databricks cluster. If it is being used, then set its value to match the PM property. If the value of the PM property and the Databricks cluster-level property differ, then it can cause unexpected behavior.

current_user()

DATABRICKS_SPARK_PRIVACERA_VIEW_LEVEL_MASKING_ROWFILTER_EXTENSION_ENABLE

Property to enable masking, row-filter, and data_admin access on views. This property is a Privacera Manager (PM) property.

It is mapped with the Databricks cluster-level property spark.hadoop.privacera.spark.view.levelmaskingrowfilter.extension.enable. See Spark Properties. Check if this property is set in your Databricks cluster. If it is being used, then set its value to match the PM property. If the value of the PM property and the Databricks cluster-level property differ, then it can cause unexpected behavior.

false

DATABRICKS_SQL_CLUSTER_POLICY_SPARK_CONF

Configure Databricks Cluster policy.

Add the following JSON in the text area:

[{"Note":"First spark conf","key":"spark.hadoop.first.spark.test","value":"test1"},{"Note":"Second spark conf","key":"spark.hadoop.first.spark.test","value":"test2"}]

DATABRICKS_POST_PLUGIN_COMMAND_LIST

This property is not part of the default YAML file, but can be added if required.

Use this property, if you want to run a specific set of commands in the Databricks init script.

The following example will be added to the cluster init script to allow Athena JDBC via data access server.

DATABRICKS_POST_PLUGIN_COMMAND_LIST:

- sudo iptables -I OUTPUT 1 -p tcp -m tcp --dport 8181 -j ACCEPT

- sudo curl -k -u user:password {{PORTAL_URL}}/api/dataserver/cert?type=dataserver_jks -o /etc/ssl/certs/dataserver.jks

- sudo chmod 755 /etc/ssl/certs/dataserver.jks

DATABRICKS_SPARK_PYSPARK_ENABLE_PY4J_SECURITY

This property allows you to blacklist APIs to enable security. This property is a Privacera Manager (PM) property.

It is mapped with the Databricks cluster-level property spark.databricks.pyspark.enablePy4JSecurity. See Spark Properties. Check if this property is set in your Databricks cluster. If it is being used, then set its value to match the PM property. If the value of the PM property and the Databricks cluster-level property differ, then it can cause unexpected behavior.

true

false
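For reference, a minimal filled-in vars.databricks.plugin.yml combining the properties from the configuration step with the example values above might look like the following (illustrative only; substitute your own workspace URL and token):

    DATABRICKS_HOST_URL: "https://xxx-7xxxfaxx-xxxx.cloud.databricks.com"
    DATABRICKS_TOKEN: "xapid40xxxf65xxxxxxe1470eayyyyycdc06"
    DATABRICKS_WORKSPACES_LIST:
    - alias: DEFAULT
      databricks_host_url: "{{DATABRICKS_HOST_URL}}"
      token: "{{DATABRICKS_TOKEN}}"
    DATABRICKS_MANAGE_INIT_SCRIPT: "true"
    DATABRICKS_ENABLE: "true"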

Managing init script

Automatic upload

If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is 'true', then the Init script will be uploaded automatically to your Databricks host. The init script will be uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable.sh where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.

Manual upload

If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is 'false', then the Init script must be uploaded to your Databricks host.

To avoid the manual steps below, you should set DATABRICKS_MANAGE_INIT_SCRIPT=true and follow the instructions outlined in Automatic Upload.

  1. Open a terminal and connect to Databricks account using your Databricks login credentials/token.

    Connect using login credentials:

    1. If you're using login credentials, then run the following command:

      databricks configure --profile privacera
    2. Enter the Databricks URL:

      Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
    3. Enter the username and password:
      Username: email-id@example.com
      Password:

    Connect using Databricks token:

    1. If you don't have a Databricks token, you can generate one. For more information, refer to Generate a personal access token.

    2. If you're using token, then run the following command:

      databricks configure --token --profile privacera
    3. Enter the Databricks URL:

      Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
    4. Enter the token:

      Token:
  2. To check if the connection to your Databricks account is established, run the following command:

    dbfs ls dbfs:/ --profile privacera

    You should see the list of files in the output, if you are connected to your account.

  3. Upload files manually to Databricks:

    1. Copy the following files to DBFS, which are available in the PM host at the location, ~/privacera/privacera-manager/output/databricks:

      • ranger_enable.sh

      • privacera_spark_plugin.conf

      • privacera_spark_plugin_job.conf

      • privacera_custom_conf.zip

    2. Run the following command. For the value of <DEPLOYMENT_ENV_NAME>, you can get it from the file, ~/privacera/privacera-manager/config/vars.privacera.yml.

      export DEPLOYMENT_ENV_NAME=<DEPLOYMENT_ENV_NAME>
      dbfs mkdirs dbfs:/privacera/${DEPLOYMENT_ENV_NAME} --profile privacera
      dbfs cp ranger_enable.sh dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
      dbfs cp privacera_spark_plugin.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
      dbfs cp privacera_spark_plugin_job.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
      dbfs cp privacera_custom_conf.zip dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
    3. Verify the files have been uploaded.

      dbfs ls dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera

      The Init Script will be uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable.sh, where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.

Configure Databricks Cluster
  1. Once the update completes successfully, log on to the Databricks console with your account and open the target cluster, or create a new target cluster.

  2. Open the Cluster dialog and enter Edit mode.

  3. In the Configuration tab, select Advanced Options > Spark.

  4. Add the following content to the Spark Config edit box. For more information on the Spark config properties, click here.

    New Properties

    Note

    • From Privacera 5.0.6.1 Release onwards, it is recommended to replace the Old Properties with the New Properties. However, the Old Properties will also continue to work.

    • For Databricks versions < 7.3, only the Old Properties should be used since those versions are in extended support.

    spark.databricks.cluster.profile serverless
    spark.databricks.isv.product privacera
    spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
    spark.databricks.repl.allowedLanguages sql,python,r
    

    Old Properties

    spark.databricks.cluster.profile serverless
    spark.databricks.repl.allowedLanguages sql,python,r
    spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
    spark.databricks.isv.product privacera
    spark.databricks.pyspark.enableProcessIsolation true
  5. In the Configuration tab, in Edit mode, open Advanced Options (at the bottom of the dialog) and then set the init script path. For the <DEPLOYMENT_ENV_NAME> variable, enter the deployment name as defined for the DEPLOYMENT_ENV_NAME variable in vars.privacera.yml.

    dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable.sh
    
  6. In the Table Access Control section, uncheck Enable table access control and only allow Python and SQL commands and Enable credential passthrough for user-level data access and only allow Python and SQL commands checkboxes.

  7. Save (Confirm) this configuration.

  8. Start (or Restart) the selected Databricks Cluster.

Validation

To help evaluate the use of Privacera with Databricks, Privacera provides a set of Privacera Manager 'demo' notebooks. These can be downloaded from the Privacera S3 repository using either your browser or the command-line 'wget'. Use the notebook/sql sequence that matches your cluster.

  1. Download using your browser (click the correct file for your cluster below):

    https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPlugin.sql

    If AWS S3 is configured from your Databricks cluster: https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginS3.sql

    If ADLS Gen2 is configured from your Databricks cluster: https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginADLS.sql

    or, if you are working from a Linux command line, use the 'wget' command to download.

    wget https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPlugin.sql -O PrivaceraSparkPlugin.sql

    wget https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginS3.sql -O PrivaceraSparkPluginS3.sql

    wget https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginADLS.sql -O PrivaceraSparkPluginADLS.sql

  2. Import the Databricks notebook:

    1. Log in to the Databricks Console

    2. Select Workspace > Users > Your User.

    3. From the drop down menu, select Import and choose the file downloaded.

  3. Follow the suggested steps in the text of the notebook to exercise and validate Privacera with Databricks.

Databricks Spark Object-level Access Control Plugin [OLAC] [Scala]
Prerequisites

Ensure the following prerequisites are met:

Configuration
  1. Run the following commands.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.databricks.scala.yml config/custom-vars/
    vi config/custom-vars/vars.databricks.scala.yml
    
  2. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    DATASERVER_DATABRICKS_ALLOWED_URLS: "<PLEASE_UPDATE>"
    DATASERVER_AWS_STS_ROLE: "<PLEASE_CHANGE>"
    
  3. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

Property

Description

Example

DATABRICKS_SCALA_ENABLE

Set the property to enable/disable Databricks Scala. This is found under the Databricks Signed URL Configuration For Scala Clusters section.

DATASERVER_DATABRICKS_ALLOWED_URLS

Add a URL or comma-separated URLs.

Privacera Dataserver serves only those URLs mentioned in this property.

https://xxx-7xxxfaxx-xxxx.cloud.databricks.com

DATASERVER_AWS_STS_ROLE

Add the instance profile ARN of the AWS role, which can access Delta Files in Databricks.

arn:aws:iam::111111111111:role/assume-role

DATABRICKS_SCALA_CLUSTER_POLICY_SPARK_CONF

Configure Databricks Cluster policy.

Add the following JSON in the text area:

[{"Note":"First spark conf",
"key":"spark.hadoop.first.spark.test",
"value":"test1"},
{"Note":"Second spark conf",
"key":"spark.hadoop.first.spark.test",
"value":"test2"}]
Managing init script

Automatic Upload

If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is "true", the Init script will be uploaded automatically to your Databricks host. The Init Script will be uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable_scala.sh, where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.

Manual Upload

If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is "false" the Init script must be uploaded to your Databricks host.

  1. Open a terminal and connect to Databricks account using your Databricks login credentials/token.

    • Connect using login credentials:

      1. If you're using login credentials, then run the following command.

        databricks configure --profile privacera
        
      2. Enter the Databricks URL.

        Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
        
      3. Enter the username and password.

        Username: email-id@yourdomain.com
        Password:
        
    • Connect using Databricks token:

      1. If you don't have a Databricks token, you can generate one. For more information, refer to Generate a personal access token.

      2. If you're using token, then run the following command.

        databricks configure --token --profile privacera
        
      3. Enter the Databricks URL.

        Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
        
      4. Enter the token.

        Token:
        
  2. To check if the connection to your Databricks account is established, run the following command.

    dbfs ls dbfs:/ --profile privacera
    

    You should see the list of files in the output, if you are connected to your account.

  3. Upload files manually to Databricks.

    1. Copy the following files to DBFS, which are available in the PM host at the location, ~/privacera/privacera-manager/output/databricks:

      • ranger_enable_scala.sh

      • privacera_spark_scala_plugin.conf

      • privacera_spark_scala_plugin_job.conf

    2. Run the following command. For the value of <DEPLOYMENT_ENV_NAME>, you can get it from the file, ~/privacera/privacera-manager/config/vars.privacera.yml.

      export DEPLOYMENT_ENV_NAME=<DEPLOYMENT_ENV_NAME>
      dbfs mkdirs dbfs:/privacera/${DEPLOYMENT_ENV_NAME} --profile privacera
      dbfs cp ranger_enable_scala.sh dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
      dbfs cp privacera_spark_scala_plugin.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
      dbfs cp privacera_spark_scala_plugin_job.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
      
    3. Verify the files have been uploaded.

      dbfs ls dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
      

      The Init Script is uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable_scala.sh, where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.

Configure Databricks cluster
  1. Once the update completes successfully, log on to the Databricks console with your account and open the target cluster, or create a new target cluster.

  2. Open the Cluster dialog and enter Edit mode.

  3. In the Configuration tab, in Edit mode, open Advanced Options (at the bottom of the dialog) and then select the Spark tab.

  4. Add the following content to the Spark Config edit box. For more information on the Spark config properties, click here.

    New Properties

    spark.databricks.isv.product privacera
    spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
    spark.executor.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
    spark.databricks.repl.allowedLanguages sql,python,r,scala
    spark.databricks.delta.formatCheck.enabled false
    

    Old Properties

    spark.databricks.cluster.profile serverless
    spark.databricks.delta.formatCheck.enabled false
    spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
    spark.executor.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
    spark.databricks.isv.product privacera
    spark.databricks.repl.allowedLanguages sql,python,r,scala
    

    Note

    • From Privacera 5.0.6.1 Release onwards, it is recommended to replace the Old Properties with the New Properties. However, the Old Properties will also continue to work.

    • For Databricks versions < 7.3, only the Old Properties should be used since those versions are in extended support.

  5. (Optional) To use regional endpoint for S3 access, add the following content to the Spark Config edit box.

    spark.hadoop.fs.s3a.endpoint https://s3.<region>.amazonaws.com
    spark.hadoop.fs.s3.endpoint https://s3.<region>.amazonaws.com
    spark.hadoop.fs.s3n.endpoint https://s3.<region>.amazonaws.com
    
  6. In the Configuration tab, in Edit mode, open Advanced Options (at the bottom of the dialog) and then set the init script path. For the <DEPLOYMENT_ENV_NAME> variable, enter the deployment name as defined for the DEPLOYMENT_ENV_NAME variable in vars.privacera.yml.

    dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable_scala.sh
    
  7. Save (Confirm) this configuration.

  8. Start (or Restart) the selected Databricks Cluster.


Spark standalone
Privacera plugin in Spark standalone

This section covers how you can use Privacera Manager to generate the setup script and Spark custom configuration for SSL/TLS to install the Privacera plugin in an open-source Spark environment.

The steps outlined below are only applicable to the Spark 3.x version.

Prerequisites

Ensure the following prerequisites are met:

  • A working Spark environment.

  • Privacera services must be up and running.

Configuration
  1. SSH to the instance as USER.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.spark-standalone.yml config/custom-vars/
    vi config/custom-vars/vars.spark-standalone.yml
  3. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    SPARK_STANDALONE_ENABLE: "true"
    SPARK_ENV_TYPE: "<PLEASE_CHANGE>"
    SPARK_HOME: "<PLEASE_CHANGE>"
    SPARK_USER_HOME: "<PLEASE_CHANGE>"
    
  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    

    After the update is complete, the setup scripts (privacera_setup.sh, standalone_spark_FGAC.sh, standalone_spark_OLAC.sh) and the Spark custom configuration (spark_custom_conf.zip) for SSL will be generated at ~/privacera/privacera-manager/output/spark-standalone.

  5. You can enable either FGAC or OLAC in your Spark environment.

    Enable FGAC

    To enable Fine-grained access control (FGAC), do the following:

    1. Copy standalone_spark_FGAC.sh and spark_custom_conf.zip. Both the files should be placed under the same folder.

    2. Add permissions to execute the script.

      chmod +x standalone_spark_FGAC.sh
      
    3. Run the script to install the Privacera plugin in your Spark environment.

      ./standalone_spark_FGAC.sh

    Enable OLAC

    To enable Object level access control (OLAC), do the following:

    1. Copy standalone_spark_OLAC.sh and spark_custom_conf.zip. Both the files should be placed under the same folder.

    2. Add permissions to execute the script.

      chmod +x standalone_spark_OLAC.sh
      
    3. Run the script to install the Privacera plugin in your Spark environment.

      ./standalone_spark_OLAC.sh
      
Configuration properties

Property

Description

Example

SPARK_STANDALONE_ENABLE

Property to enable generating setup script and configs for Spark standalone plugin installation.

true

SPARK_ENV_TYPE

Set the environment type. It can be any user-defined type.

For example, if you're working in an environment that runs locally, you can set the type as local; for a production environment, set it as prod.

local

SPARK_HOME

Home path of your Spark installation.

~/privacera/spark/spark-3.1.1-bin-hadoop3.2

SPARK_USER_HOME

User home directory of your Spark installation.

/home/ec2-user

SPARK_STANDALONE_RANGER_IS_FALLBACK_SUPPORTED

Use the property to enable/disable the fallback behavior to the privacera_files and privacera_hive services. It confirms whether the resources files should be allowed/denied access to the user.

To enable the fallback, set to true; to disable, set to false.

true
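For reference, a filled-in vars.spark-standalone.yml for a local environment might look like the following (illustrative only, based on the example values above):

    SPARK_STANDALONE_ENABLE: "true"
    SPARK_ENV_TYPE: "local"
    SPARK_HOME: "~/privacera/spark/spark-3.1.1-bin-hadoop3.2"
    SPARK_USER_HOME: "/home/ec2-user"
    SPARK_STANDALONE_RANGER_IS_FALLBACK_SUPPORTED: "true"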

Validations

To verify the successful installation of Privacera plugin, do the following:

  1. Create an S3 bucket ${S3_BUCKET} for sample testing.

  2. Download sample data using the following link and put it in the ${S3_BUCKET} at location (s3://${S3_BUCKET}/customer_data).

    wget https://privacera-demo.s3.amazonaws.com/data/uploads/customer_data_clear/customer_data_without_header.csv
    
  3. (Optional) Add AWS JARS in Spark. Download the JARS according to the version of Spark Hadoop in your environment.

    cd  <SPARK_HOME>/jars
    

    For Spark-3.1.1 - Hadoop 3.2 version,

    wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.2.0/hadoop-aws-3.2.0.jar
    wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.375/aws-java-sdk-bundle-1.11.375.jar
    
  4. Run the following command.

    cd <SPARK_HOME>/bin
    
  5. Run the spark-shell to execute scala commands.

    ./spark-shell
    
Validations with JWT Token
  1. Run the following command.

    cd <SPARK_HOME>/bin
    
  2. Set the JWT_TOKEN.

    JWT_TOKEN="<JWT_TOKEN>"
  3. Run the following command to start spark-shell with parameters.

    ./spark-shell --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}"  --conf "spark.hadoop.privacera.jwt.oauth.enable=true"
Validations with JWT token and public key
  1. Create a local file with the public key, if the JWT token is generated by private/public key combination.

  2. Set the following according to the payload of JWT Token.

    JWT_TOKEN="<JWT_TOKEN>"
    # The following variables are optional; set them only if the token contains them, otherwise leave them empty
    JWT_TOKEN_ISSUER="<JWT_TOKEN_ISSUER>"
    JWT_TOKEN_PUBLIC_KEY_FILE="<JWT_TOKEN_PUBLIC_KEY_FILE_PATH>"
    JWT_TOKEN_USER_KEY="<JWT_TOKEN_USER_KEY>"
    JWT_TOKEN_GROUP_KEY="<JWT_TOKEN_GROUP_KEY>"
    JWT_TOKEN_PARSER_TYPE="<JWT_TOKEN_PARSER_TYPE>"
  3. Run the following command to start spark-shell with parameters.

    ./spark-shell \
    --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" \
    --conf "spark.hadoop.privacera.jwt.oauth.enable=true" \
    --conf "spark.hadoop.privacera.jwt.token.publickey=${JWT_TOKEN_PUBLIC_KEY_FILE}" \
    --conf "spark.hadoop.privacera.jwt.token.issuer=${JWT_TOKEN_ISSUER}" \
    --conf "spark.hadoop.privacera.jwt.token.parser.type=${JWT_TOKEN_PARSER_TYPE}" \
    --conf "spark.hadoop.privacera.jwt.token.userKey=${JWT_TOKEN_USER_KEY}" \
    --conf "spark.hadoop.privacera.jwt.token.groupKey=${JWT_TOKEN_GROUP_KEY}"
Use cases
  1. Add a policy in Access Manager with read permission to ${S3_BUCKET}.

    val file_path = "s3a://${S3_BUCKET}/customer_data/customer_data_without_header.csv"
    val df=spark.read.csv(file_path)
    df.show(5)
    
  2. Add a policy in Access Manager with delete and write permission to ${S3_BUCKET}.

    df.write.format("csv").mode("overwrite").save("s3a://${S3_BUCKET}/csv/customer_data.csv")
    
Spark on EKS
Privacera plugin in Spark on EKS

This section covers how you can use Privacera Manager to generate the setup script and Spark custom configuration for SSL to install the Privacera plugin in Spark on an EKS cluster.

Prerequisites

Ensure the following prerequisites are met:

Configuration
  1. SSH to the instance as USER.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.spark-standalone.yml config/custom-vars/
    vi config/custom-vars/vars.spark-standalone.yml
    
  3. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    SPARK_STANDALONE_ENABLE: "true"
    SPARK_ENV_TYPE: "<PLEASE_CHANGE>"
    SPARK_HOME: "<PLEASE_CHANGE>"
    SPARK_USER_HOME: "<PLEASE_CHANGE>"
  4. Run the following commands:

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    

    After the update is complete, the Spark custom configuration (spark_custom_conf.zip) for SSL will be generated at ~/privacera/privacera-manager/output/spark-standalone.

  5. Create the Spark Docker Image

    1. Run the following commands to export PRIVACERA_BASE_DOWNLOAD_URL:

      export PRIVACERA_BASE_DOWNLOAD_URL=<PRIVACERA_BASE_DOWNLOAD_URL>
      
    2. Create a folder.

      mkdir -p ~/privacera-spark-plugin
      cd ~/privacera-spark-plugin
      
    3. Download and extract package using wget.

      wget ${PRIVACERA_BASE_DOWNLOAD_URL}/spark-plugin/k8s-spark-pkg.tar.gz -O k8s-spark-pkg.tar.gz
      tar xzf k8s-spark-pkg.tar.gz
      rm -r k8s-spark-pkg.tar.gz
      
    4. Copy spark_custom_conf.zip file from the Privacera Manager output folder into the files folder.

      cp ~/privacera/privacera-manager/output/spark-standalone/spark_custom_conf.zip files/spark_custom_conf.zip
      
    5. You can build either the OLAC Docker image or the FGAC Docker image.

      OLAC

      To build the OLAC Docker image, use the following command:

      ./build_image.sh ${PRIVACERA_BASE_DOWNLOAD_URL} OLAC
      

      FGAC

      To build the FGAC Docker image, use the following command:

      ./build_image.sh ${PRIVACERA_BASE_DOWNLOAD_URL} FGAC
      
  6. Test the Spark Docker image.

    1. Create an S3 bucket ${S3_BUCKET} for sample testing.

    2. Download sample data using the following link and put it in the ${S3_BUCKET} at location (s3://${S3_BUCKET}/customer_data).

      wget https://privacera-demo.s3.amazonaws.com/data/uploads/customer_data_clear/customer_data_without_header.csv
      
    3. Start Docker in an interactive mode.

      IMAGE=privacera-spark-plugin:latest
      docker run  --rm -i -t ${IMAGE} bash
      
    4. Start spark-shell inside the Docker container.

      JWT_TOKEN="<PLEASE_CHANGE>"
      cd /opt/privacera/spark/bin
      ./spark-shell \
      --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}"\
      --conf "spark.hadoop.privacera.jwt.oauth.enable=true"
    5. Run the following command to read the S3 file:

      val df= spark.read.csv("s3a://${S3_BUCKET}/customer_data/customer_data_without_header.csv")
    6. Exit the Docker shell.

      exit
  7. Publish the Spark Docker Image into your Docker Registry.

    • For HUB, HUB_USERNAME, and HUB_PASSWORD, use the Docker hub URL and login credentials.

    • For ENV_TAG, its value can be user-defined depending on your deployment environment such as development, production or test. For example, ENV_TAG=dev can be used for a development environment.

    HUB=<PLEASE_CHANGE>
    HUB_USERNAME=<PLEASE_CHANGE>
    HUB_PASSWORD=<PLEASE_CHANGE>
    ENV_TAG=<PLEASE_CHANGE>
    DEST_IMAGE=${HUB}/privacera-spark-plugin:${ENV_TAG}
    SOURCE_IMAGE=privacera-spark-plugin:latest
    docker login -u ${HUB_USERNAME} -p ${HUB_PASSWORD} ${HUB}
    docker tag ${SOURCE_IMAGE} ${DEST_IMAGE}
    docker push ${DEST_IMAGE}
  8. Deploy Spark Plugin on EKS cluster.

    1. SSH to EKS cluster where you want to deploy Spark on EKS cluster.

    2. Run the following commands to export PRIVACERA_BASE_DOWNLOAD_URL:

      export PRIVACERA_BASE_DOWNLOAD_URL=<PRIVACERA_BASE_DOWNLOAD_URL>
      
    3. Create a folder.

      mkdir ~/privacera-spark-plugin
      cd ~/privacera-spark-plugin
      
    4. Download and extract package using wget.

      wget ${PRIVACERA_BASE_DOWNLOAD_URL}/plugin/spark/k8s-spark-deploy.tar.gz -O k8s-spark-deploy.tar.gz
      tar xzf k8s-spark-deploy.tar.gz
      rm -r k8s-spark-deploy.tar.gz
      cd k8s-spark-deploy/
      
    5. Open the penv.sh file and substitute the values of the following properties; refer to the table below:

      Property

      Description

      Example

      SPARK_NAME_SPACE

      Kubernetes namespace

      privacera-spark-plugin-test

      SPARK_PLUGIN_ROLE_BINDING

      Spark role Binding

      privacera-sa-spark-plugin-role-binding

      SPARK_PLUGIN_SERVICE_ACCOUNT

      Spark services account

      privacera-sa-spark-plugin

      SPARK_PLUGN_ROLE

      Spark services account role

      privacera-sa-spark-plugin-role

      SPARK_PLUGIN_APP_NAME

      Spark plugin application name

      privacera-sa-spark-plugin-role

      SPARK_PLUGIN_IMAGE

      Docker image with hub

      myhub.docker.com/privacera-spark-plugin:prod-olac

      SPARK_DOCKER_PULL_SECRET

      Secret for docker-registry

      spark-plugin-docker-hub

    6. Run the following command to replace the properties value in the Kubernetes deployment .yml file:

      mkdir -p backup
      cp *.yml backup/
      ./replace.sh
      
    7. Run the following command to create Kubernetes resources:

      kubectl apply -f namespace.yml
      kubectl apply -f service-account.yml
      kubectl apply -f role.yml
      kubectl apply -f role-binding.yml
      
    8. Run the following command to create secret for docker-registry:

      kubectl create secret docker-registry spark-plugin-docker-hub --docker-server=<PLEASE_CHANGE> --docker-username=<PLEASE_CHANGE>  --docker-password='<PLEASE_CHANGE>' --namespace=<PLEASE_CHANGE>
      
    9. Run the following command to deploy a sample Spark application:

      Note

      This is a sample file used for deployment. Based on your use case, you can create your own Spark deployment file and deploy the Docker image.

      kubectl apply -f privacera-spark-examples.yml -n ${SPARK_NAME_SPACE}

      This deploys the Spark application in a Kubernetes pod with the Privacera plugin and keeps the pod running, so that you can use it in interactive mode.

Configuration properties

Property

Description

Example

SPARK_STANDALONE_ENABLE

Property to enable generating setup script and configs for Spark standalone plugin installation.

true

SPARK_ENV_TYPE

Set the environment type. It can be any user-defined type.

For example, if you're working in an environment that runs locally, you can set the type as local; for a production environment, set it as prod.

local

SPARK_HOME

Home path of your Spark installation.

~/privacera/spark/spark-3.1.1-bin-hadoop3.2

SPARK_USER_HOME

User home directory of your Spark installation.

/home/ec2-user

SPARK_STANDALONE_RANGER_IS_FALLBACK_SUPPORTED

Use the property to enable/disable the fallback behavior to the privacera_files and privacera_hive services. It confirms whether the resources files should be allowed/denied access to the user.

To enable the fallback, set to true; to disable, set to false.

true

Validation
  1. Get all the resources.

    kubectl get all -n ${SPARK_NAME_SPACE}

    Copy the pod ID; you will need it for the spark-master connection.

  2. Get the cluster info.

    kubectl cluster-info
    

    Copy the Kubernetes control plane URL from the above output; it is needed for the spark-shell command, for example, https://xxxxxxxxxxxxxxxxxxxxxxx.yl4.us-east-1.eks.amazonaws.com.

    When using the URL for EKS_SERVER property in step 4, prefix the property value with k8s://. The following is an example of the property:

    EKS_SERVER="k8s://https://xxxxxxxxxxxxxxxxxxxxxxx.yl4.us-east-1.eks.amazonaws.com"
  3. Connect to Kubernetes master node.

    kubectl -n ${SPARK_NAME_SPACE} exec -it <POD_ID> -- bash
    
  4. Set the following properties:

    SPARK_NAME_SPACE="<PLEASE_CHANGE>"
    SPARK_PLUGIN_SERVICE_ACCOUNT="<PLEASE_CHANGE>"
    SPARK_PLUGIN_IMAGE="<PLEASE_CHANGE>"
    SPARK_DOCKER_PULL_SECRET="spark-plugin-docker-hub"
    EKS_SERVER="<PLEASE_CHANGE>"
    JWT_TOKEN="<PLEASE_CHANGE>"
  5. Run the following commands to open spark-shell. The command contains all the setup which is required to open the spark-shell.

    cd /opt/privacera/spark/bin
    ./spark-shell --master ${EKS_SERVER}\
    --deploy-mode client \
    --conf spark.kubernetes.authenticate.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT}\
    --conf spark.kubernetes.namespace=${SPARK_NAME_SPACE}\
    --conf spark.kubernetes.authenticate.submission.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
    --conf spark.kubernetes.authenticate.submission.oauthTokenFile=/var/run/secrets/kubernetes.io/serviceaccount/token \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT}\
    --conf spark.kubernetes.container.image=${SPARK_PLUGIN_IMAGE}\
    --conf spark.kubernetes.container.image.pullPolicy=Always \
    --conf spark.kubernetes.container.image.pullSecrets=${SPARK_DOCKER_PULL_SECRET}\
    --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}"\
    --conf "spark.hadoop.privacera.jwt.oauth.enable=true"\
    --conf spark.driver.bindAddress='0.0.0.0'\
    --conf spark.driver.host=$SPARK_PLUGIN_POD_IP\
    --conf spark.port.maxRetries=4\
    --conf spark.kubernetes.driver.pod.name=$SPARK_PLUGIN_POD_NAME
  6. Run the following command using spark-submit with JWT authentication.

    ./spark-submit \
    --master ${EKS_SERVER}\
    --name spark-cloud-new \
    --deploy-mode cluster \
    --conf spark.kubernetes.authenticate.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT}\
    --conf spark.kubernetes.namespace=${SPARK_NAME_SPACE}\
    --conf spark.kubernetes.authenticate.submission.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
    --conf spark.kubernetes.authenticate.submission.oauthTokenFile=/var/run/secrets/kubernetes.io/serviceaccount/token \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT}\
    --conf spark.kubernetes.container.image=${SPARK_PLUGIN_IMAGE}\
    --conf spark.kubernetes.container.image.pullPolicy=Always \
    --conf spark.kubernetes.container.image.pullSecrets=${SPARK_DOCKER_PULL_SECRET}\
    --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}"\
    --conf spark.driver.bindAddress='0.0.0.0'\
    --conf spark.driver.host=$SPARK_PLUGIN_POD_IP\
    --conf spark.port.maxRetries=4\
    --conf spark.kubernetes.driver.pod.name=$SPARK_PLUGIN_POD_NAME\
    --class com.privacera.spark.poc.SparkSample \
    <your-code-jar/file>
    
  7. To check the read access on the S3 file, run the following command in the open spark-shell:

    val df= spark.read.csv("s3a://${S3_BUCKET}/customer_data/customer_data_without_header.csv")
    df.show()
  8. To check the write access on the S3 file, run the following command in the open spark-shell:

    df.write.format("csv").mode("overwrite").save("s3a://${S3_BUCKET}/output/k8s/sample/csv")
  9. Check the Audit logs on the Privacera Portal.

  10. To verify the spark-shell setup, open another SSH connection for Kubernetes cluster and run the following command to check the running pods:

    kubectl get pods -n ${SPARK_NAME_SPACE}

    You will see the Spark executor pods with the -exec-x suffix, for example, spark-shell-xxxxxxxxxxxxxxxx-exec-1 and spark-shell-xxxxxxxxxxxxxxxx-exec-2.

Portal SSO with PingFederate

Privacera Portal leverages PingIdentity's platform for authentication via SAML. For this integration, there are configuration steps in both Privacera Portal and PingIdentity.

Configuration steps for PingIdentity
  1. Sign in to your PingIdentity account.

  2. Under Your Environments, click Administrators.

  3. Select Connections from the left menu.

  4. In the Applications section, click on the + button to add a new application.

  5. Enter an Application Name (such as Privacera Portal SAML) and provide a description (optionally add an icon). For the Application Type, select SAML Application. Then click Configure.

  6. On the SAML Configuration page, under "Provide Application Metadata", select Manually Enter.

  7. Enter the ACS URLs:

    https://<portal_hostname>:<PORT>/saml/SSO

    Enter the Entity ID:

    privacera-portal

    Click the Save button.

  8. On the Overview page for the new application, click on the Attributes edit button. Add the attribute mapping:

    user.login: Username

    Set as Required.

    Note

    If the user's login ID is not the same as the username, for example if the login ID is an email address, this attribute is treated as the username in the portal. The username value is the email address with the domain name (for example, @gmail.com) removed; for example, for "john.joe@company.com" the username would be "john.joe". If there is another attribute that can be used as the username, this value will hold that attribute.

  9. You can optionally add additional attribute mappings:

    user.email: Email Address 
    user.firstName: Given Name
    user.lastName: Family Name
  10. Click the Save button.

  11. Next in your application, select Configuration and then the edit icon.

  12. Set the SLO Endpoint:

    https://<portal_hostname>:<PORT>/login.html

    Click the Save button.

  13. In the Configuration section, under Connection Details, click on Download Metadata button.

  14. Once this file is downloaded, rename it to:

    privacera-portal-aad-saml.xml

    This file will be used in the Privacera Portal configuration.

Configuration steps in Privacera Portal

Now we will configure Privacera Portal using privacera-manager to use the privacera-portal-aad-saml.xml file created in the above steps.

  1. Run the following commands:

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
  2. Edit the vars.portal.saml.aad.yml file:

    vi config/custom-vars/vars.portal.saml.aad.yml

    Add the following properties:

    SAML_ENTITY_ID: "privacera-portal"
    SAML_BASE_URL: "https://{{app_hostname}}:{port}"
    PORTAL_UI_SSO_ENABLE: "true"
    PORTAL_UI_SSO_URL: "saml/login"
    PORTAL_UI_SSO_BUTTON_LABEL: "Single Sign On"
    AAD_SSO_ENABLE: "true"
  3. Copy the privacera-portal-aad-saml.xml file to the following folder:

    ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom
  4. Edit the vars.portal.yml file:

    cd ~/privacera/privacera-manager/
    vi config/custom-vars/vars.portal.yml

    Add the following properties and assign your values.

    SAML_EMAIL_ATTRIBUTE: "user.email"
    SAML_USERNAME_ATTRIBUTE: "user.login"
    SAML_LASTNAME_ATTRIBUTE: "user.lastName"
    SAML_FIRSTNAME_ATTRIBUTE: "user.firstName"
  5. Run the following to update privacera-manager:

    cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update

    You should now be able to use Single Sign-on to Privacera using PingFederate.

Trino Open Source
Privacera Plugin in Trino Open Source

Learn how you can use Privacera Manager to generate the setup script and Trino custom configuration for SSL to install Privacera Plugin in an open-source Trino environment.

Privacera Trino supports Trino Open Source with the following catalogs:

  • Hive

  • PostgreSQL DB

  • Redshift

Prerequisites
  • A working Trino environment

  • Privacera services must be up and running.

Configuration
  1. SSH to the instance as USER.

  2. Run the following commands:

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.trino.opensource.yml config/custom-vars/
    vi config/custom-vars/vars.trino.opensource.yml
  3. Edit the following properties. For property details and descriptions, see Table 4, "Trino Open Source Properties".

    TRINO_STANDALONE_ENABLE: "true"
    TRINO_USER_HOME: "<PLEASE_CHANGE>"
    TRINO_INSTALL_DIR_NAME: "<PLEASE_CHANGE>"
  4. Run the following commands:

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update

    After the update is complete, the setup script (privacera_trino_setup.sh) and Trino custom configurations (privacera_trino_plugin_conf.zip) for SSL will be generated at ~/privacera/privacera-manager/output/trino-opensource/.

  5. In your Trino environment, do the following:

    1. Copy privacera_trino_setup.sh and privacera_trino_plugin_conf.zip. Both the files should be placed under the same folder.

    2. Add permissions to execute the script.

      chmod +x privacera_trino_setup.sh
    3. Run the script to install the Privacera plugin in your Trino environment.

      ./privacera_trino_setup.sh

Note

To learn more about Trino, see Trino User Guide.

Table Properties for Trino Open Source
Table 4. Trino Open Source Properties

Property

Description

Example

TRINO_OPENSOURCE_ENABLE

Property to enable/disable Trino.

true

TRINO_USER_HOME

Property to set the path to the Trino home directory.

/home/ec2-user

TRINO_INSTALL_DIR_NAME

Property to set the path to the directory where Trino is installed.

/etc/trino

TRINO_RANGER_SERVICE_REPO

Property to indicate Trino Ranger policy.

privacera_trino

TRINO_AUDITS_URL_EXTERNAL

Solr audit URL or audit server URL.

http://10.100.10.10:8983/solr/ranger_audits

TRINO_RANGER_EXTERNAL_URL

This is a Ranger Admin URL.

http://10.100.10.10:6080

XAAUDIT.SOLR.ENABLE

Enable/Disable solr audit. Set the value to true to enable solr audit.

true

TRINO_HIVE_POLICY_AUTHZ_ENABLED

Enable/Disable Hive policy authorization for the Hive catalog. Set the value to true to use Hive policies to authorize Hive catalog queries.

true

TRINO_HIVE_POLICY_REPO_CATALOG_MAPPING

Indicates Hive policy repository and Hive catalog mapping.

Use the following format:

{hive_policy_repo-1}:{comma_separated_hive_catalogs};{hive_policy_repo-2}:{comma_separated_hive_catalogs}

privacera_hive:hive;privacera_hive:hivecatalog1,

TRINO_RANGER_AUTH_ENABLED

Set the value to true to disable authorization for show catalog query.

true
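For reference, a minimal filled-in vars.trino.opensource.yml might look like the following (illustrative only, based on the example values above):

    TRINO_STANDALONE_ENABLE: "true"
    TRINO_USER_HOME: "/home/ec2-user"
    TRINO_INSTALL_DIR_NAME: "/etc/trino"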



Migrating from PrestoSQL to Trino

To migrate your existing policies from PrestoSQL to Trino, see Migrating Steps.

Dremio
Introduction

This section covers how you can integrate Dremio with Privacera. You can use Dremio for table-level access control with the native Ranger plugin.

By integrating Dremio with Privacera, you'll be provided with comprehensive data lake security and fine-grained access control across multi-cloud environments. Dremio works directly with data lake storage. Using Dremio's query engine and ability to democratize data access, Privacera implements fine-grained access control policies, then automatically enforces and audits them at enterprise scale.

Dremio is supported with the following data sources:

  • S3

  • ADLS

  • Hive

  • Redshift

Prerequisites

Ensure the following prerequisites are met:

  • A Privacera Manager host where Privacera services are running.

  • A Dremio host where Dremio Enterprise Edition is installed. (The Community Edition is not supported.)

Configuration

To configure Dremio:

Note

There are limitations in the Dremio native Hive plugin because Dremio uses Ranger 1.1.0.

  • Audit Server basic auth needs to be disabled because it's not supported.

  • Dremio does not support Solr audits over SSL if SSL is enabled in the audit server.

  1. Run the following commands:

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.dremio.yml config/custom-vars/
    
  2. Update the following properties:

    AUDITSERVER_ENABLE: "true"
    AUDITSERVER_AUTH_TYPE: "none"
    AUDITSERVER_SSL_ENABLE: "false"
  3. Run the following commands to configure the audit server for Dremio native Hive Ranger-based authorization.

    cd ~/privacera/privacera-manager 
    cp config/sample-vars/vars.auditserver.yml config/custom-vars/ 
    vi config/custom-vars/vars.auditserver.yml

    After the update is completed, the Dremio plugin installation script privacera_dremio.sh and the custom configuration archive privacera_custom_conf.tar.gz are generated at ~/privacera/privacera-manager/output/dremio.

  4. Configure Privacera plugin depending on how you have installed Dremio in your instance.

    For a new or existing data source configured in Dremio Data Lake, ensure that the Enable external authorization plugin checkbox under Settings > Advanced Options of the data source is selected in the Dremio UI.

  5. Restart the Dremio service.

Kubernetes

Depending on your cloud provider, you can set up Dremio in a Kubernetes container. For more information, see the following links.

After setting up Dremio, perform the following steps to deploy the Privacera plugin. The steps assume that your Privacera Manager host instance is separate from your Dremio Kubernetes instance. If they are configured on a single instance, modify the steps accordingly.

  1. SSH to the instance where Dremio is installed and which contains the Dremio Kubernetes artifacts, then change to the dremio-cloud-tools/charts/dremio_v2/ directory.

  2. Copy the privacera_dremio.sh and privacera_custom_conf.tar.gz files from your Privacera Manager host instance to the dremio_v2 folder in your Dremio Kubernetes instance.

  3. Run the following commands:

    mkdir -p privacera_config 
    mv privacera_dremio.sh privacera_config/ 
    mv privacera_custom_conf.tar.gz privacera_config/
  4. Update dremio-configmap.yaml to add a new ConfigMap for the Privacera configuration.

    vi templates/dremio-configmap.yaml
  5. Add the following configuration at the start of the file:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: dremio-privacera-install
    data:
      privacera_dremio.sh: |- {{ .Files.Get "privacera_config/privacera_dremio.sh" | nindent 4 }}
    binaryData:
      privacera_custom_conf.tar.gz: {{ .Files.Get "privacera_config/privacera_custom_conf.tar.gz" | b64enc | nindent 4 }}
    ---
  6. Update dremio-env to add Privacera jars and configuration in the Dremio classpath.

    vi config/dremio-env
  7. Add the following variable, or update it if it already exists:

    DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
  8. Update values.yaml.

    vi values.yaml
            
  9. Add the following configuration for extraInitContainers inside the coordinator section:

    extraInitContainers:  |
        - name: install-privacera-dremio-plugin
        image: {{.Values.image}}:{{.Values.imageTag}}
        imagePullPolicy: IfNotPresent
        securityContext:
            runAsUser: 0
        volumeMounts:
        - name: dremio-privacera-plugin-volume
            mountPath: /opt/dremio/plugins/authorizer
        - name: dremio-ext-jars-volume
            mountPath: /opt/privacera/dremio-ext-jars
        - name: dremio-privacera-config
            mountPath: /opt/privacera/conf/
        - name: dremio-privacera-install
            mountPath: /opt/privacera/install/
        command:
            - "bash"
            - "-c"
            - "cd /opt/privacera/install/ && cp * /tmp/ && cd /tmp && ./privacera_dremio.sh"
  10. Update or uncomment the extraVolumes section inside the coordinator section and add the following configuration:

    extraVolumes:
      - name: dremio-privacera-install
        configMap:
          name: dremio-privacera-install
          defaultMode: 0777
      - name: dremio-privacera-plugin-volume
        emptyDir: {}
      - name: dremio-ext-jars-volume
        emptyDir: {}
      - name: dremio-privacera-config
        emptyDir: {}
  11. Update or uncomment the extraVolumeMounts section inside the coordinator section and add the following configuration:

    extraVolumeMounts:
      - name: dremio-ext-jars-volume
        mountPath: /opt/privacera/dremio-ext-jars
      - name: dremio-privacera-plugin-volume
        mountPath: /opt/dremio/plugins/authorizer
      - name: dremio-privacera-config
        mountPath: /opt/privacera/conf
  12. Upgrade your Helm release. Get the release name by running the helm list command; the text under the Name column is your Helm release name.

    helm upgrade -f values.yaml <release-name> .
RPM

To deploy RPM:

  1. SSH to your instance where Dremio RPM is installed.

  2. Copy the privacera_dremio.sh and privacera_custom_conf.tar.gz files from your Privacera Manager host instance to the Home folder in your Dremio instance.

  3. Run the following commands:

    mkdir -p ~/privacera/install
    mv privacera_dremio.sh ~/privacera/install
    mv privacera_custom_conf.tar.gz ~/privacera/install
  4. Launch the privacera_dremio.sh script.

    cd ~/privacera/install
    chmod +x privacera_dremio.sh
    sudo ./privacera_dremio.sh
  5. Update dremio-env to add Privacera jars and configuration in the Dremio classpath.

    vi ${DREMIO_HOME}/conf/dremio-env
  6. Add the following variable, or update it if it already exists:

    DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
  7. Restart Dremio.

    sudo service dremio restart
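Optionally, you can confirm that the classpath entry was applied by inspecting dremio-env (a quick sanity check; this assumes ${DREMIO_HOME} points to your Dremio installation):

    grep DREMIO_EXTRA_CLASSPATH ${DREMIO_HOME}/conf/dremio-env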
AWS EMR

This topic shows how to configure AWS EMR with Privacera using Privacera Manager.

Configuration

  1. SSH to the instance as USER.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.emr.yml config/custom-vars/
    vi config/custom-vars/vars.emr.yml
  3. Edit the following properties.

    Property

    Description

    Example

    EMR_ENABLE

    Enable EMR template creation.

    true

    EMR_CLUSTER_NAME

    Define a unique name for the EMR cluster.

    Privacera-EMR

    EMR_CREATE_SG

    Set this to true if you don't have existing security groups and want Privacera Manager to take care of adding security group creation steps in the EMR CF template.

    false

    EMR_MASTER_SG_ID

    If EMR_CREATE_SG is false, set this property. Security Group ID for EMR Master Node Group.

    sg-xxxxxxx

    EMR_SLAVE_SG_ID

    If EMR_CREATE_SG is false, set this property. Security Group ID for EMR Slave Node Group.

    sg-xxxxxxx

    EMR_SERVICE_ACCESS_SG_ID

    If EMR_CREATE_SG is false, set this property. Security Group ID for EMR ServiceAccessSecurity. Fill this property only if you are creating EMR in a Private Network.

    sg-xxxxxxx

    EMR_SG_VPC_ID

    If EMR_CREATE_SG is true, set this property. VPC ID in which you want to create the EMR Cluster.

    vpc-xxxxxxxxxxx

    EMR_MASTER_SG_NAME

    If EMR_CREATE_SG is true, set this property. Security Group Name for EMR Master Node Group. The security group name will be added to the emr-template.json.

    priv-master-sg

    EMR_SLAVE_SG_NAME

    If EMR_CREATE_SG is true, set this property. Security Group Name for EMR Slave Node Group. The security group name will be added to the emr-template.json.

    priv-slave-sg

    EMR_SERVICE_ACCESS_SG_NAME

    If EMR_CREATE_SG is true, set this property. Security Group Name for EMR ServiceAccessSecurity. The security group name will be added to the emr-template.json. Fill this property only if you are creating EMR in a Private Network.

    priv-private-sg

    EMR_SUBNET_ID

    Subnet ID

    EMR_KEYPAIR

    An existing EC2 key pair to SSH into the master node of the cluster.

    privacera-test-pair

    EMR_EC2_MARKET_TYPE

    Set market type as SPOT or ON_DEMAND.

    SPOT

    EMR_EC2_INSTANCE_TYPE

    Set the instance type. Instances can be of different types such as m5.xlarge, r5.xlarge and so on.

    m5.large

    EMR_MASTER_NODE_COUNT

    Node count for Master. The number of nodes can be 1, 2 and so on.

    1

    EMR_CORE_NODE_COUNT

    Node count for Core. The number of core nodes can be 1, 2, and so on.

    1

    EMR_VERSION

    Version of EMR.

    emr-x.xx.x

    EMR_EC2_DOMAIN

    Domain used by the nodes. It depends on EMR Region, for example, ".ec2.internal" is for us-east-1.

    .ec2.internal

    EMR_USE_STS_REGIONAL_ENDPOINTS

    Set the property to enable/disable regional endpoints for S3 requests.

    Default value is false.

    true

    EMR_TERMINATION_PROTECT

    Set to enable/disable termination protection.

    true

    EMR_LOGS_PATH

    S3 location for storing EMR logs.

    s3://privacera-logs-bucket/

    EMR_KERBEROS_ENABLE

    Set to true if you want to enable kerberization on EMR.

    false

    EMR_KDC_ADMIN_PASSWORD

    If EMR_KERBEROS_ENABLE is true, set this property. The password used within the cluster for the kadmin service.

    EMR_CROSS_REALM_PASSWORD

    If EMR_KERBEROS_ENABLE is true, set this property. The cross-realm trust principal password, which must be identical across realms.

    EMR_SECURITY_CONFIG

    Name of the Security Configurations created for EMR. This can be a pre-created configuration, or Privacera Manager can generate a template through which you can create this configuration.

    EMR_KERB_TICKET_LIFETIME

    Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The period for which a Kerberos ticket issued by the cluster’s KDC is valid. Cluster applications and services auto-renew tickets after they expire.

    EMR_KERB_TICKET_LIFETIME: 24

    EMR_KERB_REALM

    Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The Kerberos realm name for the other realm in the trust relationship.

    EMR_KERB_DOMAIN

    Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The domain name of the other realm in the trust relationship.

    EMR_KERB_ADMIN_SERVER

    Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The fully qualified domain name (FQDN) and an optional port for the Kerberos admin server in the other realm. If a port is not specified, 749 is used.

    EMR_KERB_KDC_SERVER

    Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The fully qualified domain name (FQDN) and an optional port for the KDC in the other realm. If a port is not specified, 88 is used.

    EMR_AWS_ACCT_ID

    AWS Account ID where EMR Cluster resides

    9999999

    EMR_DEFAULT_ROLE

    Default role attached to EMR Cluster for performing cluster-related activities. This should be a pre-created role.

    EMR_DefaultRole

    EMR_ROLE_FOR_CLUSTER_NODES

    The IAM Role that will be attached to each node in the EMR Cluster.

    This role should have only minimal permissions: enough to download privacera_cust_conf.zip and to provide basic EMR capabilities. It can be an existing role; if not, you can use the IAM role CF template to generate it after the Privacera Manager update.

    restricted_node_role

    EMR_USE_SINGLE_ROLE_FOR_APPS

    If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. Create a Single IAM Role that will be used by All EMR Applications.

    true

    EMR_ROLE_FOR_APPS

    If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by all EMR Apps

    app_data_access_role

    EMR_ROLE_FOR_SPARK

    If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property to create separate IAM roles for specific applications, and set EMR_USE_SINGLE_ROLE_FOR_APPS to false. This is the IAM Role name that will be used by the Spark application (Dataserver) for data access.

    spark_data_access_role

    EMR_ROLE_FOR_HIVE

    If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by Hive Application for data access.

    hive_data_access_role

    EMR_ROLE_FOR_PRESTO

    If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by Presto Application for data access.

    presto_data_access_role

    EMR_HIVE_METASTORE

    Metastore type. e.g. "glue", "hive" (For external hive-metastore)

    glue

    EMR_HIVE_METASTORE_PATH

    S3 location for hive metastore

    s3://hive-warehouse

    EMR_HIVE_METASTORE_CONNECTION_URL

    If EMR_HIVE_METASTORE is hive, set this property. JDBC Connection URL for connecting to hive.

    jdbc:mysql://<jdbc-host>:3306/<hive-db-name>?createDatabaseIfNotExist=true

    EMR_HIVE_METASTORE_CONNECTION_DRIVER

    If EMR_HIVE_METASTORE is hive, set this property. JDBC Driver Name

    org.mariadb.jdbc.Driver

    EMR_HIVE_METASTORE_CONNECTION_USERNAME

    If EMR_HIVE_METASTORE is hive, set this property. JDBC UserName

    hive

    EMR_HIVE_METASTORE_CONNECTION_PASSWORD

    If EMR_HIVE_METASTORE is hive, set this property. JDBC Password

    StRong@PassW0rd

    EMR_HIVE_SERVICE_NAME

    Custom hive service name for hive application in EMR

    teamA_policy

    EMR_TRINO_HIVE_SERVICE_NAME

    Custom hive service name for trino application in EMR

    teamB_policy

    EMR_SPARK_HIVE_SERVICE_NAME

    Custom hive access service name for spark applications in EMR

    teamC_policy

    EMR_APP_SPARK_OLAC_ENABLE

    To install the Spark application with the Privacera plugin, set the property to true. OLAC stands for Object-Level Access Control.

    Note:

    • Recommended when complete access control on the objects in AWS S3 is required.

    • When the property is set to true, s3 and s3n protocols will not be supported on EMR clusters while running Spark queries.

    true

    EMR_APP_SPARK_FGAC_ENABLE

    To install the Spark application with the Privacera plugin, set the property to true. FGAC stands for Fine-Grained Access Control for tables and columns.

    Note: Recommended for compliance purposes, since the whole cluster will still have direct access to AWS S3 data.

    false

    EMR_APP_PRESTO_DB_ENABLE

    To install PrestoDB application with Privacera plugin, set the property to true.

    PrestoDB and Trino are mutually exclusive. Only one should be enabled at a time.

    false

    EMR_APP_PRESTO_SQL_ENABLE

    To install Trino application with Privacera plugin, set the property to true.

    PrestoDB and Trino are mutually exclusive. Only one should be enabled at a time.

    Note: Trino is supported for EMR versions 6.1.0 and higher.

    Note: If the EMR version is 6.4.0, setting this flag installs the Trino plugin.

    false

    EMR_APP_HIVE_ENABLE

    To install Hive application with Privacera plugin, set the property to true.

    true

    EMR_APP_ZEPPELIN_ENABLE

    To install Zeppelin application, set the property to true.

    true

    EMR_APP_LIVY_ENABLE

    To install Livy application, set the property to true.

    true

    EMR_CUST_CONF_ZIP_PATH

    Path where the privacera_cust_conf.zip file will be placed. Privacera Manager generates privacera_cust_conf.zip under the ~/privacera/privacera-manager/output/emr folder; it needs to be placed at an S3 or HTTPS location from which the EMR cluster can download it.

    s3://privacera-artifacts/

    EMR_SPARK_ENABLE_VIEW_LEVEL_ACCESS_CONTROL

    Set the property to true to enable view-level column masking and row filter for SparkSQL. The property can be used only when you set EMR_APP_SPARK_FGAC_ENABLE to true.

    To learn how to use view-level access control in Spark, click here.

    false

    EMR_RANGER_IS_FALLBACK_SUPPORTED

    Use this property to enable/disable fallback to the privacera_files and privacera_hive services; it determines whether the user should be allowed or denied access to the resource files.

    To enable the fallback, set to true; to disable, set to false.

    true

    EMR_SPARK_DELTA_LAKE_ENABLE

    Set this property to true to enable Delta Lake on EMR Spark.

    true

    EMR_SPARK_DELTA_LAKE_CORE_JAR_DOWNLOAD_URL

    Download URL of Delta Lake core JAR. The Delta Lake core JAR has dependency with Spark version.

    You have to find the appropriate version for your EMR. See Delta Lake compatibility with Apache Spark.

    Get the appropriate Delta Lake core JAR download link and update the property. See Delta Core.

    For example, for Spark version 3.1.x, the download URL is https://repo1.maven.org/maven2/io/delta/delta-core_2.12/1.0.1/delta-core_2.12-1.0.1.jar.

    https://repo1.maven.org/maven2/io/delta/delta-core_2.12/1.0.1/delta-core_2.12-1.0.1.jar

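    For reference, a minimal vars.emr.yml might look like the following. The values are illustrative (taken from the examples above); adjust them to your environment and leave the remaining properties at the defaults in the sample file.

    EMR_ENABLE: "true"
    EMR_CLUSTER_NAME: "Privacera-EMR"
    EMR_CREATE_SG: "false"
    EMR_MASTER_SG_ID: "sg-xxxxxxx"
    EMR_SLAVE_SG_ID: "sg-xxxxxxx"
    EMR_SUBNET_ID: "subnet-xxxxxxx"
    EMR_KEYPAIR: "privacera-test-pair"
    EMR_EC2_MARKET_TYPE: "ON_DEMAND"
    EMR_EC2_INSTANCE_TYPE: "m5.xlarge"
    EMR_MASTER_NODE_COUNT: "1"
    EMR_CORE_NODE_COUNT: "1"
    EMR_VERSION: "emr-x.xx.x"
    EMR_LOGS_PATH: "s3://privacera-logs-bucket/"
    EMR_HIVE_METASTORE: "glue"
    EMR_APP_SPARK_OLAC_ENABLE: "true"
    EMR_APP_HIVE_ENABLE: "true"
    EMR_CUST_CONF_ZIP_PATH: "s3://privacera-artifacts/"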
  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update

    After the update is finished, all the cloud-formation JSON template files and privacera_cust_conf.zip will be available at the path, ~/privacera/privacera-manager/output/emr.

  5. Configure and run the following in AWS instance where Privacera is installed.

    1. (Optional) Create IAM roles using the emr-roles-creation-template.json template. Run the following command.

      aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-role-creation --template-body file://emr-roles-creation-template.json --capabilities CAPABILITY_NAMED_IAM

      Note

      This will create IAM roles with minimal permissions. You can add bucket permissions into respective IAM roles as per your requirements.

    2. (Optional) Create Security Configurations using the emr-security-config-template.json template. Run the following command.

      aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-security-config-creation  --template-body file://emr-security-config-template.json
    3. Confirm the privacera_cust_conf.zip file has been copied to the location specified in EMR_CUST_CONF_ZIP_PATH.

    4. Create EMR using the emr-template.json template. Run the following command.

      aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-creation  --template-body file://emr-template.json

      Note

      If you are upgrading EMR from version 6.3 or lower to version 6.4 or higher in order to use the Trino plug-in, you must re-create the EMR security configuration based on the new template generated via Privacera Manager, because the security configuration now includes the newly added trino user.
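      After launching a stack, you can optionally check its status until it reaches CREATE_COMPLETE; for example, for the EMR stack created above:

      aws --region <AWS-REGION> cloudformation describe-stacks --stack-name privacera-emr-creation --query "Stacks[0].StackStatus"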

Note

  • For PrestoDB, secrets encryption of the Solr authentication password is not supported. However, the properties file where the password resides is accessible only to the presto service user, so the password is not exposed to other users.

  • If your cluster was running while External Hive Metastore was down, and you are unable to connect to it, restart the following three servers:

    sudo systemctl restart hive-hcatalog-server
    sudo systemctl restart hive-server2
    sudo systemctl restart presto-server
    
AWS EMR with Native Apache Ranger

AWS EMR provides native Apache Ranger integration with the open source Apache Ranger plugins for Apache Spark and Hive. Connecting EMR’s native Ranger integration with Privacera’s Ranger-based data access governance provides the following key advantages:

  • Companies will have the ability to sync their existing policies with their EMR solution.

  • Extend Apache Ranger’s open source capabilities to take advantage of Privacera’s centralized enterprise-ready solution.

Note

Supported EMR version: 5.32 and above in EMR 5.x series.

Prerequisites

The following AWS secrets are required to store the Ranger Admin and Ranger plugin certificates:

  • ranger-admin-pub-cert

  • ranger-plugin-private-keypair

To create the two secrets in AWS Secret Manager, do the following:

  1. Login to AWS console and navigate to Secrets Manager and then click Store a new secret option.

  2. Select secret type as Other type of secrets and then go to the Plaintext tab. Keep the Default value unchanged. The actual value for this secret will be obtained after the installation is done.

  3. Select the encryption key as per your requirement.

  4. Click Next.

  5. Under Secret name, type a name for the secret in the text field. For example: ranger-admin-pub-cert, ranger-plugin-private-keypair.

  6. Click Next. The Configure automatic rotation page is displayed.

  7. Click Next.

  8. On the Review page, you can check your secret settings and then click Store to save your changes.

    The Secret is stored successfully.
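If you prefer the AWS CLI over the console, the two secrets can also be created with a placeholder value (to be replaced after installation, as described above); the following is an optional alternative sketch:

  aws secretsmanager create-secret --name ranger-admin-pub-cert --secret-string "placeholder"
  aws secretsmanager create-secret --name ranger-plugin-private-keypair --secret-string "placeholder"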

Configuration

  1. SSH to the instance as USER.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.emr.native.ranger.yml config/custom-vars/
    vi config/custom-vars/vars.emr.native.ranger.yml
    
  3. Edit the following properties.

    Property

    Description

    Example

    EMR_NATIVE_ENABLE

    Property to enable EMR native Ranger integration.

    EMR_NATIVE_ENABLE: "true"

    Properties for EMR Specifications

    EMR_NATIVE_CLUSTER_NAME

    Name of the EMR Cluster.

    EMR_NATIVE_CLUSTER_NAME: "Privacera-EMR-Native-Ranger"

    EMR_NATIVE_AWS_REGION

    AWS Region where the cluster will reside.

    EMR_NATIVE_AWS_REGION: "{{AWS_REGION}}"

    EMR_NATIVE_AWS_ACCT_ID

    AWS Account ID where the EMR Cluster and its resources will reside.

    EMR_NATIVE_AWS_ACCT_ID: "587946681758"

    EMR_NATIVE_SUBNET_ID

    Subnet ID where the EMR Cluster nodes will reside.

    EMR_NATIVE_SUBNET_ID: ""

    EMR_NATIVE_KEYPAIR

    An existing EC2 key pair to SSH into the nodes of the cluster.

    EMR_NATIVE_KEYPAIR: "privacera-test-pair"

    EMR_NATIVE_EC2_MARKET_TYPE

    Market Type for the EMR Cluster nodes. For example, SPOT or ON_DEMAND.

    EMR_NATIVE_EC2_MARKET_TYPE: "SPOT"

    EMR_NATIVE_EC2_INSTANCE_TYPE

    Instance Type for the EMR Cluster nodes.

    EMR_NATIVE_EC2_INSTANCE_TYPE: "m5.2xlarge"

    EMR_NATIVE_MASTER_NODE_COUNT

    Node count for Master.

    EMR_NATIVE_MASTER_NODE_COUNT: "1"

    EMR_NATIVE_CORE_NODE_COUNT

    Node count for Core.

    EMR_NATIVE_CORE_NODE_COUNT: "1"

    EMR_NATIVE_VERSION

    EMR native Ranger integration is supported from EMR 5.32 and above.

    EMR_NATIVE_VERSION: "emr-5.32.0"

    EMR_NATIVE_TERMINATION_PROTECT

    To enable termination protection.

    EMR_NATIVE_TERMINATION_PROTECT: "true"

    EMR_NATIVE_LOGS_PATH

    S3 location for EMR logs storage.

    EMR_NATIVE_LOGS_PATH: "s3://privacera-emr/logs"

    Properties to configure EMR Security Group

    EMR_NATIVE_CREATE_SG

    Set this to true, if you don't have existing security groups and want Privacera Manager to take care of adding security groups creation steps in EMR CloudFormation Template.

    EMR_NATIVE_CREATE_SG: "false"

    If EMR_NATIVE_CREATE_SG is false, fill the following properties with existing security group ids:

    EMR_NATIVE_MASTER_SG_ID

    Security Group ID for EMR Master Node Group.

    EMR_NATIVE_MASTER_SG_ID: "sg-xxxxxxx"

    EMR_NATIVE_SLAVE_SG_ID

    Security Group ID for EMR Slave Node Group.

    EMR_NATIVE_SLAVE_SG_ID: "sg-xxxxxxx"

    EMR_NATIVE_SERVICE_ACCESS_SG_ID

    Security Group ID for EMR ServiceAccessSecurity. Fill this property only if you are creating EMR in a private network.

    EMR_NATIVE_SERVICE_ACCESS_SG_ID: "sg-xxxxxxx"

    If EMR_NATIVE_CREATE_SG is true, fill the following properties to give security group names for new groups which will be added in emr-template.json :

    EMR_NATIVE_SG_VPC_ID

    VPC ID in which you want to create the EMR Cluster.

    EMR_NATIVE_SG_VPC_ID: "vpc-xxxxxxxxxxx"

    EMR_NATIVE_MASTER_SG_NAME

    Security Group Name for EMR Master Node Group.

    EMR_NATIVE_MASTER_SG_NAME: "priv-master-sg"

    EMR_NATIVE_SLAVE_SG_NAME

    Security Group Name for EMR Slave Node Group.

    EMR_NATIVE_SLAVE_SG_NAME: "priv-slave-sg"

    EMR_NATIVE_SERVICE_ACCESS_SG_NAME

    Security Group Name for EMR ServiceAccessSecurity. Fill this property only if you are creating EMR in a private network.

    EMR_NATIVE_SERVICE_ACCESS_SG_NAME: "priv-private-sg"

    EMR_NATIVE_SECURITY_CONFIG

    Name of the security configurations created for EMR. This can be an existing configuration or Privacera Manager can generate a template through which new configurations can be created. The new template will be available at ~/privacera/privacera-manager/output/emr/emr-native-sec-config-template.json after you run the Privacera Manager update command.

    EMR_NATIVE_SECURITY_CONFIG: ""

    Properties for EMR Hive Metastore

    EMR_NATIVE_HIVE_METASTORE

    Metastore type. For example, internal, hive (For external hive-metastore)

    EMR_NATIVE_HIVE_METASTORE: "hive"

    EMR_NATIVE_HIVE_METASTORE_WAREHOUSE_PATH

    S3 location for Hive metastore warehouse

    EMR_NATIVE_HIVE_METASTORE_WAREHOUSE_PATH: "s3://hive-warehouse"

    Fill the following properties, if EMR_NATIVE_HIVE_METASTORE is hive:

    EMR_NATIVE_METASTORE_CONNECTION_URL

    JDBC Connection URL for connecting to Hive Metastore.

    EMR_NATIVE_METASTORE_CONNECTION_URL: jdbc:mysql://<jdbc-host>:3306/<hive-db-name>?createDatabaseIfNotExist=true

    EMR_NATIVE_METASTORE_CONNECTION_DRIVER

    JDBC Driver Name

    EMR_NATIVE_METASTORE_CONNECTION_DRIVER: "org.mariadb.jdbc.Driver"

    EMR_NATIVE_METASTORE_CONNECTION_USERNAME

    JDBC UserName

    EMR_NATIVE_METASTORE_CONNECTION_USERNAME: "hive"

    EMR_NATIVE_METASTORE_CONNECTION_PASSWORD

    JDBC Password

    EMR_NATIVE_METASTORE_CONNECTION_PASSWORD: "StRong@PassWord"

    Properties of Kerberos Server

    EMR_NATIVE_KDC_ADMIN_PASSWORD

    The password used within the cluster for the kadmin service.

    EMR_NATIVE_KDC_ADMIN_PASSWORD: ""

    EMR_NATIVE_CROSS_REALM_PASSWORD

    The cross-realm trust principal password, which must be identical across realms.

    EMR_NATIVE_CROSS_REALM_PASSWORD: ""

    EMR_NATIVE_KERB_TICKET_LIFETIME

    The period for which a Kerberos ticket issued by the cluster’s KDC is valid. Cluster applications and services auto-renew tickets after they expire.

    EMR_NATIVE_KERB_TICKET_LIFETIME: 24

    EMR_NATIVE_KERB_REALM

    The Kerberos realm name for the other realm in the trust relationship.

    EMR_NATIVE_KERB_REALM: ""

    EMR_NATIVE_KERB_DOMAIN

    The domain name of the other realm in the trust relationship.

    EMR_NATIVE_KERB_DOMAIN: ""

    EMR_NATIVE_KERB_ADMIN_SERVER

    The fully qualified domain name (FQDN) and optional port for the Kerberos admin server in the other realm. If a port is not specified, 749 is used.

    EMR_NATIVE_KERB_ADMIN_SERVER: ""

    EMR_NATIVE_KERB_KDC_SERVER

    The fully qualified domain name (FQDN) and optional port for the KDC in the other realm. If a port is not specified, 88 is used.

    EMR_NATIVE_KERB_KDC_SERVER: ""

    Properties of Certificates Secrets

    EMR_NATIVE_RANGER_PLUGIN_SECRET_ARN

    Full ARN of AWS secret [stored in AWS Secrets Manager] for Ranger plugin key-pair. This is the secret created in the Prerequisites step above.

    EMR_NATIVE_RANGER_PLUGIN_SECRET_ARN: "arn:aws:secretsmanager:us-east-1:99999999999:secret:ranger-plugin-key-pair-ixZbO2"

    EMR_NATIVE_RANGER_ADMIN_SECRET_ARN

    Full ARN of AWS secret [stored in AWS Secrets Manager] for Ranger admin public certificate. This is the secret created in the Prerequisites step above.

    EMR_NATIVE_RANGER_ADMIN_SECRET_ARN: "arn:aws:secretsmanager:us-east-1:99999999999:secret:ranger-admin-public-cert-ixfCO5"

    Properties of EMR application

    EMR_NATIVE_APP_SPARK_ENABLE

    Installs Spark application with EMR native Ranger plugin, if set to true.

    EMR_NATIVE_APP_SPARK_ENABLE: "true"

    EMR_NATIVE_APP_HIVE_ENABLE

    Installs Hive application with EMR native Ranger plugin, if set to true.

    EMR_NATIVE_APP_HIVE_ENABLE: "true"

    EMR_NATIVE_APP_ZEPPELIN_ENABLE

    Installs Zeppelin application, if set to true.

    EMR_NATIVE_APP_ZEPPELIN_ENABLE: "true"

    EMR_NATIVE_APP_LIVY_ENABLE

    Installs Livy application, if set to true.

    EMR_NATIVE_APP_LIVY_ENABLE: "true"

    Properties of IAM Role Configuration

    EMR_NATIVE_DEFAULT_ROLE

    Default role attached to EMR cluster for performing cluster related activities. This should be an existing role.

    EMR_NATIVE_DEFAULT_ROLE: "EMR_DefaultRole"

    EMR_NATIVE_INSTANCE_ROLE

    The IAM Role which will be attached to each node in the EMR Cluster. This should have only minimal permissions for basic EMR functionalities.

    EMR_NATIVE_INSTANCE_ROLE: "restricted_instance_role"

    EMR_NATIVE_DATA_ACCESS_ROLE

    This role provides credentials for trusted execution engines, such as Apache Hive and the AWS EMR Record Server (AWS EMR components), to access AWS S3 data. Use this role only to access AWS S3 data, including any KMS keys if you are using S3 SSE-KMS.

    EMR_NATIVE_DATA_ACCESS_ROLE: "emr_native_data_access_role"

    EMR_NATIVE_USER_ACCESS_ROLE

    This role provides users who are not trusted execution engines with credentials to interact with AWS services, if needed. Do not use this IAM role to allow access to AWS S3 data, unless it is data that should be accessible by all users.

    EMR_NATIVE_USER_ACCESS_ROLE: "emr_native_user_access_role"

    Properties to send EMR Ranger Engines Audits to Solr

    EMR_NATIVE_ENABLE_SOLR_AUDITS

    Enable audits to Solr.

    EMR_NATIVE_ENABLE_SOLR_AUDITS: "true"

    AUDITSERVER_AUTH_TYPE

    The EMR native Ranger audit framework does not support basic authentication, hence it needs to be disabled. If this property already exists in vars.auditserver.yml, change it there.

    AUDITSERVER_AUTH_TYPE: "none"

    AUDITSERVER_SSL_ENABLE

    In case of self-signed SSL, EMR native Ranger does not support SSL for Solr audits. Hence, AuditServer SSL should be disabled.

    AUDITSERVER_SSL_ENABLE: "false"

    EMR_NATIVE_CLOUDWATCH_GROUPNAME

    Add a CloudWatch LogGroup to push Ranger Audits. This should be an existing Group.

    EMR_NATIVE_CLOUDWATCH_GROUPNAME: "emr_privacera_native_logs"

    Note

    You can also add custom properties that are not included by default. See EMR.

  4. Run the following commands.

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
    
  5. Once update is done, all the CloudFormation JSON template files will be available at ~/privacera/privacera-manager/output/emr-native-ranger path.

  6. Run the following command in the AWS instance where Privacera is installed.

    cd ~/privacera/privacera-manager/output/emr-native-ranger
    
  7. Create the certificates that need to be added to AWS Secrets Manager.

    You will get multiple prompts to enter the keystore password. Use the property value of RANGER_PLUGIN_SSL_KEYSTORE_PASSWORD set in ~/privacera/privacera-manager/config/custom-vars/vars.ssl.yml for each prompt.

    1. Run the following command.

      ./emr-native-create-certs.sh
      

      This will create the following two files. You need to update both secrets created in the Prerequisites section above with the contents of these files:

      • ranger-admin-pub-cert.pem

      • ranger-plugin-keypair.pem

    2. Display the contents of the ranger-admin-pub-cert.pem file.

      cat ranger-admin-pub-cert.pem
      

      Select the file contents and then right-click in the terminal to copy the contents.

    3. Login to AWS console and navigate to Secrets Manager and then click ranger-admin-pub-cert.

    4. Navigate to Secret value section and then go to Retrieve Secret Value > Edit > Plaintext.

    5. Replace the secrets with the new value, which you copied in step 2.

    6. Similarly, follow steps 2-5 above to display the contents of ranger-plugin-keypair.pem and use them to replace the value of the ranger-plugin-private-keypair secret in AWS Secrets Manager.
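    Alternatively, you can update both secret values from the CLI using the AWS Secrets Manager put-secret-value command; the secret names are the ones created in the Prerequisites section, and the region is a placeholder.

      aws --region <AWS_REGION> secretsmanager put-secret-value --secret-id ranger-admin-pub-cert --secret-string file://ranger-admin-pub-cert.pem
      aws --region <AWS_REGION> secretsmanager put-secret-value --secret-id ranger-plugin-private-keypair --secret-string file://ranger-plugin-keypair.pem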

  8. (Optional) Create IAM roles using the emr-native-role-creation-template.json template.

    aws --region <AWS_REGION> cloudformation create-stack --stack-name privacera-emr-native-role-creation --template-body file://emr-native-role-creation-template.json --capabilities CAPABILITY_NAMED_IAM
    

    Note

    For giving access to data for Apache Hive and Apache Spark services, navigate to IAM Management in your AWS Console and add required S3 policies in the EMR_NATIVE_DATA_ACCESS_ROLE.

  9. (Optional) Create Security Configurations using the emr-native-sec-config-template.json template.

    aws --region <AWS_REGION> cloudformation create-stack --stack-name privacera-emr-native-security-config-creation  --template-body file://emr-native-sec-config-template.json
    
  10. Create EMR using the emr-native-template.json template.

    aws --region <AWS_REGION> cloudformation create-stack --stack-name privacera-emr-native-creation  --template-body file://emr-native-template.json
    
GCP Dataproc
Privacera plugin in Dataproc

This section covers how you can use Privacera Manager to generate the setup script and Dataproc custom configuration to install Privacera Plugin in the GCP Dataproc environment.

Prerequisites

Ensure the following prerequisites are met:

  • A working Dataproc environment.

  • Privacera services must be up and running.

Configuration

  1. SSH to the instance where Privacera is installed.

  2. Run the following command:

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.dataproc.yml config/custom-vars/
    vi config/custom-vars/vars.dataproc.yml                          
  3. Edit the following properties:

    Property

    Description

    Example

    DATAPROC_ENABLE

    Enable Dataproc template creation.

    true

    DATAPROC_MANAGE_INIT_SCRIPT

    Set this property to upload the init script to GCP Cloud Storage.

    If the value is set to true, then Privacera will upload the init script to the GCP bucket.

    If the value is set to false, then you must manually upload the init script to a GCP bucket (see step 5 below).

    false

    DATAPROC_PRIVACERA_GS_BUCKET

    Enter the GCP bucket name where the init script will be uploaded.

    gs://privacera-bucket

    DATAPROC_RANGER_IS_FALLBACK_SUPPORTED

    Use this property to enable/disable fallback to the privacera_files and privacera_hive services; it determines whether the user should be allowed or denied access to the resource files.

    To enable the fallback, set to true; to disable, set to false.

    true
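    A minimal vars.dataproc.yml based on the examples above might look like this (illustrative values):

    DATAPROC_ENABLE: "true"
    DATAPROC_MANAGE_INIT_SCRIPT: "false"
    DATAPROC_PRIVACERA_GS_BUCKET: "gs://privacera-bucket"
    DATAPROC_RANGER_IS_FALLBACK_SUPPORTED: "true"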

  4. Run the update.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update                           

    After the update is complete, the setup script setup_dataproc.sh and Dataproc custom configurations privacera_cust_conf.zip will be generated at the path, ~/privacera/privacera-manager/output/dataproc.

  5. If DATAPROC_MANAGE_INIT_SCRIPT is set to false, then copy setup_dataproc.sh and privacera_cust_conf.zip to your GCP bucket. Both files should be placed under the same folder.

    cd ~/privacera/privacera-manager/output/dataproc
    GS_BUCKET=<PLEASE_CHANGE>
    gsutil cp setup_dataproc.sh gs://${GS_BUCKET}/privacera/dataproc/init/
    gsutil cp privacera_cust_conf.zip gs://${GS_BUCKET}/privacera/dataproc/init/                          
  6. SSH to the master node of your Dataproc cluster. Then, set the GCP bucket name and run the setup script.

    sudo su - 
    mkdir -p /opt/privacera/downloads
    cd /opt/privacera/downloads
    GS_BUCKET=privacera-dev
    gsutil cp gs://${GS_BUCKET}/privacera/dataproc/init/setup_dataproc.sh .
    chmod +x setup_dataproc.sh
    ./setup_dataproc.sh                           
Starburst Enterprise
Starburst Enterprise with Privacera

Using Privacera in Starburst Enterprise LTS, you can enforce system-wide access control. The following information provides an expedient way of configuring Starburst Enterprise on port 8443 for TLS/HTTPS so that username/password authentication is possible. Self-signed certificates work well for testing purposes, but should not be used for production deployments.

Prerequisites

The following items need to be enabled/shared prior to deploying a Starburst Docker image:

  • A licensed version of Starburst

  • Docker-ce 18+ must be installed

  • JDK 11 (to generate the Java keystore)

  • Privacera Manager version 4.7 or higher

  • JDBC URL to connect to the Starburst Enterprise instance to access the catalogs and schemas

  • CA-signed SSL certificate for production deployment.

Configuring Privacera Plugin with Starburst Enterprise

Summary of steps:

  1. Generate an access-control file for Starburst.

  2. Generate an access-control file for Hive catalogs [optional].

  3. Generate a Ranger Audit XML file.

  4. Generate a Ranger SSL XML file required for TLS secure Privacera installations.

To configure Privacera plugin:

  1. To enable Privacera for authorization, you need to update the etc/config.properties with one of the following entries:

    # privacera auth for hive and system access control
    access-control.config-files=/etc/starburst/access-control-privacera.properties,/etc/starburst/access-control-priv-hive.properties
    

    Or

    # privacera auth for only system access control
    access-control.config-files=/etc/starburst/access-control-privacera.properties
    
  2. Edit etc/access-control-privacera.properties. The following is an example of the properties. Configure the properties in the file so that they point to the instance where Privacera is installed. Replace <PRIVACERA_HOST_INSTANCE_IP> with the IP address of the Privacera host.

    access-control.name=privacera-starburst
    ranger.policy-rest-url=http://<PRIVACERA_HOST_INSTANCE_IP>:6080
    ranger.service-name=privacera_starburstenterprise
    ranger.username=admin
    ranger.password=welcome1
    ranger.policy-refresh-interval=3s
    ranger.config-resources=/etc/starburst/ranger-hive-audit.xml
    ranger.policy-cache-dir=/etc/starburst/tmp/ranger
    

    To install this file into the Docker container, you can add the following option to your container creation script:

    -v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-privacera.properties:$STARBURST_TGT/access-control-privacera.properties \
  3. Edit etc/access-control-priv-hive.properties. The following is an example of the properties. Configure the properties in the file so that they point to the instance where Privacera is installed, and replace <PRIVACERA_HOST_INSTANCE_IP> with the IP address of the Privacera host. Similarly, configure the properties of the comma-separated files for Hive, Glue, Delta, and so on.

    This file is optional if you are not configuring Hive catalogs with privacera_hive policies.

    access-control.name=privacera
    ranger.policy-rest-url=http://<PRIVACERA_HOST_INSTANCE_IP>:6080
    ranger.service-name=privacera_hive
    privacera.catalogs=hive,glue
    ranger.username=admin
    ranger.password=welcome1
    ranger.policy-refresh-interval=3s
    ranger.config-resources=/etc/starburst/ranger-hive-audit.xml
    ranger.policy-cache-dir=/etc/starburst/tmp/ranger
    privacera.fallback-access-control=allow-all
    
  4. To install this file into the Docker container, you can add the following option to your container creation script:

    -v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-priv-hive.properties:$STARBURST_TGT/access-control-priv-hive.properties \
  5. Edit etc/ranger-hive-audit.xml. This file configures how access from Starburst is audited to Privacera Ranger and Solr. The example below is for unsecured Privacera Ranger deployments only. Replace <PRIVACERA_HOST_INSTANCE_IP> with the IP address of the Privacera host.

        <?xml version="1.0" encoding="UTF-8"?>
        <configuration>
          <property>
            <name>ranger.plugin.hive.service.name</name>
            <value>privacera_hive</value>
          </property>
          <property>
            <name>ranger.plugin.hive.policy.pollIntervalMs</name>
            <value>5000</value>
          </property>
          <property>
            <name>ranger.service.store.rest.url</name>
            <value>http://<PRIVACERA_HOST_INSTANCE_IP>:6080</value>
          </property>
          <property>
            <name>ranger.plugin.hive.policy.rest.url</name>
            <value>http://<PRIVACERA_HOST_INSTANCE_IP>:6080</value>
          </property>
          <property>
            <name>xasecure.audit.destination.solr</name>
            <value>true</value>
          </property>
          <property>
            <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
            <value>/opt/presto/logs/audits/solr/</value>
          </property>
          <property>
            <name>xasecure.audit.destination.solr.urls</name>
            <value>http://<PRIVACERA_HOST_INSTANCE_IP>:8983/solr/ranger_audits</value>
          </property>
          <property>
            <name>xasecure.audit.is.enabled</name>
            <value>true</value>
          </property>
        </configuration>
    
  6. To install this file into the Docker container, you can add the following option to your container creation script:

    -v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-hive-audit.xml:$STARBURST_TGT/ranger-hive-audit.xml \
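    Putting the three mounts together, a container creation script might look like the following sketch. The image name and any additional options (ports, license mounts, catalogs) are placeholders that depend on your Starburst deployment.

    docker run -d --name starburst-enterprise \
      -p 8443:8443 \
      -v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-privacera.properties:$STARBURST_TGT/access-control-privacera.properties \
      -v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-priv-hive.properties:$STARBURST_TGT/access-control-priv-hive.properties \
      -v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-hive-audit.xml:$STARBURST_TGT/ranger-hive-audit.xml \
      <starburst-enterprise-image>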
Privacera services (Data Assets)
Privacera services

This topic covers how you can enable/disable Data Sets menu on Privacera Portal.

Data Sets allows you to create logical data assets from various data sources such as Snowflake, PostgreSQL, and so on, and share the data assets with users, groups, or roles. You can assign an owner to a data asset who has the privileges to control access to the data within the data asset.

CLI configuration
  1. Run the following command.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.privacera-services.yml  config/custom-vars/
    vi config/custom-vars/vars.privacera-services.yml
  2. Enable/Disable the property.

    PRIVACERA_SERVICES_ENABLE: "true"
  3. Run the following command.

    cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update
Audit Fluentd

Prerequisites

Ensure the following prerequisites are met:

  • AuditServer must be up and running. For more information, refer to AuditServer.

  • If you're configuring Fluentd for an Azure environment and want to use a User Managed Service Identity (MSI), assign the following two IAM roles to the User Managed Service Identity on the Azure Storage account where the audits will be stored.

    • Owner or Contributor

    • Storage Blob Data Owner or Storage Blob Data Contributor

    Note

    If your Azure environment is Docker-based, then configure MSI on a virtual machine, whereas for a Kubernetes-based environment, configure MSI on a virtual machine scale set (VMSS).

This topic covers how you can store the audits from AuditServer locally, or on a cloud, for example, AWS S3, Azure blob, and Azure ADLS Gen 2. You can also send application logs to the same location as the audit logs.

Procedure

  1. SSH to the instance where Privacera is installed.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.audit-fluentd.yml config/custom-vars/
    vi config/custom-vars/vars.audit-fluentd.yml
  3. Modify the properties below. For property details and description, refer to the Configuration Properties below.

    You can also add custom properties that are not included by default. See Audit Fluentd.

  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
Configuration properties

Property

Description

Example

AUDIT_FLUENTD_AUDIT_DESTINATION

Set the audit destination where the audits will be saved. If the value is set to s3, the audits are stored in AWS S3. For S3, the default time interval to publish the audits is 3600s (1 hr).

Local storage should be used only for development and testing purposes. All the audits received are stored in the same container/pod.

Value: local, s3, azure-blob, azure-adls

s3

AUDIT_FLUENTD_EXPORT_APP_LOGS_ENABLE

Specifies whether application logs and PolicySync logs are sent to Fluentd. The default value is false.

true

When the destination is local, edit the following property:

AUDIT_FLUENTD_LOCAL_FILE_TIME_INTERVAL

This is the time interval after which the audits will be pushed to the local destination.

3600s

When the destination is s3, edit the following properties:

AUDIT_FLUENTD_S3_BUCKET

Set the bucket name, if you set the audit destination above to S3.

Leave unchanged, if you set the audit destination to local.

bucket_1

AUDIT_FLUENTD_S3_REGION

Set the bucket region, if you set the audit destination above to S3.

Leave unchanged, if you set the audit destination to local.

us-east-1

AUDIT_FLUENTD_S3_FILE_TIME_INTERVAL

This is the time interval after which the audits will be pushed to the S3 destination.

3600s

AUDIT_FLUENTD_S3_ACCESS_KEY

AUDIT_FLUENTD_S3_SECRET_KEY

Set the access and secret key, if you set the audit destination above to S3.

Leave unchanged, if you set the audit destination to local and are using AWS IAM Instance Role.

AUDIT_FLUENTD_S3_ACCESS_KEY: "AKIAIOSFODNN7EXAMPLE"

AUDIT_FLUENTD_S3_SECRET_KEY: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

AUDIT_FLUENTD_S3_BUCKET_ENCRYPTION_TYPE

Property to encrypt an S3 bucket. You can use the property, if you have set S3 as the audit destination in the property, AUDIT_FLUENTD_AUDIT_DESTINATION.

You can assign one of the following values as the encryption types:

  • SSE-S3

  • SSE-KMS

  • SSE-C

  • NONE

SSE-S3 and SSE-KMS are encryptions managed by AWS. You need to enable the server-side encryption for the S3 bucket. For more information on how to enable SSE-S3 or SSE-KMS encryption types, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/default-bucket-encryption.html

SSE-C is the custom encryption type, where the encryption key and MD5 have to be generated separately.

NONE

AUDIT_FLUENTD_S3_BUCKET_ENCRYPTION_KEY

If you have set SSE-C encryption type in the AUDIT_FLUENTD_S3_BUCKET_ENCRYPTION_TYPE property, then the encryption key is mandatory. It is optional for SSE-KMS encryption type.

AUDIT_FLUENTD_S3_BUCKET_ENCRYPTION_KEY_MD5

If you have set SSE-C encryption type in the AUDIT_FLUENTD_S3_BUCKET_ENCRYPTION_TYPE property, then the MD5 encryption key is mandatory.

To get the MD5 hash for the encryption key, run the following command:

echo -n "<generated-key>" | openssl dgst -md5 -binary | openssl enc -base64

When the destination is azure-blob or azure-adls, edit the following properties:

AUDIT_FLUENTD_AZURE_STORAGE_ACCOUNT

AUDIT_FLUENTD_AZURE_CONTAINER

Set the storage account and the container, if you set the audit destination above to Azure Blob or Azure ADLS.

To know how to get the ADLS properties, see Get ADLS properties.

Leave unchanged, if you set the audit destination to local.

Note

Currently, it supports Azure blob storage only.

AUDIT_FLUENTD_AZURE_STORAGE_ACCOUNT: "storage_account_1"

AUDIT_FLUENTD_AZURE_CONTAINER: "container_1"

AUDIT_FLUENTD_AZURE_FILE_TIME_INTERVAL

This is the time interval after which the audits will be pushed to the Azure ADLS/Blob destination.

3600s

AUDIT_FLUENTD_AUTH_TYPE

Set the authentication type, for example, SAS Key, OAUTH, or MSI (UserManaged).

AUDIT_FLUENTD_AZURE_STORAGE_ACCOUNT_KEY

AUDIT_FLUENTD_AZURE_STORAGE_SAS_TOKEN

Configure this property, if you have selected SAS Key in the property, AUDIT_FLUENTD_AUTH_TYPE.

Set the storage account key and the SAS token, if you set the audit destination above to Azure Blob.

Leave unchanged, if you're using Azure's Managed Identity Service.

AUDIT_FLUENTD_AZURE_OAUTH_TENANT_ID

AUDIT_FLUENTD_AZURE_OAUTH_APP_ID

AUDIT_FLUENTD_AZURE_OAUTH_SECRET

Set the OAuth tenant ID, application ID, and secret, if you set the audit destination above to Azure ADLS.

Configure this property, if you have selected OAUTH in the property, AUDIT_FLUENTD_AUTH_TYPE.

Leave unchanged, if you're using Azure's Managed Identity Service.

AUDIT_FLUENTD_AZURE_USER_MANAGED_IDENTITY_ENABLE

AUDIT_FLUENTD_AZURE_USER_MANAGED_IDENTITY

Configure this property, if you have selected MSI (UserManaged) in the property, AUDIT_FLUENTD_AUTH_TYPE.
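For example, a minimal vars.audit-fluentd.yml for an S3 destination might look like this (illustrative values; the access and secret keys are omitted here on the assumption that an AWS IAM instance role is used):

  AUDIT_FLUENTD_AUDIT_DESTINATION: "s3"
  AUDIT_FLUENTD_S3_BUCKET: "bucket_1"
  AUDIT_FLUENTD_S3_REGION: "us-east-1"
  AUDIT_FLUENTD_S3_FILE_TIME_INTERVAL: "3600s"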


Grafana
How to configure Grafana with Privacera

Privacera allows you to use Grafana as a metric and monitoring system. Grafana dashboards are pre-built in Privacera for services such as Dataserver, PolicySync, and Usersync to monitor the health of the services. Grafana uses the time-series data from the Privacera services and turns it into graphs and visualizations.

Grafana queries Graphite to pull the time-series data and creates charts and graphs based on this data.

Supported services

The following services are supported on Grafana:

  • Dataserver

  • PolicySync

  • Usersync

Configuration steps

  1. To enable Grafana, run the following commands. This will enable both Grafana and Graphite.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.grafana.yml config/custom-vars/
  2. Run the update.

    cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update

Note

After configuring Grafana, if the data does not appear on the dashboard, see Grafana service.

Ranger Tagsync

This topic shows how you can configure Ranger TagSync to synchronize the Ranger tag store with Atlas.

Configuration

  1. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.ranger-tagsync.yml config/custom-vars/
    vi config/custom-vars/vars.ranger-tagsync.yml
  2. Edit the following properties.

    Property

    Description

    Example

    RANGER_TAGSYNC_ENABLE

    Property to enable/disable the Ranger TagSync.

    true

    TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_BOOTSTRAP_SERVERS

    Kafka bootstrap server where Atlas publishes the entities. Tagsync listens and pushes the mapping of Atlas entities and tags to Ranger.

    kafka:9092

    TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_ZOOKEEPER_CONNECT

    Zookeeper URL for Kafka.

    zoo-1:2181

    TAGSYNC_ATLAS_CLUSTER_NAME

    Atlas cluster name.

    privacera

    TAGSYNC_TAGSYNC_ATLAS_TO_RANGER_SERVICE_MAPPING

    (Optional) To map from Atlas Hive cluster-name to Ranger service-name, the following format is used:

    clusterName,componentType,serviceName;clusterName2,componentType2,serviceName2

    Note: There are no spaces in the above format.

    For Hive, the notifications from Atlas include the name of the entities in the following format:

    dbName@clusterName, dbName.tblName@clusterName, dbName.tblName.colName@clusterName

    Ranger Tagsync needs to derive the name of the Hive service (in Ranger) from the above entity names. By default, Ranger computes the Hive service name as clusterName + "_hive".

    If the name of the Hive service (in Ranger) is different in your environment, use the following property to enable Ranger Tagsync to derive the correct Hive service name.

    TAGSYNC_ATLAS_TO_RANGER_SERVICE_MAPPING = clusterName,hive,rangerServiceName

    {{TAGSYNC_ATLAS_CLUSTER_NAME}},hive,privacera_hive;{{TAGSYNC_ATLAS_CLUSTER_NAME}},s3,privacera_s3

    TAGSYNC_TAGSYNC_ATLAS_DEFAULT_CLUSTER_NAME

    (Optional) Default cluster name configured for Atlas.

    {{TAGSYNC_ATLAS_CLUSTER_NAME}}

    TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_ENTITIES_GROUP_ID

    (Optional) Consumer Group Name to be used to consume Kafka events.

    privacera_ranger_entities_consumer

    Note

    You can also add custom properties that are not included by default. See Ranger TagSync.
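    A minimal vars.ranger-tagsync.yml based on the examples above might look like this (illustrative values):

    RANGER_TAGSYNC_ENABLE: "true"
    TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_BOOTSTRAP_SERVERS: "kafka:9092"
    TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_ZOOKEEPER_CONNECT: "zoo-1:2181"
    TAGSYNC_ATLAS_CLUSTER_NAME: "privacera"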

  3. Run the following command.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update

Discovery

Discovery in Kubernetes

This section provides setup instructions for Privacera Discovery for a Kubernetes based deployment.

Prerequisites

Ensure the following prerequisites are met:

  • Privacera services must be deployed using Kubernetes.

  • Embedded Spark must be used.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.discovery.kubernetes.yml config/custom-vars/
    vi config/custom-vars/vars.discovery.kubernetes.yml
    
  3. Set value for the following. For property details and description, refer to the Configuration Properties below.

    DISCOVERY_K8S_SPARK_MASTER: "${PLEASE_CHANGE}"
Configuration properties

To get the value of the variable, do the following:

  1. Get the URL of the Kubernetes control plane by executing the kubectl cluster-info command.

  2. Copy the Kubernetes control plane URL and paste it as the value of DISCOVERY_K8S_SPARK_MASTER.

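    For example (the endpoint shown is hypothetical):

    kubectl cluster-info
    # Kubernetes control plane is running at https://<your-cluster-endpoint>
    # Paste that URL as the value, for example:
    # DISCOVERY_K8S_SPARK_MASTER: "https://<your-cluster-endpoint>"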
Discovery on Databricks
Discovery on Databricks

This topic covers the installation of Privacera Discovery on Databricks.

Configuration
  1. SSH to the instance as USER.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.discovery.databricks.yml config/custom-vars/
    vi config/custom-vars/vars.discovery.databricks.yml
    
  3. Add the following details to the config/custom-vars/vars.discovery.databricks.yml file if the Databricks plugin is not enabled. To configure the Databricks plugin, see Configuration in Databricks Spark Fine-Grained Access Control Plugin (FGAC) (Python, SQL).

    DATABRICKS_HOST_URL: "<PLEASE_UPDATE>"
    DATABRICKS_TOKEN: "<PLEASE_UPDATE>"
    
    DATABRICKS_WORKSPACES_LIST:
      - alias: DEFAULT
        databricks_host_url: "{{DATABRICKS_HOST_URL}}"
        token: "{{DATABRICKS_TOKEN}}"
    
  4. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    AWS

    DATABRICKS_DRIVER_INSTANCE_TYPE: "m5.xlarge"
    DATABRICKS_INSTANCE_TYPE: "m5.xlarge"
    DATABRICKS_DISCOVERY_MANAGE_INIT_SCRIPT: "true"
    DATABRICKS_DISCOVERY_SPARK_VERSION: "7.3.x-scala2.12"
    DATABRICKS_DISCOVERY_INSTANCE_PROFILE: "arn:aws:iam::<ACCOUNT_ID>:instance-profile/<DATABRICKS_CLUSTER_IAM_ROLE>"
    DISCOVERY_AWS_CLOUD_ASSUME_ROLE: "true"
    DISCOVERY_AWS_CLOUD_ASSUME_ROLE_ARN: "arn:aws:iam::<ACCOUNT_ID>:role/<DISCOVERY_IAM_ROLE>"
    

    Azure

    DATABRICKS_DRIVER_INSTANCE_TYPE: "Standard_DS3_v2"
    DATABRICKS_INSTANCE_TYPE: "Standard_DS3_v2"
    DATABRICKS_DISCOVERY_MANAGE_INIT_SCRIPT: "true"
    DATABRICKS_DISCOVERY_SPARK_VERSION: "7.3.x-scala2.12"

Note

PRIVACERA_DISCOVERY_DATABRICKS_DOWNLOAD_URL is no longer in use. The Discovery Databricks packages will be downloaded from PRIVACERA_BASE_DOWNLOAD_URL.

Configuration properties

Property

Description

Example

DATABRICKS_DRIVER_INSTANCE_TYPE

For AWS, the driver instance type can be "m5.xlarge" or "m5.2xlarge".

For Azure, the driver instance type can be "Standard_DS3_v2".

m5.xlarge

DATABRICKS_INSTANCE_TYPE

For AWS, the instance type can be "m5.xlarge" or "m5.2xlarge".

For Azure, the instance type can be "Standard_DS3_v2".

m5.xlarge

SETUP_DATABRICKS_JAR

USE_DATABRICKS_SPARK

DATABRICKS_ELASTIC_DISK

DATABRICKS_DISCOVERY_MANAGE_INIT_SCRIPT

Set to true if you want to create databricks init script.

false

DATABRICKS_DISCOVERY_WORKERS

DATABRICKS_DISCOVERY_JOB_NAME

DATABRICKS_DISCOVERY_SPARK_VERSION

Spark version can be as follows:

  • 6.4.x-scala2.11 (Spark 2.4)

  • 7.3.x-scala2.12 (Spark 3.0)

  • 7.4.x-scala2.12 (Spark 3.0)

  • 7.5.x-scala2.12 (Spark 3.0)

  • 7.6.x-scala2.12 (Spark 3.0)

7.3.x-scala2.12

DATABRICKS_DISCOVERY_INSTANCE_PROFILE

This property sets the instance profile (IAM instance role) for the Databricks cluster nodes where Discovery will be running.

arn:aws:iam::1234564835:instance-profile/privacera_databricks_cluster_iam_role

DISCOVERY_AWS_CLOUD_ASSUME_ROLE

Property to grant Discovery access to AWS services to perform the scanning operation.

true

DISCOVERY_AWS_CLOUD_ASSUME_ROLE_ARN

ARN of the AWS IAM Role

arn:aws:iam::12345671758:role/DiscoveryCrossAccAssumeRole_k

Discovery in AWS
Discovery

This topic allows you to set up the AWS configuration for installing Privacera Discovery in a Docker and Kubernetes (EKS) environment.

IAM policies

To use the Privacera Discovery service, ensure the following IAM policies are attached to the Privacera_PM_Role role to access the AWS services.

Policy to create AWS resources

Policy to create AWS resources is required only during installation or when Discovery is updated through Privacera Manager. This policy gives permissions to Privacera Manager to create AWS resources like DynamoDB, Kinesis, SQS, and S3 using terraform.

  • ${AWS_REGION}: AWS region where the resources will get created.

     {
    "Version":"2012-10-17",
    "Statement":[
        {
            "Sid":"CreateDynamodb",
            "Effect":"Allow",
            "Action":[
                "dynamodb:CreateTable",
                "dynamodb:DescribeTable",
                "dynamodb:ListTables",
                "dynamodb:TagResource",
                "dynamodb:UntagResource",
                "dynamodb:UpdateTable",
                "dynamodb:UpdateTableReplicaAutoScaling",
                "dynamodb:UpdateTimeToLive",
                "dynamodb:DescribeTimeToLive",
                "dynamodb:ListTagsOfResource",
                "dynamodb:DescribeContinuousBackups"
            ],
            "Resource":"arn:aws:dynamodb:${AWS_REGION}:*:table/privacera*"
        },
        {
            "Sid":"CreateKinesis",
            "Effect":"Allow",
            "Action":[
                "kinesis:CreateStream",
                "kinesis:ListStreams",
                "kinesis:UpdateShardCount"
            ],
            "Resource":"arn:aws:kinesis:${AWS_REGION}:*:stream/privacera*"
        },
        {
            "Sid":"CreateS3Bucket",
            "Effect":"Allow",
            "Action":[
                "s3:CreateBucket",
                "s3:ListAllMyBuckets",
                "s3:GetBucketLocation"
                
            ],
            "Resource":[
                "arn:aws:s3:::*"
            ]
        },
        {
            "Sid":"CreateSQSMessages",
            "Effect":"Allow",
            "Action":[
                "sqs:CreateQueue",
                "sqs:ListQueues"
            ],
            "Resource":[
                "arn:aws:sqs:${AWS_REGION}:${ACCOUNNT_ID}:privacera*"
            ]
        }
    ]
    }
 
CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Configure your environment.

    • Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)

    • For a Docker environment, you can skip this step.

  3. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.discovery.aws.yml config/custom-vars/
    vi config/custom-vars/vars.discovery.aws.yml
    
  4. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    DISCOVERY_BUCKET_NAME: "<PLEASE_CHANGE>"
    

    To configure a bucket, add the property as follows, where bucket-1 is the name of the bucket:

    DISCOVERY_BUCKET_NAME: "bucket-1"
    

    To configure a bucket containing a folder, add the property as follows:

    DISCOVERY_BUCKET_NAME: "bucket-1/folder1"
    
  5. Uncomment/Add the following variable to enable Autoscalability of Executor pods:

    DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_ENABLED: "true"
    
  6. (Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.

    For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.

  7. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

Property

Description

Example

DISCOVERY_BUCKET_NAME

Set the bucket name where Discovery will store its metadata files

container1

Properties of Topic and Table names

Topic and Table names are assigned by default in Privacera Discovery. To customize any topic or table name, refer to the link.

Enable realtime scan

An AWS SQS queue is required, if you want to enable realtime scan on the S3 bucket.

After running the PM update command, an SQS queue will be created for you automatically with the name, privacera_bucket_sqs_{{DEPLOYMENT_ENV_NAME}}, where {{DEPLOYMENT_ENV_NAME}} is the environment name you set in the vars.privacera.yml file. This queue name will appear in the list of queues of your AWS SQS account.

If you have an SQS queue which you want to use, add the DISCOVERY_BUCKET_SQS_NAME property in the vars.discovery.aws.yml file and assign your SQS queue name.

If you want to enable realtime scan on the bucket, click here.
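For example, to reuse an existing queue, you might add the following to vars.discovery.aws.yml (the queue name is a placeholder):

  DISCOVERY_BUCKET_SQS_NAME: "<your-existing-sqs-queue-name>"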

Discovery in Azure
Azure Discovery

This topic allows you to set up the Azure configuration for installing Privacera Discovery.

Prerequisites

Ensure the following prerequisites are met:

Azure storage account

Azure Cosmos DB account

  • Create an Azure Cosmos DB account. For more information, refer to Microsoft's documentation on Cosmos DB.

  • Get the URI from the Overview section.

  • Get the Primary Key from the Settings > Keys section.

  • Set the consistency to Strong in the Settings > Default Consistency section.

For Terraform

  • Assign permissions to create Azure resources using managed-identity. For more information, refer to Create Azure Resources .

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Configure your environment.

    • Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)

    • For a Docker environment, you can skip this step.

  3. Run the following commands.

    cd ~/privacera/privacera-manager  
    cp config/sample-vars/vars.kafka.yml config/custom-vars
    vi config/custom-vars/vars.kafka.yml
    
  4. Run the following commands.

    cd ~/privacera/privacera-manager  
    cp config/sample-vars/vars.discovery.azure.yml config/custom-vars
    vi config/custom-vars/vars.discovery.azure.yml
    
  5. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    DISCOVERY_FS_PREFIX: "<PLEASE_CHANGE>"
    DISCOVERY_AZURE_STORAGE_ACCOUNT_NAME: "<PLEASE_CHANGE>"
    DISCOVERY_COSMOSDB_URL: "<PLEASE_CHANGE>"
    DISCOVERY_COSMOSDB_KEY: "<PLEASE_CHANGE>"
    DISCOVERY_AZURE_STORAGE_ACCOUNT_KEY: "<PLEASE_CHANGE>"
    CREATE_AZURE_RESOURCES: "false"
    DISCOVERY_AZURE_RESOURCE_GROUP: "<PLEASE_CHANGE>"
    DISCOVERY_AZURE_COSMOS_DB_ACCOUNT: "<PLEASE_CHANGE>"
    DISCOVERY_AZURE_LOCATION: "<PLEASE_CHANGE>"
    
  6. (Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.

    For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.

  7. To configure real-time scan for audits, refer to Pkafka.

  8. Run the following commands.

    cd ~/privacera/privacera-manager  
    ./privacera-manager.sh update
    
Configuration properties

Property

Description

Example

DISCOVERY_ENABLE

In the Basic tab, enable/disable Privacera Discovery.

DISCOVERY_REALTIME_ENABLE

In the Basic tab, enable/disable real-time scan in Privacera Discovery.

For real-time scan to work, ensure the following:

  • If you want to scan the default ADLS app registered by the system at the time of installation, keep its app properties unchanged in Privacera Portal.

  • If you want to scan a user-registered app, the app properties in Privacera Portal and its corresponding discovery.yml should be the same.

  • At a time, only one app can be scanned.

DISCOVERY_FS_PREFIX

Enter the container name. Get it from the Prerequisites section.

container1

DISCOVERY_AZURE_STORAGE_ACCOUNT_NAME

Enter the name of the Azure Storage account. Get it from the Prerequisites section.

azurestorage

DISCOVERY_COSMOSDB_URL

DISCOVERY_COSMOSDB_KEY

Enter the Cosmos DB URL and Primary Key. Get it from the Prerequisites section.

DISCOVERY_COSMOSDB_URL: "https://url1.documents.azure.com:443/"

DISCOVERY_COSMOSDB_KEY: "xavosdocof"

DISCOVERY_AZURE_STORAGE_ACCOUNT_KEY

Enter the Access Key of the storage account. Get it from the Prerequisites section.

GMi0xftgifp==

[Properties of Topic and Table names](../pm-ig/customize_topic_and_tables_names.md)

Topic and Table names are assigned by default in Privacera Discovery. To customize any topic or table name, refer to the link.

PKAFKA_EVENT_HUB

In the **Advanced > Pkafka Configuration** section, enter the Event Hub name. Get it from the Prerequisites section.

eventhub1

PKAFKA_EVENT_HUB_NAMESPACE

In the **Advanced > Pkafka Configuration** section, enter the name of the Event Hub namespace. Get it from the Prerequisites section.

eventhubnamespace1

PKAFKA_EVENT_HUB_CONSUMER_GROUP

In the **Advanced > Pkafka Configuration** section, enter the name of the Consumer Group. Get it from the Prerequisites section.

congroup1

PKAFKA_EVENT_HUB_CONNECTION_STRING

In the **Advanced > Pkafka Configuration** section, enter the connection string. Get it from the Prerequisites section.

Endpoint=sb://eventhub1.servicebus.windows.net/;

SharedAccessKeyName=RootManageSharedAccessKey;

SharedAccessKey=sAmPLEP/8PytEsT=

CREATE_AZURE_RESOURCES

For Terraform usage, set the value to true. Its default value is false.

true

DISCOVERY_AZURE_RESOURCE_GROUP

Get the value from the Prerequisite section.

resource1

DISCOVERY_AZURE_COSMOS_DB_ACCOUNT

Get the value from the Prerequisite section.

database1

Discovery in GCP
Discovery

This topic allows you to set up the GCP configuration for installing Privacera Discovery in a Docker and Kubernetes environment.

Prerequisites

Ensure the following prerequisites are met:

  • Create a service account and add the following roles. For more information, refer to Creating a new service account.

    • Editor

    • Owner

    • Private Logs Viewer

    • Kubernetes Engine Admin (Required only for a Kubernetes environment)

  • Create a Bigtable instance and get the Bigtable Instance ID. For more information, refer to Creating a Cloud Bigtable instance.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Configure your environment.

    • Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)

    • For a Docker environment, you can skip this step.

  3. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.discovery.gcp.yml config/custom-vars/
    vi config/custom-vars/vars.discovery.gcp.yml
    
  4. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    BIGTABLE_INSTANCE_ID: "<PLEASE_CHANGE>"
    DISCOVERY_BUCKET_NAME: "<PLEASE_CHANGE>"
    
  5. (Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.

    For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.

  6. For real-time scanning, run the following.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.pkafka.gcp.yml config/custom-vars/
    

    Note

    • Recommended: Use the Google Sink-based approach to enable real-time scan of applications in different projects; click here.

    • Optional: Use the Google Logging API-based approach to enable real-time scan of applications in different projects; click here.

  7. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

Property

Description

Example

BIGTABLE_INSTANCE_ID

Get the value by navigating to **Navigation Menu->Databases->BigTable->Check the instance id column**.

BIGTABLE_INSTANCE_ID: "table_1"

DISCOVERY_BUCKET_NAME

Give a name for the bucket where Discovery will store its metadata files.

DISCOVERY_BUCKET_NAME: "bucket_1"

Pkafka

This topic allows you to enable Pkafka for real-time audits in Privacera Discovery.

Prerequisites

Ensure the following prerequisites are met:

  • Create an Event Hub namespace in the same region as the Storage Account you want to monitor. For more information, refer to Microsoft's documentation Create an Event Hubs namespace.

  • Create Event Hub in the Event Hub namespace. For more information, refer to Microsoft's documentation Create an event hub.

  • Create a consumer group in the Event Hub.

    Azure Portal > Event Hubs namespace > Event Hub > Consumer Groups > +Consumer Group. The Consumer Groups tab will be under Entities of the Event Hub page.

  • Get the connection string of the Event Hubs namespace. For more information, refer to Microsoft's documentation Get connection string from the portal.

  • Create an Event Subscription for the Event Hubs namespace with the Event Type as Blob Created and Blob Deleted. For more information, refer to Microsoft's documentation Create an Event Grid subscription.

    Note

    When you create an event grid subscription, clear the checkbox Enable subject filtering.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.pkafka.azure.yml config/custom-vars/
    vi config/custom-vars/vars.pkafka.azure.yml
  3. Edit the following properties. For property details and description, refer to the Configuration Properties below; a filled-in sketch follows these steps.

    PKAFKA_EVENT_HUB: "<PLEASE_CHANGE>"
    PKAFKA_EVENT_HUB_NAMESPACE: "<PLEASE_CHANGE>"
    PKAFKA_EVENT_HUB_CONSUMER_GROUP: "<PLEASE_CHANGE>"
    PKAFKA_EVENT_HUB_CONNECTION_STRING: "<PLEASE_CHANGE>"
    DISCOVERY_REALTIME_ENABLE: "true"
  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
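For reference, a filled-in sketch of vars.pkafka.azure.yml, using the example values from the table below; the connection string shown is a placeholder:

    PKAFKA_EVENT_HUB: "eventhub1"
    PKAFKA_EVENT_HUB_NAMESPACE: "eventhubnamespace1"
    PKAFKA_EVENT_HUB_CONSUMER_GROUP: "congroup1"
    PKAFKA_EVENT_HUB_CONNECTION_STRING: "Endpoint=sb://eventhub1.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=sAmPLEP/8PytEsT="
    DISCOVERY_REALTIME_ENABLE: "true"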
Configuration properties

Property

Description

Example

PKAFKA_EVENT_HUB

Enter the Event Hub name. Get it from the Prerequisites section above.

eventhub1

PKAFKA_EVENT_HUB_NAMESPACE

Enter the name of the Event Hub namespace. Get it from the Prerequisites section above.

eventhubnamespace1

PKAFKA_EVENT_HUB_CONSUMER_GROUP

Enter the name of the Consumer Group. Get it from the Prerequisites section above.

congroup1

PKAFKA_EVENT_HUB_CONNECTION_STRING

Enter the connection string. Get it from the Prerequisites section above.

Endpoint=sb://eventhub1.servicebus.windows.net/;

SharedAccessKeyName=RootManageSharedAccessKey;

SharedAccessKey=sAmPLEP/8PytEsT=

DISCOVERY_REALTIME_ENABLE

Add this property to enable/disable real-time scan. By default, it is set to false.

Note: This is a custom property, and has to be added separately to the YAML file.

For real-time scan to work, ensure the following:

  • If you want to scan the default ADLS app registered by the system at the time of installation, keep its app properties unchanged in Privacera Portal.

  • If you want to scan a user-registered app, the app properties in Privacera Portal and its corresponding discovery.yml should be the same.

  • At a time, only one app can be scanned.

true

Encryption & Masking

Privacera Encryption Gateway (PEG) and Cryptography with Ranger KMS

This topic covers how you can set up and use Privacera Cryptography and Privacera Encryption Gateway (PEG) using Ranger KMS.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Create a 'crypto' configuration file, and set the value of the Ranger KMS Master Key Password.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.crypto.yml config/custom-vars/
    vi config/custom-vars/vars.crypto.yml

    Assign a password to the RANGER_KMS_MASTER_KEY_PASSWORD such as "Str0ngP@ssw0rd".

    RANGER_KMS_MASTER_KEY_PASSWORD: "<PLEASE_CHANGE>"
  3. Run the following command.

    cp config/sample-vars/vars.peg.yml config/custom-vars/
  4. (Optional) If you want to customize PEG configuration further, you can add custom PEG properties. For more information, refer to PEG Custom Properties.

    For example, by default, the username and password for the PEG service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.

  5. Run Privacera Manager to update the Privacera Platform configuration:

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update

    If this is a Kubernetes deployment, update all Privacera services:

    ./privacera-manager.sh update
AWS S3 bucket encryption

You can set up server-side encryption for AWS S3 bucket to encrypt the resources in the bucket. Supported encryption types are Amazon S3 (SSE-S3), AWS Key Management Service (SSE-KMS), and Customer-Provided Keys (SSE-C). Encryption key is mandatory for the encryption type SSE-C and optional for SSE-KMS. No encryption key is required for SSE-S3. For more information, see Protecting data using server-side encryption in the AWS documentation.

Configure bucket encryption in dataserver
  1. SSH to EC2 instance where Privacera Dataserver is installed.

  2. Enable use of bucket encryption configuration in Privacera Dataserver.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.dataserver.aws.yml config/custom-vars/
    vi config/custom-vars/vars.dataserver.aws.yml
    

    Add the new property.

    DATA_SERVER_AWS_S3_ENCRYPTION_ENABLE: "true"
    DATA_SERVER_AWS_S3_ENCRYPTION_MAPPING:
      - "bucketA|<encryption-type>|<base64encodedssekey>"
      - "bucketB*,BucketC|<encryption-type>|<base64encodedssekey>"
    

    Property

    Description

    DATA_SERVER_AWS_S3_ENCRYPTION_ENABLE

    Property to enable or disable the AWS S3 bucket encryption support.

    DATA_SERVER_AWS_S3_ENCRYPTION_MAPPING

    Property to set the mapping of S3 buckets, encryption SSE type, and SSE key (base64-encoded). For example, "bucketC*,BucketD|SSE-KMS|<base64 encoded sse key>".

    The base64-encoded encryption key must be set in the following cases: 1) the encryption type is SSE-KMS and customer-managed CMKs are used for encryption; 2) the encryption type is SSE-C.
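Building on the mapping format shown above, a filled-in sketch; the bucket names are hypothetical, the key values are placeholders, and the SSE-C type literal mirrors the SSE-KMS example above:

    DATA_SERVER_AWS_S3_ENCRYPTION_ENABLE: "true"
    DATA_SERVER_AWS_S3_ENCRYPTION_MAPPING:
      - "finance-bucket*,hr-bucket|SSE-KMS|<base64-encoded-sse-key>"
      - "secure-archive|SSE-C|<base64-encoded-sse-key>"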

Server-Side encryption with Amazon S3-Managed Keys (SSE-S3)

Supported S3 APIs for SSE-S3 Encryption:

  • PUT Object

  • PUT Object - Copy

  • POST Object

  • Initiate Multipart Upload

Bucket policy
{"Version":"2012-10-17","Id":"PutObjectPolicy","Statement":[{"Sid":"DenyIncorrectEncryptionHeader","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-s3-encrypted-bucket}}/*","Condition":{"StringNotEquals":{"s3:x-amz-server-side-encryption":"AES256"}}},{"Sid":"DenyUnencryptedObjectUploads","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-s3-encrypted-bucket}}/*","Condition":{"Null":{"s3:x-amz-server-side-encryption":"true"}}}]}
  • Upload a test file.

    aws s3 cp myfile.txt s3://{{sse-s3-encrypted-bucket}}/
    
Server-Side encryption with CMKs stored in AWS Key Management Service (SSE-KMS)

Supported APIs for SSE-KMS Encryption:

  • PUT Object

  • PUT Object - Copy

  • POST Object

  • Initiate Multipart Upload

Your IAM role should have kms:Decrypt permission when you upload or download an Amazon S3 object encrypted with an AWS KMS CMK. This is in addition to the kms:ReEncrypt, kms:GenerateDataKey, and kms:DescribeKey permissions.

AWS Managed CMKs (SSE-KMS)

Bucket Policy

{"Version":"2012-10-17","Id":"PutObjectPolicy","Statement":[{"Sid":"DenyIncorrectEncryptionHeader","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"StringNotEquals":{"s3:x-amz-server-side-encryption":"aws:kms"}}},{"Sid":"DenyUnencryptedObjectUploads","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"Null":{"s3:x-amz-server-side-encryption":"true"}}}]}
  • Upload a test file.

    aws s3 cp myfile.txt s3://{{sse-kms-encrypted-bucket}}/
    
Customer Managed CMKs (SSE-KMS)

Bucket Policy

{"Version":"2012-10-17","Id":"PutObjectPolicy","Statement":[{"Sid":"DenyIncorrectEncryptionHeader","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"StringNotEquals":{"s3:x-amz-server-side-encryption":"aws:kms"}}},{"Sid":"RequireKMSEncryption","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"StringNotLikeIfExists":{"s3:x-amz-server-side-encryption-aws-kms-key-id":"{{aws-kms-key}}"}}},{"Sid":"DenyUnencryptedObjectUploads","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"Null":{"s3:x-amz-server-side-encryption":"true"}}}]}
  • Upload a test file.

    aws s3 cp privacera_aws.sh s3://{{sse-kms-encrypted-bucket}}/
    
Server-Side encryption with Customer-Provided Keys (SSE-C)

Supported APIs for SSE-C Encryption:

  • PUT Object

  • PUT Object - Copy

  • POST Object

  • Initiate Multipart Upload

  • Upload Part

  • Upload Part - Copy

  • Complete Multipart Upload

  • Get Object

  • Head Object

  • Update the privacera_aws_config.json file with bucket and SSE-C encryption key.

    • Run AWS S3 upload.

      aws s3 cp myfile.txt s3://{{sse-c-encrypted-bucket}}/
      
    • Run head-object.

      aws s3api head-object --bucket {{sse-c-encrypted-bucket}} --key myfile.txt
      

Sample keys:

Key

Value

AES256-bit key

E1AC89EFB167B29ECC15FF75CC5C2C3A

Base64-encoded encryption key (sseKey)

echo -n "E1AC89EFB167B29ECC15FF75CC5C2C3A" | openssl enc -base64

Base64-encoded 128-bit MD5 digest of the encryption key

echo -n "E1AC89EFB167B29ECC15FF75CC5C2C3A" | openssl dgst -md5 -binary | openssl enc -base64

Ranger KMS
Integrate with Azure key vault

This topic shows how to configure the Ranger Key Management Service (KMS) with Azure Key Vault to enable the use of data encryption. The master key for the encryption is created within the KMS and stored in Azure Key Vault. This section describes how to set up the connection from Ranger KMS to Azure Key Vault so that the master key is stored in the Azure Key Vault instead of the Ranger database.

Note: You can manually migrate the Ranger KMS master key from the Ranger database to the Azure Key Vault. For more information, refer to Migrate Ranger KMS Master Key.

Prerequisites
CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.crypto.azurekeyvault.yml config/custom-vars/
    vi config/custom-vars/vars.crypto.azurekeyvault.yml
  3. Edit the following properties. For property details and description, refer to the Configuration Properties below; a filled-in sketch follows these steps.

    AZURE_KEYVAULT_SSL_ENABLED: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_CLIENT_ID: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_CLIENT_SECRET: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_CERT_FILE: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_CERTIFICATE_PASSWORD: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_MASTERKEY_NAME: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_MASTER_KEY_TYPE: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_ZONE_KEY_ENCRYPTION_ALGO: "<PLEASE_CHANGE>"
    AZURE_KEYVAULT_URL: "<PLEASE_CHANGE>"
  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
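For reference, a filled-in sketch of vars.crypto.azurekeyvault.yml, using the example values from the table below; the client ID, secret, certificate file, and vault URL are placeholders:

    AZURE_KEYVAULT_SSL_ENABLED: "true"
    AZURE_KEYVAULT_CLIENT_ID: "50fd7ca6-xxxx-xxxx-a13f-1xxxxxxxx"
    AZURE_KEYVAULT_CLIENT_SECRET: "<AzureKeyVaultPassword>"
    AZURE_KEYVAULT_CERT_FILE: "azure-key-vault.pem"
    AZURE_KEYVAULT_CERTIFICATE_PASSWORD: "certPass"
    AZURE_KEYVAULT_MASTERKEY_NAME: "RangerMasterKey"
    AZURE_KEYVAULT_MASTER_KEY_TYPE: "RSA"
    AZURE_KEYVAULT_ZONE_KEY_ENCRYPTION_ALGO: "RSA_OAEP"
    AZURE_KEYVAULT_URL: "https://keyvault.vault.azure.net/"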
Configuration properties

Property

Description

Example

AZURE_KEYVAULT_SSL_ENABLED

Activate Azure Key Vault.

true

AZURE_KEYVAULT_CLIENT_ID

Get the ID by following the Pre-requisites section above.

50fd7ca6-xxxx-xxxx-a13f-1xxxxxxxx

AZURE_KEYVAULT_CLIENT_SECRET

Get the client secret by following the Pre-requisites section above.

<AzureKeyVaultPassword>

AZURE_KEYVAULT_CERT_FILE

Get the file by following the Pre-requisites section above.

Ensure the file is copied in the config/ssl folder, and give it a name.

azure-key-vault.pem

AZURE_KEYVAULT_CERTIFICATE_PASSWORD

Get the value by following the Pre-requisites section above.

certPass

AZURE_KEYVAULT_MASTERKEY_NAME

Enter the name of the master key. A key with this name will be created in Azure Key Vault.

RangerMasterKey

AZURE_KEYVAULT_MASTER_KEY_TYPE

Enter a type of master key.

Values: RSA, RSA_HSM, EC, EC_HSM, OCT

RSA

AZURE_KEYVAULT_ZONE_KEY_ENCRYPTION_ALGO

Enter an encryption algorithm for the master key.

Values: RSA_OAEP, RSA_OAEP_256, RSA1_5

RSA_OAEP

AZURE_KEYVAULT_URL

Get the URL by following the Pre-requisites section above.

https://keyvault.vault.azure.net/

AuthZ / AuthN

LDAP / LDAP-S for Privacera portal access
LDAP / LDAP-S for Privacera Portal access

This configuration sequence configures the Privacera Portal to reference an external LDAP or LDAP over SSL directory for the purpose of Privacera Portal user login authentication.

Prerequisites

Before starting these steps, prepare the following. You need to configure various Privacera properties with these values, as detailed in Configuration.

Determine the following LDAP values:

  • The FQDN and protocol (http or https) of your LDAP server

  • Complete Bind DN

  • Bind DN password

  • Top-level search base

  • User search base

  • Group search base

  • Username attribute

  • DN attribute

To configure an SSL-enabled LDAP server, Privacera requires an SSL certificate. You have these alternatives:

  • Set the Privacera property PORTAL_LDAP_SSL_ENABLED: "true".

  • Allow Privacera Manager to download and create the certificate based on the LDAP server URL. Set the Privacera property PORTAL_LDAP_SSL_PM_GEN_TS: "true".

  • Manually configure a truststore on the Privacera server that contains the certificate of the LDAP server. Set the Privacera property PORTAL_LDAP_SSL_PM_GEN_TS: "false".

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the commands below.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.portal.ldaps.yml config/custom-vars/
    vi config/custom-vars/vars.portal.ldaps.yml
    
  3. Uncomment the properties and edit the configurations as required. For property details and description, refer to the Configuration Properties below; a filled-in sketch follows these steps.

    PORTAL_LDAP_ENABLE: "true"
    PORTAL_LDAP_URL: "<PLEASE_CHANGE>"
    PORTAL_LDAP_BIND_DN: "<PLEASE_CHANGE>"
    PORTAL_LDAP_BIND_PASSWORD: "<PLEASE_CHANGE>"
    PORTAL_LDAP_SEARCH_BASE: "<PLEASE_CHANGE>"
    PORTAL_LDAP_USER_SEARCH_BASE: "<PLEASE_CHANGE>"
    PORTAL_LDAP_GROUP_SEARCH_BASE: "<PLEASE_CHANGE>"
    PORTAL_LDAP_USERNAME_ATTRIBUTE: "<PLEASE_CHANGE>"
    PORTAL_LDAP_DN_ATTRIBUTE: "<PLEASE_CHANGE>"
    PORTAL_LDAP_BIND_ANONYMOUSLY: "false"
    PORTAL_LDAP_SSL_ENABLED: "true"
    PORTAL_LDAP_SSL_PM_GEN_TS: "true"
    
  4. Run Privacera Manager update.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
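For reference, a filled-in sketch of vars.portal.ldaps.yml, using the example values from the table below; the hostname, DNs, and password are placeholders:

    PORTAL_LDAP_ENABLE: "true"
    PORTAL_LDAP_URL: "xxx.example.com:983"
    PORTAL_LDAP_BIND_DN: "CN=Bind User,OU=example,DC=ad,DC=example,DC=com"
    PORTAL_LDAP_BIND_PASSWORD: "<bind-password>"
    PORTAL_LDAP_SEARCH_BASE: "ou=example,dc=ad,dc=example,dc=com"
    PORTAL_LDAP_USER_SEARCH_BASE: "ou=example,dc=ad,dc=example,dc=com"
    PORTAL_LDAP_GROUP_SEARCH_BASE: "OU=example_services,OU=example,DC=ad,DC=example,DC=com"
    PORTAL_LDAP_USERNAME_ATTRIBUTE: "sAMAccountName"
    PORTAL_LDAP_DN_ATTRIBUTE: "dc"
    PORTAL_LDAP_BIND_ANONYMOUSLY: "false"
    PORTAL_LDAP_SSL_ENABLED: "true"
    PORTAL_LDAP_SSL_PM_GEN_TS: "true"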
Configuration properties

Property

Description

Example

PORTAL_LDAP_URL

Add value as "LDAP_HOST: LDAP_PORT

xxx.example.com:983

PORTAL_LDAP_BIND_DN

CN=Bind User,OU=example,DC=ad,DC=example,DC=com

PORTAL_LDAP_BIND_PASSWORD

Add the password for LDAP

PORTAL_LDAP_SEARCH_BASE

ou=example,dc=ad,dc=example,dc=com

PORTAL_LDAP_USER_SEARCH_BASE

ou=example,dc=ad,dc=example,dc=com

PORTAL_LDAP_GROUP_SEARCH_BASE

OU=example_services,OU=example,DC=ad,DC=example,DC=com

PORTAL_LDAP_USERNAME_ATTRIBUTE

sAMAccountName

PORTAL_LDAP_DN_ATTRIBUTE

PORTAL_LDAP_DN_ATTRIBUTE: dc

PORTAL_LDAP_SSL_ENABLED

For SSL enabled LDAP server, set this value to true.

true

PORTAL_LDAP_SSL_PM_GEN_TS

Set this to true if you want Privacera Manager to generate the truststore for your ldaps server.

Set this to false if you want to manually provide the truststore certificate. To learn how to upload SSL certificates, [click here](../pm-ig/upload_custom_cert.md).

true

Map LDAP roles with the existing Privacera roles

You can map LDAP user roles to Privacera roles using Privacera LDAP Role Mapping. This allows you to use Privacera Portal access control with LDAP user roles.

  1. Log in to Privacera Portal using padmin user credentials or as a user with Privacera ROLE_SYSADMIN role.

  2. Go to Settings > System Configurations.

  3. Select Custom Properties checkbox.

  4. Click on Add Property and enter the new property, auth.ldap.enabled=true.

  5. Click Save.

  6. Go to Settings > LDAP Role Mapping.

  7. Add the appropriate role mappings.

  8. When you log back in with an LDAP user, you will see the new user role. This LDAP user login can be done after the LDAP setup with Privacera Manager is completed.

Portal SSO with AAD using SAML

Privacera supports SAML, which allows you to authenticate users using single sign-on (SSO) technology and provides a way to access Privacera services.

Using the Azure Active Directory (AAD) SAML Toolkit, you can set up single sign-on (SSO) in Privacera Manager for Active Directory users. After setting up the SSO, you will be provided with an SSO button on the login page of Privacera Portal.

Prerequisites

To configure SSO with Azure Active Directory, you need to configure and enable SSL for the Privacera Portal. See Enable CA Signed Certificates or Enable Self Signed Certificates.

Configuring SAML in Azure AD

The following steps describe how to configure SAML in Azure AD application:

  1. Log in to Azure portal.

  2. On the left navigation pane, select the Azure Active Directory service.

  3. Navigate to Enterprise Applications and then select All Applications.

  4. To add a new application, select New application.

    Note

    If you have an existing Azure AD SAML Toolkit application, select it, and then go to step 8 to continue with the rest of the configuration.

  5. In the Add from the gallery section, type Azure AD SAML Toolkit in the search box.

  6. Select Azure AD SAML Toolkit from the results panel and then add the app.

  7. On the Azure AD SAML Toolkit application integration page, find the Manage section and select Single sign-on.

  8. On the Select a single sign-on method page, select SAML.

  9. Click the pen icon for Basic SAML Configuration to edit the settings.

  10. On the Basic SAML Configuration page, enter the values for the following fields, and then click Save. You can assign a unique name for the Entity ID.

    • Entity ID = privacera-portal

    • Reply URL = https://${APP_HOSTNAME}:6868/saml/SSO

    • Sign-on URL = https://${APP_HOSTNAME}:6868/login.html

  11. In the SAML Signing Certificate section, find Federation Metadata XML and select Download to download the certificate and save it on your virtual machine.

  12. On the Set up Azure AD SAML Toolkit section, copy the Azure AD Identifier URL.

  13. In the Manage section, select Users and groups.

  14. In the Users and groups dialog, select the user or user group who should be allowed to log in with SSO, then click Select.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following command:

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
  3. Edit the vars.portal.saml.aad.yml file.

    vi config/custom-vars/vars.portal.saml.aad.yml

    Modify the SAML_ENTITY_ID. You need to assign the value of the Entity ID obtained in the above section. For property details and description, refer to the Configuration Properties below.

    SAML_ENTITY_ID: "privacera-portal"
    SAML_BASE_URL: "https://{{app_hostname}}:6868"
    PORTAL_UI_SSO_ENABLE: "true"
    PORTAL_UI_SSO_URL: "saml/login"
    PORTAL_UI_SSO_BUTTON_LABEL: "Azure AD Login"
    AAD_SSO_ENABLE: "true"
  4. Rename the downloaded Federation Metadata XML file as privacera-portal-aad-saml.xml. Copy this file to the ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom folder.

  5. Run the following command:

    cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update
  6. If you are configuring the SSL in an Azure Kubernetes environment, then run the following command.

     ./privacera-manager.sh restart portal
Configuration properties

Property

Description

Example

AAD_SSO_ENABLE

Enabled by default.

SAML_ENTITY_ID

Get the value from the Prerequisites section.

privacera-portal

SAML_BASE_URL

https://{{app_hostname}}:6868

PORTAL_UI_SSO_BUTTON_LABEL

Azure AD Login

PORTAL_UI_SSO_URL

saml/login

SAML_GLOBAL_LOGOUT

Enabled by default. When global logout for SAML is enabled, initiating a logout terminates all sessions you've accessed from the browser at the Identity Provider (IdP).

META_DATA_XML

Browse and select the Federation Metadata XML, which you downloaded in the Prerequisites section.

Validation

Go to the login page of the Privacera Portal. You will see the Azure AD Login button.

Configure SAML assertion attributes

By default, the following assertion attributes are configured with pre-defined values:

  • Email

  • Username

  • Firstname

  • Lastname

You can customize the values for the assertion attributes. To do that, do the following:

  1. Run the following commands.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.portal.yml config/custom-vars/
    vi config/custom-vars/vars.portal.yml
  2. Add the following properties and assign your values; see the example mapping after these steps. For more information on custom properties and their values, click here.

    SAML_EMAIL_ATTRIBUTE: ""
    SAML_USERNAME_ATTRIBUTE: ""
    SAML_LASTNAME_ATTRIBUTE: ""
    SAML_FIRSTNAME_ATTRIBUTE: ""
  3. Run Privacera Manager update to apply the properties.

     cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update
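For example, one possible mapping, mirroring the attribute names used for PingFederate later in this guide; your IdP's claim names may differ:

    SAML_EMAIL_ATTRIBUTE: "user.email"
    SAML_USERNAME_ATTRIBUTE: "user.login"
    SAML_LASTNAME_ATTRIBUTE: "user.lastName"
    SAML_FIRSTNAME_ATTRIBUTE: "user.firstName"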
Portal SSO with Okta using SAML

Okta is a third-party identity provider, offering single sign-on (SSO) authentication and identity validation services for a large number of Software-as-a-Service providers. PrivaceraCloud works with Okta's SAML (Security Assertion Markup Language) interface to provide an SSO/Okta login authentication to the Privacera portal. For more information, see CLI configuration.

Integration with Okta begins with configuration steps in the Okta administrator console. These steps also generate a Privacera portal account-specific identity_provider_metadata.xml file and an Identity Provider URL that are used in the Privacera CLI configuration steps.

Prerequisites

To configure SSO with Okta, you need to configure and enable SSL for the Privacera Portal. See Enable CA Signed Certificates or Enable Self Signed Certificates.

Note

To use Okta SSO with Privacera portal, you must have already established an Okta SSO service account. The following procedures require Okta SSO administrative login credentials.

Generate an Okta Identity Provider Metadata File and URL
  1. Log in to your Okta account as the Okta SSO account administrator.

  2. Select Applications from the left navigation panel, then click Applications subcategory.

  3. From the Applications page, click Create App Integration.

    Note

    In addition to creating new applications you can also edit existing apps with new configuration values.

  4. Select SAML 2.0, then click Next.

  5. In General Settings, provide a short descriptive app name in the App name text box. For example, enter Privacera Portal SAML.

  6. Click Next.

  7. In the SAML Settings configuration page, enter the values as shown in the following table:

    Field

    Value

    Single sign on URL

    https://<portal_hostname>:6868/saml/SSO

    Audience URI (SP Entity ID)

    privacera-portal

    Default RelayState

    The value identifies a specific application resource in an IDP initiated SSO scenario. In most cases this field will be left blank.

    Name ID format

    Unspecified

    Application username

    Okta username

    UserID

    user.login

    Email

    user.email

    Firstname

    user.firstName

    LastName

    user.lastName

    Note

    If the user's login ID is not the same as the username, for example if the login ID is an email address, this attribute is treated as the username in the portal. The username value is the email with the domain name (for example, @gmail.com) removed; for "john.joe@company.com", the username would be "john.joe". If there is another attribute that can be used as the username, this value will hold that attribute.

  8. Click Next.

  9. Select the Feedback tab and click I'm an Okta customer adding an internal app.

  10. Click Finish.

  11. From the General tab, scroll down to the App Embed Link section. Copy the Embed Link (Identity Provider URL) for PrivaceraCloud.

IdP provider metadata

In this topic, you will learn how to generate and save IdP provider metadata in XML format.

  1. Go to the Sign On tab. Under Settings, select the Identity Provider Metadata link located at the bottom of the Sign on methods area. The configuration file will open in a separate window.

  2. In the SAML Signing Certificates section, click the Generate new certificate button.

  3. In the list, click the Actions dropdown and select View IdP metadata.

    The XML file will be opened in a new tab.

    Note

    Make sure that the certificate you are downloading has an active status.

  4. Save the file in XML format.

IdP-initiated SSO
  1. From Applications, log in to the Okta Home Page Dashboard as a user by selecting the Okta Dashboard icon.

  2. Log in to the Privacera Portal by selecting the newly added app icon.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following command:

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
  3. Edit the vars.portal.saml.aad.yml file.

    vi config/custom-vars/vars.portal.saml.aad.yml

    Modify the SAML_ENTITY_ID. You need to assign the value of the Entity ID obtained in the above section. For property details and description, refer to the Configuration Properties below.

    SAML_ENTITY_ID: "privacera-portal"
    SAML_BASE_URL: "https://{{app_hostname}}:6868"
    PORTAL_UI_SSO_ENABLE: "true"
    PORTAL_UI_SSO_URL: "saml/login"
    PORTAL_UI_SSO_BUTTON_LABEL: "Azure AD Login"
    AAD_SSO_ENABLE: "true"
  4. Rename the downloaded Identity Provider Metadata XML file to privacera-portal-aad-saml.xml. Copy this file to the ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom folder.

  5. Run the following command:

    cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update
  6. If you are configuring the SSL in an Azure Kubernetes environment, then run the following command.

     ./privacera-manager.sh restart portal
Configuration properties

Property

Description

Example

AAD_SSO_ENABLE

Enabled by default.

SAML_ENTITY_ID

Get the value from the Prerequisites section.

privacera-portal

SAML_BASE_URL

https://{{app_hostname}}:6868

PORTAL_UI_SSO_BUTTON_LABEL

Okta Login

PORTAL_UI_SSO_URL

saml/login

SAML_GLOBAL_LOGOUT

Enabled by default. When global logout for SAML is enabled, initiating a logout terminates all sessions you've accessed from the browser at the Identity Provider (IdP).

META_DATA_XML

Browse and select the Identity Provider Metadata XML, which you downloaded in the IdP provider metadata section.

Validation

Go to the login page of the Privacera Portal. You will see the Okta Login button.

Configure SAML assertion attributes

By default, the following assertion attributes are configured with pre-defined values:

  • Email

  • Username

  • Firstname

  • Lastname

You can customize the values for the assertion attributes. To do that, do the following:

  1. Run the following commands.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.portal.yml config/custom-vars/
    vi config/custom-vars/vars.portal.yml
  2. Add the following properties and assign your values. For more information on custom properties and their values, click here.

    SAML_EMAIL_ATTRIBUTE: ""
    SAML_USERNAME_ATTRIBUTE: ""
    SAML_LASTNAME_ATTRIBUTE: ""
    SAML_FIRSTNAME_ATTRIBUTE: ""
  3. Run Privacera Manager update to apply the properties.

     cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update
Portal SSO with Okta using OAuth

This topic covers how you can integrate Okta SSO with Privacera Portal using Privacera Manager. Privacera Portal supports Okta as a login provider using OpenID, OAuth, or SAML. For more information about SAML configuration, see Portal SSO with Okta using SAML.

Prerequisites

Before you begin, ensure the following prerequisites are met:

  • Set up an Okta authorization server and get the values for the following to use them in the Configuration section below.

  • authorization_endpoint

  • token_endpoint

  • Client ID

  • Client Secret

  • User Info URI

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.okta.yml  config/custom-vars/
    vi config/custom-vars/vars.okta.yml

    Edit the values for the following. For property details and description, refer to the Configuration Properties below; a filled-in sketch follows these steps.

    OAUTH_CLIENT_CLIENTSECRET: "<PLEASE_CHANGE>"
    OAUTH_CLIENT_CLIENTID: "<PLEASE_CHANGE>"
    OAUTH_CLIENT_TOKEN_URI: "<PLEASE_CHANGE>"
    OAUTH_CLIENT_AUTH_URI: "<PLEASE_CHANGE>"
    OAUTH_RESOURCE_USER_INFO_URI: "<PLEASE_CHANGE>"
    PORTAL_UI_SSO_ENABLE: "true"
  3. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
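For reference, a filled-in sketch of vars.okta.yml, using the example values from the table below; the Okta domain, client ID, and client secret are placeholders:

    OAUTH_CLIENT_CLIENTSECRET: "4hb88P9UZmxxxxxxxxm1WtqsaQRv1FZDZiaOT0Gm"
    OAUTH_CLIENT_CLIENTID: "0oa63edjkaoNHGYTS357"
    OAUTH_CLIENT_TOKEN_URI: "https://dev-396511.okta.com/oauth2/default/v1/token"
    OAUTH_CLIENT_AUTH_URI: "https://dev-396511.okta.com/oauth2/default/v1/authorize"
    OAUTH_RESOURCE_USER_INFO_URI: "https://dev-396511.okta.com/oauth2/default/v1/userinfo"
    PORTAL_UI_SSO_ENABLE: "true"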
Configuration properties

Property

Description

Example

OAUTH_CLIENT_CLIENTSECRET

Get it from the Prerequisites section above.

OAUTH_CLIENT_CLIENTSECRET: "4hb88P9UZmxxxxxxxxm1WtqsaQRv1FZDZiaOT0Gm"

OAUTH_CLIENT_CLIENTID

Get it from the Prerequisites section above.

0oa63edjkaoNHGYTS357

OAUTH_CLIENT_TOKEN_URI

Get it from the Prerequisites section above.

https://dev-396511.okta.com/oauth2/default/v1/token

OAUTH_CLIENT_AUTH_URI

Get it from the Prerequisites section above.

https://dev-396511.okta.com/oauth2/default/v1/authorize

OAUTH_RESOURCE_USER_INFO_URI

Get it from the Prerequisites section above.

https://dev-396511.okta.com/oauth2/default/v1/userinfo

PORTAL_UI_SSO_ENABLE

Property to enable/disable Okta SSO.

true

Validation
Log in to Privacera Portal using Okta SSO
  1. Log in to Privacera Portal.

  2. Click the SSO Login button.

    The Okta login page is displayed.

  3. Enter the Okta user login credentials. The Privacera Portal page is displayed.

Log in to Privacera Portal using Privacera user credentials
  1. Log in to Privacera Portal.

  2. Enter the user credentials (padmin).

  3. Click the Login button. The Privacera Portal page is displayed.

Portal SSO with PingFederate

Privacera portal leverages PingIdentity’s Platform Portal for authentication via SAML. For this integration, there are configuration steps in both Privacera portal and PingIdentity.

Configuration steps for PingIdentity
  1. Sign in to your PingIdentity account.

  2. Under Your Environments , click Administrators.

  3. Select Connections from the left menu.

  4. In the Applications section, click on the + button to add a new application.

  5. Enter an Application Name (such as Privacera Portal SAML) and provide a description (optionally add an icon). For the Application Type, select SAML Application. Then click Configure.

  6. On the SAML Configuration page, under "Provide Application Metadata", select Manually Enter.

  7. Enter the ACS URLs:

    https://<portal_hostname>:<PORT>/saml/SSO

    Enter the Entity ID:

    privacera-portal

    Click the Save button.

  8. On the Overview page for the new application, click on the Attributes edit button. Add the attribute mapping:

    user.login: Username

    Set as Required.

    Note

    If the user's login ID is not the same as the username, for example if the login ID is an email address, this attribute is treated as the username in the portal. The username value is the email with the domain name (for example, @gmail.com) removed; for "john.joe@company.com", the username would be "john.joe". If there is another attribute that can be used as the username, this value will hold that attribute.

  9. You can optionally add additional attribute mappings:

    user.email: Email Address 
    user.firstName: Given Name
    user.lastName: Family Name
  10. Click the Save button.

  11. Next in your application, select Configuration and then the edit icon.

  12. Set the SLO Endpoint:

    https://<portal_hostname>:<PORT>/login.html

    Click the Save button.

  13. In the Configuration section, under Connection Details, click on Download Metadata button.

  14. Once this file is downloaded, rename it to:

    privacera-portal-aad-saml.xml

    This file will be used in the Privacera Portal configuration.

Configuration steps in Privacera Portal

Now we will configure Privacera Portal using privacera-manager to use the privacera-portal-aad-saml.xml file created in the above steps.

  1. Run the following commands:

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
  2. Edit the vars.portal.saml.aad.yml file:

    vi config/custom-vars/vars.portal.saml.aad.yml

    Add the following properties:

    SAML_ENTITY_ID: "privacera-portal"
    SAML_BASE_URL: "https://{{app_hostname}}:{port}"
    PORTAL_UI_SSO_ENABLE: "true"
    PORTAL_UI_SSO_URL: "saml/login"
    PORTAL_UI_SSO_BUTTON_LABEL: "Single Sign On"
    AAD_SSO_ENABLE: "true"
  3. Copy the privacera-portal-aad-saml.xml file to the following folder:

    ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom
  4. Edit the vars.portal.yml file:

    cd ~/privacera/privacera-manager/
    vi config/custom-vars/vars.portal.yml

    Add the following properties and assign your values.

    SAML_EMAIL_ATTRIBUTE: "user.email"
    SAML_USERNAME_ATTRIBUTE: "user.login"
    SAML_LASTNAME_ATTRIBUTE: "user.lastName"
    SAML_FIRSTNAME_ATTRIBUTE: "user.firstName"
  5. Run the following to update privacera-manager:

    cd ~/privacera/privacera-manager/
    ./privacera-manager.sh update

    You should now be able to use Single Sign-on to Privacera using PingFederate.

JSON Web Tokens (JWT)

This topic shows how to authenticate Privacera services using JSON web tokens (JWT).

Supported services:

Prerequisites

Ensure the following prerequisites are met:

  • Get the identity provider URL that is allowed in the issuer claim of a JWT.

  • Get the public key from the provider that Privacera services can use to validate JWT.

Configuration
  1. SSH to the instance as USER.

  2. Copy the public key to the ~/privacera/privacera-manager/config/custom-properties folder. If you are configuring more than one JWT, copy all the public keys associated with the JWT tokens to the same path.

  3. Run the following commands.

    cd ~/privacera/privacera-manager/config
    cp sample-vars/vars.jwt-auth.yaml custom-vars
    vi custom-vars/vars.jwt-auth.yaml
  4. Edit the properties.

    Table 5. JWT Properties

    Property

    Description

    Example

    JWT_OAUTH_ENABLE

    Property to enable JWT auth in Privacera services.

    TRUE

    JWT_CONFIGURATION_LIST

    Property to set multiple JWT configurations.

    • issuer: URL of the identity provider.

    • subject: Subject of the JWT (the user).

    • secret: Set this if the JWT token has been encrypted using a secret.

    • publickey: Name of the public key file that you copied in step 2 above.

    • userKey: Define a unique userkey.

    • groupKey: Define a unique group key.

    • parserType: Assign one of the following values.

      • PING_IDENTITY: When the scope/group claim is an array.

      • KEYCLOAK: When the scope/group claim is space-separated.

    JWT_CONFIGURATION_LIST:
      - index: 0
        issuer: "https://your-idp-domain.com/websec"
        subject: "api-token"
        secret: "tprivacera-api"
        publickey: "jwttoken.pub"
        userKey: "client_id"
        groupKey: "scope"
        parserType: "KEYCLOAK"
      - index: 1
        issuer: "https://your-idp-domain.com/websec2"
        publickey: "jwttoken2.pub"
        parserType: "PING_IDENTITY"
      - index: 2
        issuer: "https://your-idp-domain.com/websec3"
        publickey: "jwttoken3.pub"


  5. Run the update.

    cd ~/privacera/privacera-manager/
    
    ./privacera-manager.sh update
    
JWT for Databricks
Configure

To configure JWT for Databricks, do the following:

  1. Enable JWT. To enable JWT, refer to Configuration.

  2. (Optional) Create a JWT if you do not have one. Skip this step if you already have an existing token.

    To create a token, see JWT and use the following details. For more details, refer to the JWT docs.

    • Algorithm=RSA256

    • When JWT_PARSER_TYPE is KEYCLOAK (scope/group is space-separated)

      {
        "scope": "jwt:role1 jwt:role2",
        "client_id": "privacera-test-jwt-user",
        "iss": "privacera",
        "exp": <PLEASE_UPDATE>
      }
    • When JWT_PARSER_TYPE is PING_IDENTITY (scope/group is array)

      {
        "scope": [
          "jwt:role1",
          "jwt:role2"
        ],
        "client_id": "privacera-test-jwt-user",
        "iss": "privacera",
        "exp": <PLEASE_UPDATE>
      }
    • Paste public/private key in input box.

    • Copy the generated JWT Token.

  3. Log in to the Databricks portal and write the JWT to a file on the cluster using the following snippet, so that the Privacera plugin can read it and perform access control based on the token user.

    %python
    # JWT created in the previous step
    JWT_TOKEN = "<PLEASE_UPDATE>"
    # Local file on the cluster that the Privacera plugin reads the token from
    TOKEN_LOCAL_FILE = "/tmp/ptoken.dat"
    with open(TOKEN_LOCAL_FILE, "w") as f:
        f.write(JWT_TOKEN)
Use case

Reading files from the cloud using JWT token

  1. From your notebook, read files stored with your cloud provider. Depending on your cloud provider, enter the location of your cloud files in place of <path-to-your-cloud-files>.

    %python
    spark.read.csv("<path-to-your-cloud-files>").show()
  2. Check the audits. To learn how to check the audits, click here.

    You should get JWT user (privacera-test-jwt-user) which was specified in the payload while creating the JWT.

  3. To give permissions on a resource, create a group in Privacera Portal matching the scope in the JWT payload and grant access to that group; it is not necessary to create a user.

    The Privacera plugin extracts the JWT payload and passes the group during the access check. In other words, it takes the user-group mapping from the JWT payload itself, so user-group mapping in Privacera is not required.

JWT for EMR FGAC Spark
Prerequisite
Configuration Steps
  1. First enable JWT, see Configuration above.

  2. Open the vars.emr.yml file.

    cd ~/privacera/privacera-manager
    vi config/custom-vars/vars.emr.yml
  3. Add the following property to enable JWT for EMR.

    EMR_JWT_OAUTH_ENABLE: "true"
  4. Run the update.

    cd ~/privacera/privacera-manager/ 
    
    ./privacera-manager.sh update
Validations with JWT Token
  1. Create a JWT, see Step 2 above.

  2. SSH to the EMR master node.

  3. Configure the Spark application as follows:

    JWT_TOKEN=eyJhbGciOiJSU-XXXXXX–X2BAIGWTbywHkfTxxw
    spark-sql --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" --conf "spark.hadoop.privacera.jwt.oauth.enable=true"

Security

Enable self signed certificates with Privacera Platform

This topic provides instructions for use of Self-Signed Certificates with Privacera services including Privacera Portal, Apache Ranger, Apache Ranger KMS, and Privacera Encryption Gateway. It establishes a secure connection between internal Privacera components (Dataserver, Ranger KMS, Discovery, PolicySync, and UserSync) and SSL-enabled servers.

Note

Chain SSL support - Preview functionality

Previously, Privacera services used only one SSL certificate from the LDAP server even if a chain of certificates was available. As a preview feature, all certificates available in the certificate chain are now imported into the truststore. This applies to Privacera Usersync, Ranger Usersync, and Portal SSL certificates.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following command.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.ssl.yml config/custom-vars/
    vi config/custom-vars/vars.ssl.yml
  3. Set the passwords for the following configuration. The passwords must be at least six characters and should include alphabetic, numeric, and symbol characters.

    SSL_DEFAULT_PASSWORD: "<PLEASE_CHANGE>" 
    RANGER_PLUGIN_SSL_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>" 
    RANGER_PLUGIN_SSL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"

    Note

    You can enable/disable SSL for specific Privacera services. For more information, refer to Configure SSL for Privacera Services.

  4. Run Privacera Manager update.

    cd ~/privacera/privacera-manager
    
    ./privacera-manager.sh update
    
  5. For Kubernetes based deployments, restart services:

    cd ~/privacera/privacera-manager
    
    ./privacera-manager.sh restart
Enable CA signed certificates with Privacera Platform

This topic provides instructions for use of CA Signed Certificates with Privacera services including Privacera Portal, Apache Ranger, Apache Ranger KMS, and Privacera Encryption Gateway. It establishes a secure connection between internal Privacera components (Dataserver, Ranger KMS, Discovery, PolicySync, and UserSync) and SSL-enabled servers.

Certificate Authority (CA) or third-party generated certificates must be created for the specific hostname subdomain.

Privacera supports signed certificates as 'pem' files.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Copy the public (ssl_cert_full_chain.pem) and private key (ssl_cert_private_key.pem) files to the ~/privacera/privacera-manager/config/ssl/ location.

  3. Create and open the vars.ssl.yml file.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.ssl.yml config/custom-vars/
    vi config/custom-vars/vars.ssl.yml
    
  4. Set values for the following properties:

    • SSL_SELF_SIGNED: false;

    • SSL_DEFAULT_PASSWORD (Use a strong password with upper and lower case, symbols, and numbers);

    • Uncomment Property/Value pairs and set the appropriate value for:

      #PRIVACERA_PORTAL_KEYSTORE_ALIAS
      
      #PRIVACERA_PORTAL_KEYSTORE_PASSWORD
      
      #PRIVACERA_PORTAL_TRUSTSTORE_PASSWORD
      
      #RANGER_ADMIN_KEYSTORE_ALIAS
      
      #RANGER_ADMIN_KEYSTORE_PASSWORD
      
      #RANGER_ADMIN_TRUSTSTORE_PASSWORD
      
      #DATASERVER_SSL_TRUSTSTORE_PASSWORD
      
      #USERSYNC_AUTH_SSL_TRUSTSTORE_PASSWORD
      

      If KMS is enabled, uncomment, and set the following:

      #RANGER_KMS_KEYSTORE_ALIAS
      
      #RANGER_KMS_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
      
      #RANGER_KMS_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
      

      If PEG is enabled, uncomment and set the following:

      #PEG_KEYSTORE_ALIAS
      
      #PEG_KEYSTORE_PASSWORD
      
      #PEG_TRUSTSTORE_PASSWORD
      
      SSL_SELF_SIGNED: "false"
      SSL_DEFAULT_PASSWORD: "<PLEASE_CHANGE>"
      #SSL_SIGNED_PEM_FULL_CHAIN: "ssl_cert_full_chain.pem"
      #SSL_SIGNED_PEM_PRIVATE_KEY: "ssl_cert_private_key.pem"
      SSL_SIGNED_CERT_FORMAT: "pem"
      
      #PRIVACERA_PORTAL_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
      #PRIVACERA_PORTAL_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
      #PRIVACERA_PORTAL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
      
      #RANGER_ADMIN_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
      #RANGER_ADMIN_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
      #RANGER_ADMIN_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
      
      #DATASERVER_SSL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
      
      #USERSYNC_AUTH_SSL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
      
      #Below is needed only if you have KMS enabled
      #RANGER_KMS_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
      #RANGER_KMS_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
      #RANGER_KMS_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
      
      #Below is needed only if you have PEG enabled
      #PEG_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
      #PEG_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
      #PEG_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
      
  5. Add domain names for the Privacera services. See Add Domain Names for Privacera Service URLs.

  6. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
  7. For Kubernetes based deployments, restart services:

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh restart
    
Add domain names for Privacera service URLs

Note

If you have Nginx ingress enabled in your environment, then the configuration described below would not be required. For more information on Nginx ingress, see Externalize Access to Privacera Services - Nginx Ingress.

You can expose Privacera services such as Portal, Ranger, AuditServer, DataServer, and PEG to be accessed externally and configure a domain name to point to them. You can use a DNS service to host the DNS records needed for them.

Configuration
  1. Create a vars.service_hostname.yml file.

    vi config/custom-vars/vars.service_hostname.yml
    
  2. Depending on the services you want to expose, add the properties in the file. Replace <PLEASE_CHANGE> with a hostname; see the example after these steps.

    PORTAL_HOST_NAME:"<PLEASE_CHANGE>"DATASERVER_HOST_NAME:"<PLEASE_CHANGE>"RANGER_HOST_NAME:"<PLEASE_CHANGE>"PEG_HOST_NAME:"<PLEASE_CHANGE>"AUDITSERVER_HOST_NAME:"<PLEASE_CHANGE>"
    
  3. Create CNAME records to point them to the service load balancer URLs. If you are installing Privacera and its services for the first time, you must complete the installation and then return to this step to create CNAME records.

    1. Run the following command to get the service URL. Replace <name_space> with your Kubernetes namespace.

      kubectl get svc -n <name_space>
      
    2. To create CNAME records using the service URLs, do the following:

  4. Run the update.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
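For example, assuming you own the domain example.com; the hostnames below are hypothetical:

    PORTAL_HOST_NAME: "portal.privacera.example.com"
    DATASERVER_HOST_NAME: "dataserver.privacera.example.com"
    RANGER_HOST_NAME: "ranger.privacera.example.com"
    PEG_HOST_NAME: "peg.privacera.example.com"
    AUDITSERVER_HOST_NAME: "audit.privacera.example.com"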
Enable password encryption for Privacera services

This topic covers how you can enable encryption of secrets for Privacera services such as Privacera Portal, Privacera Dataserver, Privacera Ranger, Ranger Usersync, Privacera Discovery, Ranger KMS, Crypto, PEG, and Privacera PolicySync. The passwords will be stored safely in keystores, instead of being exposed in plaintext.

By default, all the sensitive data of the Privacera services are encrypted.

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following command.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.encrypt.secrets.yml config/custom-vars/
    vi config/custom-vars/vars.encrypt.secrets.yml
    
  3. In this file, set values for the following:

    Enter a password for the keystore that will hold all the secrets, for example, Str0ngP@ssw0rd.

    GLOBAL_DEFAULT_SECRETS_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"

    If you want to encrypt additional properties of a Privacera service, enter the property names in the corresponding list.

    Examples

    To encrypt properties used by Privacera Portal:

    PORTAL_ADD_ENCRYPT_PROPS_LIST:
      - PRIVACERA_PORTAL_DATASOURCE_URL
      - PRIVACERA_PORTAL_DATASOURCE_USERNAME
    

    To encrypt properties used by Dataserver:

    DATASERVER_ADD_ENCRYPT_PROPS_LIST:
      - DATASERVER_MAC_ALGORITHM

    To encrypt properties used by Encryption:

    #Additional properties to be encrypted for Crypto
    CRYPTO_ENCRYPT_PROPS_LIST:
      -

  4. Run the following command.

    ./privacera-manager.sh update
    

    For a Kubernetes configuration, you also need to run the following command:

    ./privacera-manager.sh restart
    
  5. To check the keystores generated for the respective services, run the following command.

    ls ~/privacera/privacera-manager/config/keystores