Component services configurations
Access Management
Data Server
AWS
AWS Data Server
Configure Privacera Data Access Server
This section covers how you can configure Privacera Data Access Server.
CLI Configuration Steps
SSH to the instance where Privacera Manager is installed.
Run the following command.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.dataserver.aws.yml config/custom-vars/
Edit the properties. For property details and description, refer to the Configuration properties below.
vi config/custom-vars/vars.dataserver.aws.yml
Note
Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, click here.
Run Privacera Manager update.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
DATASERVER_RANGER_AUTH_ENABLED | Enable/disable Ranger authorization in DataServer. | |
DATASERVER_V2_WORKDER_THREADS | Number of worker threads to process inbound connections. | 20 |
DATASERVER_V2_CHANNEL_CONNECTION_BACKLOG | Maximum queue size for inbound connections. | 128 |
DATASERVER_V2_CHANNEL_CONNECTION_POOL | Enable the connection pool for outbound requests. Disabled by default. | |
DATASERVER_V2_FRONT_CHANNEL_IDLE_TIMEOUT | Idle timeout for inbound connections. | 60 |
DATASERVER_V2_BACK_CHANNEL_IDLE_TIMEOUT | Idle timeout for outbound connections; takes effect only if the connection pool is enabled. | 60 |
DATASERVER_HEAP_MIN_MEMORY_MB | Add the minimum Java Heap memory in MB used by Dataserver. | 1024 |
DATASERVER_HEAP_MAX_MEMORY_MB | Add the maximum Java Heap memory in MB used by Dataserver. | 1024 |
DATASERVER_USE_REGIONAL_ENDPOINT | Set this property to enforce default region for all S3 buckets. | true |
DATASERVER_AWS_REGION | Default AWS region for S3 bucket. | us-east-1 |
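A minimal sketch of how these tuning properties might look in config/custom-vars/vars.dataserver.aws.yml, using the illustrative values from the table above (the quoting style is an assumption; adjust the values for your workload):
DATASERVER_V2_WORKDER_THREADS: "20"
DATASERVER_V2_CHANNEL_CONNECTION_BACKLOG: "128"
DATASERVER_HEAP_MIN_MEMORY_MB: "1024"
DATASERVER_HEAP_MAX_MEMORY_MB: "1024"
DATASERVER_USE_REGIONAL_ENDPOINT: "true"
DATASERVER_AWS_REGION: "us-east-1"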
AWS S3 data server
This section covers how you can configure access control for AWS S3 through Privacera Data Access Server.
Prerequisites
Ensure that the following prerequisites are met:
Create and add an AWS IAM Policy defined to allow access to S3 resources.
Follow AWS IAM Create and Attach Policy instructions, using either "Full S3 Access" or "Limited S3 Access" policy templates, depending on your enterprise requirements.
Return to this section once the Policy is attached to the Privacera Manager Host VM.
CLI configuration
SSH to the instance where Privacera Manager is installed.
Configure Privacera Data Server.
Edit the properties. For property details and description, refer to the Configuration Properties below.
vi config/custom-vars/vars.dataserver.aws.yml
Note
In a Kubernetes environment, enable DATASERVER_USE_POD_IAM_ROLE and DATASERVER_IAM_POLICY_ARN to use a specific IAM role for the Dataserver pod. For property details and description, see the S3 properties below. You can also add custom properties that are not included by default. See Dataserver.
Run Privacera Manager update.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
DATASERVER_USE_POD_IAM_ROLE | Property to enable the creation of an IAM role that will be used for the Dataserver pod. | true |
DATASERVER_IAM_POLICY_ARN | Full IAM policy ARN which needs to be attached to the IAM role associated with the Dataserver pod. | arn:aws:iam::aws:policy/AmazonS3FullAccess |
DATASERVER_USE_IAM_ROLE | If you've given permission to an IAM role to access the bucket, enable **Use IAM Role**. | |
DATASERVER_S3_AWS_API_KEY | If you've used an access key to access the bucket, disable **Use IAM Role**, and set the AWS API Key. | AKIAIOSFODNN7EXAMPLE |
DATASERVER_S3_AWS_SECRET_KEY | If you've used a secret key to access the bucket, disable **Use IAM Role**, and set the AWS Secret Key. | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
DATASERVER_V2_S3_ENDPOINT_ENABLE | Enable to use a custom S3 endpoint. | |
DATASERVER_V2_S3_ENDPOINT_SSL | Enable/disable SSL for the custom endpoint, depending on whether SSL is enabled on the endpoint server (for example, a MinIO server). | |
DATASERVER_V2_S3_ENDPOINT_HOST | Add the endpoint server host. | 192.168.12.142 |
DATASERVER_V2_S3_ENDPOINT_PORT | Add the endpoint server port. | 9000 |
DATASERVER_AWS_REQUEST_INCLUDE_USERINFO | Property to enable adding session role in CloudWatch logs for requests going via Dataserver. This will be available with the **privacera-user** key in the Request Params of CloudWatch logs. Set to true, if you want to see the **privacera-user** in CloudWatch. | true |
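For illustration, a key-based sketch of the same file using the placeholder credentials from the table above (with IAM-role-based access you would instead enable DATASERVER_USE_IAM_ROLE and omit the keys):
DATASERVER_USE_IAM_ROLE: "false"
DATASERVER_S3_AWS_API_KEY: "AKIAIOSFODNN7EXAMPLE"
DATASERVER_S3_AWS_SECRET_KEY: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
DATASERVER_AWS_REQUEST_INCLUDE_USERINFO: "true"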
AWS Athena data server
This section covers how you can configure access control for AWS Athena through Privacera Data Access Server.
Prerequisites
Ensure the following:
Create and add an AWS IAM Policy defined to allow access to Athena and Glue resources and databases.
Follow AWS IAM Create and Attach Policy instructions, using the "Athena Access" policy modified as necessary for your enterprise. Return to this section once the Policy is attached to the Privacera Manager Host VM.
CLI configuration
SSH to the instance where Privacera Manager is installed.
Configure Privacera Data Server.
Edit the properties. For property details and description, refer to the Configuration Properties below.
vi config/custom-vars/vars.dataserver.aws.yml
Note
Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, click here.
Run Privacera Manager update.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Identify an existing S3 bucket or create one to store the Athena query results.
AWS_ATHENA_RESULT_STORAGE_URL: "s3://${S3_BUCKET_FOR_QUERY_RESULTS}/athena-query-results/"
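For example, with a hypothetical results bucket named my-athena-results, the property would read:
AWS_ATHENA_RESULT_STORAGE_URL: "s3://my-athena-results/athena-query-results/"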
Azure
Azure ADLS Data Server
This topic covers integration of Azure Data Lake Storage (ADLS) with the Privacera Platform using Privacera Data Access Server.
Prerequisites
Ensure that the following prerequisites are met:
You have access to an Azure Storage account along with required credentials.
For more information on how to set up an Azure storage account, refer to Azure Storage Account Creation.
Get the values for the following Azure properties: Application (client) ID, Client secrets
CLI Configuration
Go to the privacera-manager folder on your virtual machine, copy the sample vars.dataserver.azure.yml file from the config/sample-vars folder to config/custom-vars/, and edit it.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.dataserver.azure.yml config/custom-vars/
vi config/custom-vars/vars.dataserver.azure.yml
Edit the Azure-related information. For property details and description, click here.
If you want to use Azure CLI, use the following properties:
ENABLE_AZURE_CLI: "true" AZURE_GEN2_SHARED_KEY_AUTH: "true" AZURE_ACCOUNT_NAME: "<PLEASE_CHANGE>" AZURE_SHARED_KEY: "<PLEASE_CHANGE>"
If you want to access multiple Azure storage accounts with shared key authentication, use the following properties:
AZURE_GEN2_SHARED_KEY_AUTH: "true" AZURE_ACCT_SHARED_KEY_PAIRS: "<PLEASE_CHANGE>"
Note
Configuring the AZURE_GEN2_SHARED_KEY_AUTH property allows you to access the resources in the Azure accounts only through the File Explorer in Privacera Portal.
If you want to access multiple Azure storage accounts with OAuth application-based authentication, use the following properties:
AZURE_GEN2_SHARED_KEY_AUTH: "false" AZURE_TENANTID: "<PLEASE_CHANGE>" AZURE_SUBSCRIPTION_ID: "<PLEASE_CHANGE>" AZURE_RESOURCE_GROUP: "<PLEASE_CHANGE>" DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST: - index: 0 clientId: "<PLEASE_CHANGE>" clientSecret: "<PLEASE_CHANGE>" storageAccName: "<PLEASE_CHANGE>"
Note
You can also add custom properties that are not included by default. See Dataserver.
Run the following command.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration Properties
Property Name | Description | Example |
---|---|---|
ENABLE_AZURE_CLI | Uncomment to use Azure CLI. | true |
AZURE_GEN2_SHARED_KEY_AUTH | To use multiple Azure storage accounts with shared key authentication, set this property to true. To use multiple Azure storage accounts with OAuth authentication, set this property to false. | true |
AZURE_ACCOUNT_NAME | Azure ADLS storage account name | company-qa-dept |
AZURE_SHARED_KEY | Azure ADLS storage account shared access key | =0Ty4br:2BIasz>rXm{cqtP8hA;7|TgZZZuTHJTg40z8E5z4UJ':roeJy=d7*/W" |
AZURE_ACCT_SHARED_KEY_PAIRS | Comma-separated multiple storage account names and its shared keys. The format must be ${storage_account_name_1}:${secret_key_1},${storage_account_name_2}:${secret_key_2} | accA:sharedKeyA, accB:sharedKeyB |
AZURE_TENANTID | To get the value for this property, Go to Azure portal > Azure Active Directory > Properties > Tenant ID | 5a5cxxx-xxxx-xxxx-xxxx-c3172b33xxxx |
AZURE_APP_CLIENT_ID | Get the value by following the Pre-requisites section above. | 8c08xxxx-xxxx-xxxx-xxxx-6w0c95v0xxxx |
AZURE_SUBSCRIPTION_ID | To get the value for this property, Go to Azure portal > Select Subscriptions in the left sidebar > Select whichever subscription is needed > Click on overview > Copy the Subscription ID | 27e8xxxx-xxxx-xxxx-xxxx-c716258wxxxx |
AZURE_RESOURCE_GROUP | To get the value for this property, Go to Azure portal > Storage accounts > Select the storage account you want to configure >Click on Overview > Resource Group | privacera |
DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST: - index: 0 clientId: "<PLEASE_CHANGE>" clientSecret: "<PLEASE_CHANGE>" storageAccName: "<PLEASE_CHANGE>" | Configure multiple OAuth Azure applications and the storage accounts mapped with the configured client id. **Note**: The ‘clientSecret’ property must be in BASE64 format in the YAML file. | DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST: - index: 0 clientId: "8c08xxxx-xxxx-xxxx-xxxx-6w0c95v0xxxx" clientSecret: "WncwSaMpleRZ1ZoLThJYWpZd3YzMkFJNEljZGdVN0FfVAo=" storageAccName: "storageAccA,storageAccB" - index: 1 clientId: "5d37xxxx-xxxx-xxxx-xxxx-7z0cu7e0xxxx" clientSecret: "ZncwSaMpleRZ1ZoLThJYWpZd3YzMkFJNEljZGdVN0FfVAo=" storageAccName: "storageAccC" |
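Because the clientSecret values in DATASERVER_AZURE_APP_CLIENT_CONFIG_LIST must be BASE64-encoded in the YAML file, you can encode a secret before pasting it in; a simple sketch with a placeholder secret:
echo -n '<your-client-secret>' | base64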
Validation
All access and attempted access (Allowed and Denied) for Azure ADLS resources will now be recorded in the audit stream. This audit stream can be reviewed on the Audit page of Privacera Access Manager. Default access for a data repository is 'Denied', so initially all data access will be denied.
To verify Privacera Data Management control, perform the following steps:
Log in to Privacera Portal as a portal administrator, open Data Inventory: Data Explorer, and attempt to view the targeted ADLS files or folders. The data will be hidden and a Denied status will be registered on the Audit page.
In Privacera Portal, open Access Management: Resource Policies. Open System 'ADLS' and 'application' (data repository) 'privacera_adls'. Create or modify an access policy to allow access to some or all of your ADLS storage.
Return to Data Inventory: Data Explorer and re-attempt to view the data as allowed by your new policy or policy change. Repeat step 1.
You should be able to view files or folders in the account, and an Allowed status will be registered in the Audit page.
To check the log in the Audit page in Privacera Portal, perform the following steps:
On the Privacera Portal page, expand Access Management and click Audit in the left menu.
The Audit page will be displayed with Ranger Audit details.
GCP Data Server
This topic covers integration of Google Cloud Storage (GCS) and Google BigQuery (GBQ) with the Privacera Platform using Privacera Data Access Server.
Prerequisites
Ensure that the following prerequisites are met:
If GCS is being configured, you need access to a Google Cloud Storage account along with the required credentials.
If GBQ is being configured, you need access to a Google Cloud BigQuery account along with the required credentials.
Download the credential file (JSON) associated with the service account.
CLI Configuration
SSH to the instance where Privacera is installed.
Copy the credential file you've downloaded from your machine to a location on your instance where Privacera Manager is configured. Get the file path of the JSON file and add it in the next step.
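For example, a hypothetical copy from your workstation to the Privacera Manager host (the user, host, and paths are placeholders to adjust for your environment):
scp /path/to/my_google_credential.json <user>@<privacera-manager-host>:/tmp/my_google_credential.json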
Run the following commands.
cd ~/privacera/privacera-manager/
cp config/sample-vars/vars.dataserver.gcp.yml config/custom-vars/
vi config/custom-vars/vars.dataserver.gcp.yml
Update the following credential file information.
GCP_CREDENTIAL_FILE_PATH: "/tmp/my_google_credential.json"
Note
You can also add custom properties that are not included by default. See Dataserver.
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
After the update completes, Privacera is installed and a default GCS data source is created.
Add GCS Project ID in the GCS data source.
Navigate to Portal UI > Settings > Data Source Registration and edit GOOGLE_CLOUD_STORAGE.
Click Application Properties and add the following properties:
Credential Type: Select Google Credentials Local File Path from the dropdown list.
Google Credentials Local File Path: Set value to None.
Google Project Id: Enter your Google Project ID.
To view the buckets, navigate to Data Inventory > File Explorer.
If you cannot view the buckets, restart the Dataserver.
cd privacera/privacera-manager
./privacera-manager.sh restart dataserver
Tip
You can use Google APIs to apply access control on GCS. For more information, click here.
UserSync
Privacera UserSync
Privacera Data Access User Synchronization
Learn how you can synchronize users and groups from different connectors.
LDAP
Run the following command to enable Privacera UserSync:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
Enable the LDAP connector:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.ldap.yml config/custom-vars/
vi config/custom-vars/vars.privacera-usersync.ldap.yml
Edit the following properties:
Property
Description
Example
A) LDAP Connector Info
LDAP_CONNECTOR
Name of the connector.
ad
LDAP_ENABLED
Enabled status of connector: true or false.
true
LDAP_SERVICE_TYPE
Set a service type: ldap or ad.
ad
LDAP_DATASOURCE_NAME
Name of the datasource: ldap or ad.
ad
LDAP_URL
URL of source LDAP.
ldap://example.us:389
LDAP_BIND_DN
Property is used to connect to LDAP and then query for users and groups.
CN=Example User,OU=sales,DC=ad,DC=sales,DC=us
LDAP_BIND_PASSWORD
LDAP bind password for the bind DN specified above.
LDAP_AUTH_TYPE
Authentication type. The default is simple.
simple
LDAP_REFERRAL
Set the LDAP context referral: ignore or follow. Default value is follow.
follow
LDAP_SYNC_INTERVAL
Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.
3600
B) Enable SSL for LDAP Server
Note
Support Chain SSL - Preview Functionality
Previously, Privacera services used only one SSL certificate of the LDAP server even if a chain of certificates was available. Now, as a Preview functionality, all the certificates available in the certificate chain are imported into the truststore. This applies to the Privacera usersync, Ranger usersync, and portal SSL certificates.
PRIVACERA_USERSYNC_SYNC_LDAP_SSL_ENABLED
Set this property to enable/disable SSL for Privacera Usersync.
true
PRIVACERA_USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS
Set this property if you want Privacera Manager to generate a truststore for your SSL-enabled LDAP server.
true
PRIVACERA_USERSYNC_AUTH_SSL_ENABLED
Set this property if the other Privacera services are not SSL enabled and you are using SSL-enabled LDAP server.
true
C) LDAP Search
LDAP_SEARCH_GROUP_FIRST
Property to enable to search for groups first, before searching for users.
true
LDAP_SEARCH_BASE
Search base for users and groups.
DC=ad,DC=sales,DC=us
LDAP_SEARCH_USER_BASE
Search base for users.
ou=example,dc=ad,dc=sales,dc=us
LDAP_SEARCH_USER_SCOPE
Set the value for search scope for the users: base, one, or sub. Default value is sub.
sub
LDAP_SEARCH_USER_FILTER
Optional additional filter constraining the users selected for syncing.
LDAP_SEARCH_USER_GROUPONLY
Boolean to only load users in groups.
false
LDAP_ATTRIBUTE_ONLY
Sync only the attributes of users already synced from other services.
false
LDAP_SEARCH_INCREMENTAL_ENABLED
Enable incremental search. Syncing changes only since last search.
false
LDAP_PAGED_RESULTS_ENABLED
Enable paged results control for LDAP searches. Default is true.
true
LDAP_PAGED_CONTROL_CRITICAL
Set paged results control criticality to CRITICAL. Default is true.
true
LDAP_SEARCH_GROUP_BASE
Search base for groups.
ou=example,dc=ad,dc=sales,dc=us
LDAP_SEARCH_GROUP_SCOPE
Set the value for search scope for the groups: base, one, or sub. Default value is sub.
sub
LDAP_SEARCH_GROUP_FILTER
Optional additional filter constraining the groups selected for syncing.
LDAP_SEARCH_CYCLES_BETWEEN_DELETED_DETECTION
Number of cycles between deleted-entity searches. Default value is 6.
6
LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS
Enables both user and group deleted searches. Default is false.
false
LDAP_SEARCH_DETECT_DELETED_USERS
Override setting for user deleted search. Default value is LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS.
LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS
LDAP_SEARCH_DETECT_DELETED_GROUPS
Override setting for group deleted search. Default value is LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS.
LDAP_SEARCH_DETECT_DELETED_USERS_GROUPS
D) LDAP Manage/Ignore List of Users/Groups
LDAP_MANAGE_USER_LIST
List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.
LDAP_IGNORE_USER_LIST
List of users to ignore from sync results.
LDAP_MANAGE_GROUP_LIST
List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.
LDAP_IGNORE_GROUP_LIST
List of groups to ignore from sync results.
E) LDAP Object Users/Groups Class
LDAP_OBJECT_USER_CLASS
Objectclass to identify user entries.
user
LDAP_OBJECT_GROUP_CLASS
Objectclass to identify group entries.
group
F) LDAP User/Group Attributes
LDAP_ATTRIBUTE_USERNAME
Attribute from user entry that would be treated as user name.
SAMAccountName
LDAP_ATTRIBUTE_FIRSTNAME
Attribute of a user's first name. The default is givenName.
givenName
LDAP_ATTRIBUTE_LASTNAME
Attribute of a user’s last name.
LDAP_ATTRIBUTE_EMAIL
Attribute from user entry that would be treated as email address.
mail
LDAP_ATTRIBUTE_GROUPNAMES
List of attributes from group entry that would be treated as group name.
LDAP_ATTRIBUTE_GROUPNAME
Attribute from group entry that would be treated as group name.
name
LDAP_ATTRIBUTE_GROUP_MEMBER
Attribute from group entry that is list of members.
member
G) Username/Group name Attribute Modification
LDAP_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL
Extract username from an email address. (e.g. username@domain.com -> username) Default is false.
false
LDAP_ATTRIBUTE_USERNAME_VALUE_PREFIX
Prefix to prepend to the username. Default is blank.
LDAP_ATTRIBUTE_USERNAME_VALUE_POSTFIX
Postfix to append to the username. Default is blank.
LDAP_ATTRIBUTE_USERNAME_VALUE_TOLOWER
Convert the username to lowercase. Default is false.
false
LDAP_ATTRIBUTE_USERNAME_VALUE_TOUPPER
Convert the username to uppercase. Default is false.
false
LDAP_ATTRIBUTE_USERNAME_VALUE_REGEX
Attribute to replace username to matching regex. Default is blank.
LDAP_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL
Extract the group name from an email address. Default is false.
false
LDAP_ATTRIBUTE_GROUPNAME_VALUE_PREFIX
Prefix to prepend to the group's name. Default is blank.
LDAP_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX
Postfix to append to the group's name. Default is blank.
LDAP_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER
Convert the group's name to lowercase. Default is false.
false
LDAP_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER
Convert the group's name to uppercase. Default is false.
false
LDAP_ATTRIBUTE_GROUPNAME_VALUE_REGEX
Attribute to replace the group's name to matching regex. Default is blank.
H) Group Attribute Configuration
LDAP_GROUP_ATTRIBUTE_LIST
The list of attribute keys to get from synced groups.
LDAP_GROUP_ATTRIBUTE_VALUE_PREFIX
Append prefix to values of group attributes such as group name.
LDAP_GROUP_ATTRIBUTE_KEY_PREFIX
Append prefix to key of group attributes such as group name.
LDAP_GROUP_LEVELS
Configure Privacera usersync with AD/LDAP nested group membership.
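As an illustration, a minimal vars.privacera-usersync.ldap.yml sketch for an Active Directory source might look like the following; the values are placeholders taken from the examples above and must be replaced with your own:
LDAP_CONNECTOR: "ad"
LDAP_ENABLED: "true"
LDAP_SERVICE_TYPE: "ad"
LDAP_URL: "ldap://example.us:389"
LDAP_BIND_DN: "CN=Example User,OU=sales,DC=ad,DC=sales,DC=us"
LDAP_BIND_PASSWORD: "<PLEASE_CHANGE>"
LDAP_SEARCH_BASE: "DC=ad,DC=sales,DC=us"
LDAP_SEARCH_USER_BASE: "ou=example,dc=ad,dc=sales,dc=us"
LDAP_SEARCH_GROUP_BASE: "ou=example,dc=ad,dc=sales,dc=us"
LDAP_SYNC_INTERVAL: "3600"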
Run the following command:
cd ~/privacera/privacera-manager
./privacera-manager.sh update
LDAP/AD deleted entity detection
When enabled, LDAP/AD deleted entity detection will perform a soft delete of users or groups in Privacera Portal. A soft delete removes all memberships of the group/user and marks them as “hidden”. Hidden users will not appear in auto completion when modifying access policies. References to users/groups in policies will remain, until manually removed or the user/group is fully deleted from Privacera Portal. Hidden users can be fully deleted by using the Privacera Portal UI or REST APIs.
Properties:
Boolean: usersync.connector.0.search.deleted.group.enabled (default: false)
Boolean: usersync.connector.0.search.deleted.user.enabled (default: false)
Numeric: usersync.connector.#.search.deleted.cycles (default: 6)
Privacera Manager Variables:
In the LDAP connector properties table above, see LDAP Search (section C).
Azure Active Directory (AAD)
Run the following command to enable Privacera UserSync:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
Enable the AAD connector:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.azuread.yml config/custom-vars/
vi config/custom-vars/vars.privacera-usersync.azuread.yml
Edit the following properties:
Property
Description
Example
A) AAD Basic Info
AZURE_AD_CONNECTOR
Name of the connector.
AAD1
AZURE_AD_ENABLED
Enabled status of connector. (true/false)
true
AZURE_AD_SERVICE_TYPE
Service Type
AZURE_AD_DATASOURCE_NAME
Name of the datasource.
AZURE_AD_ATTRIBUTE_ONLY
Sync only the attributes of users already synced from other services.
false
AZURE_AD_SYNC_INTERVAL
Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.
3600
B) Azure AAD Info: (Get the following information from Azure Portal)
AZURE_AD_TENANT_ID
Azure Active Directory Id (Tenant ID)
1a2b3c4d-azyd-4755-9638-e12xa34p56le
AZURE_AD_CLIENT_ID
Azure Active Directory application client ID which will be used for accessing Microsoft Graph API.
11111111-1111-1111-1111-111111111111
AZURE_AD_CLIENT_SECRET
Azure Active Directory application client secret which will be used for accessing Microsoft Graph API.
AZURE_AD_USERNAME
Azure Account username which will be used for getting access token to be used on behalf of Azure AD application.
AZURE_AD_PASSWORD
Azure Account password which will be used for getting access token to be used on behalf of Azure AD application.
C) AAD Manage/Ignore List of Users/Groups
AZURE_AD_MANAGER_USER_LIST
List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.
AZURE_AD_IGNORE_USER_LIST
List of users to ignore from sync results.
AZURE_AD_MANAGE_GROUP_LIST
List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.
AZURE_AD_IGNORE_GROUP_LIST
List of groups to ignore from sync results.
D) AAD Search
AZURE_AD_SEARCH_SCOPE
Azure AD Application Access Scope
AZURE_AD_SEARCH_USER_GROUPONLY
Boolean to only load users in groups.
false
AZURE_AD_SEARCH_INCREMENTAL_ENABLED
Enable incremental search. Syncing only changes since last search.
false
AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS
Enables both user and group deleted searches. Default is false.
false
AZURE_AD_SEARCH_DETECT_DELETED_USERS
Override setting for user deleted search. Default value is AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS.
AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS
AZURE_AD_SEARCH_DETECT_DELETED_GROUPS
Override setting for group deleted search. Default value is AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS.
AZURE_AD_SEARCH_DETECT_DELETED_USERS_GROUPS
E) Azure Service Principal
Note
If Sync Service Principals as Users is enabled, AAD does not require that displayName of a Service Principal be a unique value. In this case, a different attribute (such as appId) should be used as the Service Principal Username.
AZURE_AD_SERVICEPRINCIPAL_ENABLED
Sync Azure service principal to ranger user entity.
false
AZURE_AD_SERVICEPRINCIPAL_USERNAME
Property that specifies which key to use for the username when a service principal is mapped to a Ranger user entity.
displayName
F) AAD User/Group Attributes
AZURE_AD_ATTRIBUTE_USERNAME
Attribute of a user’s name (default: userPrincipalName)
AZURE_AD_ATTRIBUTE_FIRSTNAME
Attribute of a user’s first name (default: givenName)
AZURE_AD_ATTRIBUTE_LASTNAME
Attribute of a user’s last name (default: surname)
AZURE_AD_ATTRIBUTE_EMAIL
Attribute from user entry that would be treated as email address.
AZURE_AD_ATTRIBUTE_GROUPNAME
Attribute from group entry that would be treated as group name.
AZURE_AD_SERVICEPRINCIPAL_USERNAME
Attribute of service principal name.
G) Username/Group name Attribute Modification
AZURE_AD_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL
Extract username from an email address. (e.g. username@domain.com -> username) Default is false.
false
AZURE_AD_ATTRIBUTE_USERNAME_VALUE_PREFIX
Prefix to prepend to the username. Default is blank.
AZURE_AD_ATTRIBUTE_USERNAME_VALUE_POSTFIX
Postfix to append to the username. Default is blank.
AZURE_AD_ATTRIBUTE_USERNAME_VALUE_TOLOWER
Convert the username to lowercase. Default is false.
false
AZURE_AD_ATTRIBUTE_USERNAME_VALUE_TOUPPER
Convert the username to uppercase. Default is false.
false
AZURE_AD_ATTRIBUTE_USERNAME_VALUE_REGEX
Attribute to replace username to matching regex. Default is blank.
AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL
Extract the group name from an email address. Default is false.
false
AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_PREFIX
Prefix to prepend to the group's name. Default is blank.
AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX
Postfix to append to the group's name. Default is blank.
AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER
Convert the group's name to lowercase. Default is false.
false
AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER
Convert the group's name to uppercase. Default is false.
false
AZURE_AD_ATTRIBUTE_GROUPNAME_VALUE_REGEX
Attribute to replace the group's name to matching regex. Default is blank.
H) Group Attribute Configuration
AZURE_AD_GROUP_ATTRIBUTE_LIST
The list of attribute keys to get from synced groups.
AZURE_AD_GROUP_ATTRIBUTE_VALUE_PREFIX
Append prefix to values of group attributes such as group name.
AZURE_AD_GROUP_ATTRIBUTE_KEY_PREFIX
Append prefix to key of group attributes such as group name.
I) Filter Properties
AZURE_AD_FILTER_USER_LIST
Filter the AAD user list, supported for non-incremental search. When incremental search is enabled delta search does not support filter properties.
abc.def@privacera.com
AZURE_AD_FILTER_SERVICEPRINCIPAL_LIST
Filter the AAD service principal list, supported for non-incremental search. When incremental search is enabled delta search does not support filter properties.
abc-testapp
AZURE_AD_FILTER_GROUP_LIST
Filter the AAD group list, supported for non-incremental search. When incremental search is enabled delta search does not support filter properties.
PRIVACERA-AB-GROUP-00
J) Domain Properties
AZURE_AD_MANAGE_DOMAIN_LIST
Only users in manage domain list will be synced.
Privacera.US
AZURE_AD_IGNORE_DOMAIN_LIST
Users in ignore domain list will not be synced.
Privacera.US
AZURE_AD_DOMAIN_ATTRIBUTE
Specify the attribute against which to compare the user domain; email and username are supported. Default is email.
username
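For illustration, a minimal vars.privacera-usersync.azuread.yml sketch using the example values above (the IDs and secret are placeholders; replace them with values from your Azure portal):
AZURE_AD_CONNECTOR: "AAD1"
AZURE_AD_ENABLED: "true"
AZURE_AD_TENANT_ID: "1a2b3c4d-azyd-4755-9638-e12xa34p56le"
AZURE_AD_CLIENT_ID: "11111111-1111-1111-1111-111111111111"
AZURE_AD_CLIENT_SECRET: "<PLEASE_CHANGE>"
AZURE_AD_SYNC_INTERVAL: "3600"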
Run the following command:
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Azure Active Directory (AAD) deleted entity detection
When enabled, AAD deleted entity detection will perform a soft delete of users or groups in Privacera Portal. A soft delete removes all memberships of the group/user and marks them as “hidden”. Hidden users will not appear in auto completion when modifying access policies. References to users/groups in policies will remain, until manually removed or the user/group is fully deleted from Privacera Portal. Hidden users can be fully deleted by using the Privacera Portal UI or REST APIs.
Properties:
Boolean: usersync.connector.3.search.deleted.group.enabled (default: false)
Boolean: usersync.connector.3.search.deleted.user.enabled (default: false)
Privacera Manager Variables:
In the AAD connector properties table above, see under AAD Search (section D).
SCIM
Run the following command to enable Privacera UserSync:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
Enable the SCIM connector:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.scim.yml config/custom-vars/
vi config/custom-vars/vars.privacera-usersync.scim.yml
Edit the following properties:
Property
Description
Example
A) SCIM Connector Info
SCIM_CONNECTOR
Name of connector.
DB1
SCIM_ENABLED
Enabled status of connector. (true/false)
true
SCIM_SERVICETYPE
Service Type
scim
SCIM_DATASOURCE_NAME
Name of the datasource.
databricks1
SCIM_URL
Connector URL
ADMIN_USER_BEARER_TOKEN
Bearer token
SCIM_SYNC_INTERVAL
Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.
3600
B) SCIM Manage/Ignore List of Users/Groups
SCIM_MANAGE_USER_LIST
List of users to manage from sync results. If this list is defined, all users not on this list will be ignored
SCIM_IGNORE_USER_LIST
List of users to ignore from sync results.
SCIM_MANAGE_GROUP_LIST
List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.
SCIM_IGNORE_GROUP_LIST
List of groups to ignore from sync results.
C) SCIM User/Group Attributes
SCIM_ATTRIBUTE_USERNAME
Attribute from user entry that would be treated as user name.
userName
SCIM_ATTRIBUTE_FIRSTNAME
Attribute from user entry that would be treated as firstname.
name.givenName
SCIM_ATTRIBUTE_LASTNAME
Attribute from user entry that would be treated as lastname.
name.familyName
SCIM_ATTRIBUTE_EMAIL
Attribute from user entry that would be treated as email address.
emails[primary-true].value
SCIM_ATTRIBUTE_ONLY
Sync only the attributes of users already synced from other services. (true/false)
false
SCIM_ATTRIBUTE_GROUPS
Attribute of user’s group list.
groups
SCIM_ATTRIBUTE_GROUPNAME
Attribute from group entry that would be treated as group name.
displayName
SCIM_ATTRIBUTE_GROUP_MEMBER
Attribute from group entry that is list of members.
members
D) SCIM Server Username Attribute Modifications
SCIM_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL
Extract the user’s username from an email address. (e.g. username@domain.com -> username) The default is false.
false
SCIM_ATTRIBUTE_USERNAME_VALUE_PREFIX
Prefix to prepend to username. The default is blank.
SCIM_ATTRIBUTE_USERNAME_VALUE_POSTFIX
Postfix to append to the username. The default is blank.
SCIM_ATTRIBUTE_USERNAME_VALUE_TOLOWER
Convert the user’s username to lowercase. The default is false.
false
SCIM_ATTRIBUTE_USERNAME_VALUE_TOUPPER
Convert the user’s username to uppercase. The default is false.
false
SCIM_ATTRIBUTE_USERNAME_VALUE_REGEX
Attribute to replace username to matching regex. The default is blank.
E) SCIM Server Group Name Attribute Modifications
SCIM_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL
Extract the group’s name from an email address (e.g. groupname@domain.com -> groupname). The default is false.
false
SCIM_ATTRIBUTE_GROUPNAME_VALUE_PREFIX
Prefix to prepend to the group's name. The default is blank.
SCIM_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX
Postfix to append to the group's name. The default is blank.
SCIM_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER
Convert group's name to lowercase. The default is false.
false
SCIM_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER
Convert the group's name to uppercase. The default is false.
false
SCIM_ATTRIBUTE_GROUPNAME_VALUE_REGEX
Attribute to replace group's name to matching regex. The default is blank.
F) Group Attribute Configuration
SCIM_GROUP_ATTRIBUTE_LIST
The list of attribute keys to get from synced groups.
SCIM_GROUP_ATTRIBUTE_VALUE_PREFIX
Append prefix to values of group attributes such as group name.
SCIM_GROUP_ATTRIBUTE_KEY_PREFIX
Append prefix to key of group attributes such as group name.
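For illustration, a minimal vars.privacera-usersync.scim.yml sketch using the example values from the table above (the URL and bearer token are placeholders you must supply):
SCIM_CONNECTOR: "DB1"
SCIM_ENABLED: "true"
SCIM_SERVICETYPE: "scim"
SCIM_DATASOURCE_NAME: "databricks1"
SCIM_URL: "<PLEASE_CHANGE>"
ADMIN_USER_BEARER_TOKEN: "<PLEASE_CHANGE>"
SCIM_SYNC_INTERVAL: "3600"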
Run the following command:
cd ~/privacera/privacera-manager
./privacera-manager.sh update
SCIM Server
Note
SCIM Server exposes the privacera-usersync service externally on a Public/Internet-facing LB.
Run the following command to enable Privacera UserSync:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
Enable the SCIM Server connector:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.scimserver.yml config/custom-vars/
vi config/custom-vars/vars.privacera-usersync.scimserver.yml
Edit the following properties:
Property
Description
Example
A) SCIM Server Connector Info
SCIM_SERVER_CONNECTOR
Identifying name of this connector.
DB1
SCIM_SERVER_ENABLED
Enabled status of connector. (true/false)
true
SCIM_SERVER_SERVICETYPE
Type of service/connector.
scimserver
SCIM_SERVER_DATASOURCE_NAME
Unique datasource name. Used for identifying source of data and configuring priority list. (Optional)
databricks1
SCIM_SERVER_ATTRIBUTE_ONLY
Sync only the attributes of users already synced from other services. (true/false)
SCIM_SERVER_BEARER_TOKEN
Bearer token for auth to SCIM API. When set, SCIM requests with this token will be allowed access.
SCIM_SERVER_USERNAME
Basic auth username, when set SCIM requests with this username will be allowed access. (Password also required)
SCIM_SERVER_PASSWORD
Basic auth password, when set SCIM requests with this password will be allowed access. (Username also required)
SCIM_SERVER_SYNC_INTERVAL
Frequency of usersync audit records in seconds. Default value is 3600, minimum value is 300.
3600
B) SCIM Server Manage/Ignore List of Users/Groups
SCIM_SERVER_MANAGE_USER_LIST
List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.
SCIM_SERVER_IGNORE_USER_LIST
List of users to ignore from sync results.
SCIM_SERVER_MANAGE_GROUP_LIST
List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.
SCIM_SERVER_IGNORE_GROUP_LIST
List of groups to ignore from sync results.
C) SCIM Server Attributes
SCIM_SERVER_ATTRIBUTE_USERNAME
Attribute of a user's name.
userName
SCIM_SERVER_ATTRIBUTE_FIRSTNAME
Attribute of a user's first name.
name.givenName
SCIM_SERVER_ATTRIBUTE_LASTNAME
Attribute of a user's last/family name.
name.familyName
SCIM_SERVER_ATTRIBUTE_EMAIL
Attribute of a user’s email.
emails[primary-true].value
SCIM_SERVER_ATTRIBUTE_GROUPS
Attribute of a user’s group list.
groups
SCIM_SERVER_ATTRIBUTE_GROUPNAME
Attribute of a group's name.
displayName
SCIM_SERVER_ATTRIBUTE_GROUP_MEMBER
Attribute from group entry that is the list of members.
members
D) SCIM Server Username Attribute Modifications
SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL
Extract the user’s username from an email address. (e.g. username@domain.com -> username) The default is false.
false
SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_PREFIX
Prefix to prepend to username. The default is blank.
SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_POSTFIX
Postfix to append to the username. The default is blank.
SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_TOLOWER
Convert the user’s username to lowercase. The default is false.
false
SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_TOUPPER
Convert the user’s username to uppercase. The default is false.
false
SCIM_SERVER_ATTRIBUTE_USERNAME_VALUE_REGEX
Attribute to replace username to matching regex. The default is blank.
E) SCIM Server Group Name Attribute Modifications
SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL
Extract the group’s name from an email address (e.g. groupname@domain.com -> groupname). The default is false.
false
SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_PREFIX
Prefix to prepend to the group's name. The default is blank.
SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX
Postfix to append to the group's name. The default is blank.
SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER
Convert group's name to lowercase. The default is false.
false
SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER
Convert the group's name to uppercase. The default is false.
false
SCIM_SERVER_ATTRIBUTE_GROUPNAME_VALUE_REGEX
Attribute to replace group's name to matching regex. The default is blank.
F) Group Attribute Configuration
SCIM_SERVER_GROUP_ATTRIBUTE_LIST
The list of attribute keys to get from synced groups.
SCIM_SERVER_GROUP_ATTRIBUTE_VALUE_PREFIX
Append prefix to values of group attributes such as group name.
SCIM_SERVER_GROUP_ATTRIBUTE_KEY_PREFIX
Append prefix to key of group attributes such as group name.
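For illustration, a minimal vars.privacera-usersync.scimserver.yml sketch with bearer-token authentication (the token is a placeholder; the other values echo the examples above):
SCIM_SERVER_CONNECTOR: "DB1"
SCIM_SERVER_ENABLED: "true"
SCIM_SERVER_SERVICETYPE: "scimserver"
SCIM_SERVER_DATASOURCE_NAME: "databricks1"
SCIM_SERVER_BEARER_TOKEN: "<PLEASE_CHANGE>"
SCIM_SERVER_SYNC_INTERVAL: "3600"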
If NGINX Ingress is enabled and the NGINX controller is running on an internal LB, disable the ingress for Usersync so that it can pick a Public/Internet-facing LB by adding the following variable:
vi config/custom-vars/vars.kubernetes.nginx-ingress.yml
PRIVACERA_USERSYNC_K8S_NGINX_INGRESS_ENABLE: "false"
Run the following command:
cd ~/privacera/privacera-manager
./privacera-manager.sh update
OKTA
Run the following command to enable Privacera UserSync:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.yml config/custom-vars/
Enable the OKTA connector:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.privacera-usersync.okta.yml config/custom-vars/
vi config/custom-vars/vars.privacera-usersync.okta.yml
Edit the following properties:
Property
Description
Example
A) OKTA Connector Info
OKTA_CONNECTOR
Name of the connector.
OKTA
OKTA_ENABLED
Enabled status of connector. (true/false)
true
OKTA_SERVICETYPE
Type of service/connector.
okta
OKTA_DATASOURCE_NAME
Unique datasource name, used for identifying source of data and configuring priority list. (Optional)
OKTA_SERVICE_URL
Connector URL
https://{myOktaDomain}.okta.com
OKTA_API_TOKEN
API token
A8b2c84d-895a-4fea-82dc-401397b8e50c
OKTA_SYNC_INTERVAL
Frequency of usersync pulls and audit records in seconds. Default value is 3600, minimum value is 300.
3600
B) OKTA Manage/Ignore List of Users/Groups
OKTA_USER_LIST
List of users to manage from sync results. If this list is defined, all users not on this list will be ignored.
OKTA_IGNORE_USER_LIST
List of users to ignore from sync results.
OKTA_USER_LIST_STATUS
List of users to manage with status equal to: STAGED, PROVISIONED, ACTIVE, RECOVERY, PASSWORD_EXPIRED, LOCKED_OUT, or DEPROVISIONED. If this list is defined, all users not on this list will be ignored.
ACTIVE,STAGED
OKTA_USER_LIST_LOGIN
List of users to manage with user login name (can contain ). If this list is defined, all users not on this list will be ignored.
sw;mon,san
OKTA_USER_LIST_PROFILE_FIRSTNAME
List of users to manage with user first name (can contain ). If this list is defined, all users not on this list will be ignored.
sw;mon,san
OKTA_USER_LIST_PROFILE_LASTNAME
List of users to manage with user last name (can contain ). If this list is defined, all users not on this list will be ignored.
sw;mon,san
OKTA_LIST_PROFILE_EMAIL
List of users to manage with user email (can contain ). If this list is defined, all users not on this list will be ignored.
sw;mon,san
OKTA_LIST_TYPE
List of groups to manage with group type. If this list is defined, all groups not on this list will be ignored.
APP_GROUP,BUILT_IN,OKTA_GROUP
OKTA_GROUP_LIST
List of groups to manage from sync results. If this list is defined, all groups not on this list will be ignored.
OKTA_IGNORE_GROUP_LIST
List of groups to ignore from sync results.
OKTA_GROUP_LIST_SOURCE_ID
List of groups to manage with group source id. If this list is defined, all groups not on this list will be ignored.
0oa2v0el0gP90aqjJ0g7,0oa2v0el0gP90aqjJ0g8,0oa2v0el0gP90aqjJ0g0
OKTA_GROUP_LIST_PROFILE_NAME
List of groups to manage with group name. If this list is defined, all groups not on this list will be ignored.
group1,testGroup,testGroup2
C) OKTA Search
OKTA_SEARCH_USER_GROUPONLY
Boolean to only load users in groups.
false
OKTA_SEARCH_INCREMENTAL_ENABLED
Boolean to enable incremental search, syncing only changes since last search.
false
D) OKTA User/Group Attributes
OKTA_ATTRIBUTE_USERNAME
Attribute from user entry that would be treated as user name.
login
OKTA_ATTRIBUTE_FIRSTNAME
Attribute from user entry that would be treated as firstname.
firstName
OKTA_ATTRIBUTE_LASTNAME
Attribute from user entry that would be treated as lastname.
lastName
OKTA_ATTRIBUTE_EMAIL
Attribute from user entry that would be treated as email address.
email
OKTA_ATTRIBUTE_GROUPS
Attribute of user’s group list.
groups
OKTA_ATTRIBUTE_GROUPNAME
Attribute of a group’s name.
name
OKTA_ATTRIBUTE_ONLY
Sync only the attributes of users already synced from other services. (true/false)
false
E) OKTA Username Attribute Modifications
OKTA_ATTRIBUTE_USERNAME_VALUE_EXTRACTFROMEMAIL
Extract the user’s username from an email address. (e.g. username@domain.com -> username) The default is false.
false
OKTA_ATTRIBUTE_USERNAME_VALUE_PREFIX
Prefix to prepend to username. The default is blank.
OKTA_ATTRIBUTE_USERNAME_VALUE_POSTFIX
Postfix to append to the username. The default is blank.
OKTA_ATTRIBUTE_USERNAME_VALUE_TOLOWER
Convert the user’s username to lowercase. The default is false.
false
OKTA_ATTRIBUTE_USERNAME_VALUE_TOUPPER
Convert the user’s username to uppercase. The default is false.
false
OKTA_ATTRIBUTE_USERNAME_VALUE_REGEX
Attribute to replace username to matching regex. The default is blank.
F) OKTA Group Name Attribute Modifications
OKTA_ATTRIBUTE_GROUPNAME_VALUE_EXTRACTFROMEMAIL
Extract the group’s name from an email address (e.g. groupname@domain.com -> groupname). The default is false.
false
OKTA_ATTRIBUTE_GROUPNAME_VALUE_PREFIX
Prefix to prepend to the group's name. The default is blank.
OKTA_ATTRIBUTE_GROUPNAME_VALUE_POSTFIX
Postfix to append to the group's name. The default is blank.
OKTA_ATTRIBUTE_GROUPNAME_VALUE_TOLOWER
Convert group's name to lowercase. The default is false.
false
OKTA_ATTRIBUTE_GROUPNAME_VALUE_TOUPPER
Convert the group's name to uppercase. The default is false.
false
OKTA_ATTRIBUTE_GROUPNAME_VALUE_REGEX
Attribute to replace group's name to matching regex. The default is blank.
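For illustration, a minimal vars.privacera-usersync.okta.yml sketch using the example values above (replace the domain and token with your own):
OKTA_CONNECTOR: "OKTA"
OKTA_ENABLED: "true"
OKTA_SERVICETYPE: "okta"
OKTA_SERVICE_URL: "https://{myOktaDomain}.okta.com"
OKTA_API_TOKEN: "<PLEASE_CHANGE>"
OKTA_SYNC_INTERVAL: "3600"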
Run the following command:
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Privacera UserSync REST endpoints
When enabled, Privacera UserSync has REST API endpoints available to allow administrators to push users and groups that already exist in the UserSync cache to Privacera Portal.
Push users
POST - <UserSync_Host>:6086/api/pus/public/cache/load/users
The request body should contain a userList and/or connectorList. If no users and connectors are passed, all users will be pushed to Ranger.
Example request:
curl -X 'POST' \
  '<UserSync_Host>:6086/api/pus/public/cache/load/users' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "userList": ["User1", "User2"], "connectorList": ["AAD1","OKTA"] }'
Parameter | Type | Description |
---|---|---|
userList | string array | List of users to be added to Privacera Portal. |
connectorList | string array | All users associated with the provided connector(s) will be pushed. |
200 OK
404 Not Found: If one or more Users or Connectors are not found, the JSON response contains an error message.
Push groups
POST - <UserSync_Host>:6086/api/pus/public/cache/load/groups
The request body should contain a groupList and/or connectorList. If no groups and connectors are passed, all groups will be pushed to Ranger.
Example request:
curl -X 'POST' \
  '<UserSync_Host>:6086/api/pus/public/cache/load/groups' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "groupList": ["Group1", "Group2"], "connectorList": ["AAD1","OKTA"] }'
Parameter | Type | Description |
---|---|---|
groupList | string array | List of groups to be added to Privacera Portal. |
connectorList | string array | All groups associated with the provided connector(s) will be pushed. |
200 OK
404 Not Found: If one or more Groups or Connectors are not found, the JSON response contains an error message.
Migration from Apache Ranger UserSync to Privacera UserSync
Privacera generally recommends using its own version of UserSync (called Privacera UserSync) over the open-source Apache Ranger UserSync. Privacera has rewritten the Ranger UserSync to improve performance and features.
By default, all PrivaceraCloud customers are provisioned to use Privacera Usersync for improved performance capabilities and feature availability over Ranger UserSync. Below are the steps for platform customers to migrate.
All customers must migrate to use Privacera Usersync by March 31, 2024.
Migration steps
For Privacera Platform customers seeking to transition from Apache Ranger UserSync to Privacera UserSync, there are required manual steps to change the configuration.
Navigate to the privacera-manager/config/custom-vars folder.
cd privacera-manager/config/custom-vars
Rename the vars.usersync.ldaps.yml file to have a different extension (e.g. vars.usersync.ldaps.yml.bak).
Ensure that the Ranger UserSync POD/Image has stopped.
./privacera-manager.sh stop usersync
Copy the following files:
../sample-vars/vars.privacera-usersync.yml
../sample-vars/vars.privacera-usersync.ldap.yml
Edit the vars.privacera-usersync.ldap.yml file with the desired configurations.
Ranger UserSync Variable | Privacera UserSync Variable |
---|---|
USERSYNC_SYNC_LDAP_URL | LDAP_URL |
USERSYNC_SYNC_LDAP_BIND_DN | LDAP_BIND_DN |
USERSYNC_SYNC_LDAP_BIND_PASSWORD | LDAP_BIND_PASSWORD |
USERSYNC_SYNC_LDAP_SEARCH_BASE | LDAP_SEARCH_BASE |
USERSYNC_SYNC_LDAP_USER_SEARCH_BASE | LDAP_SEARCH_USER_BASE |
USERSYNC_SYNC_LDAP_USER_SEARCH_FILTER | LDAP_SEARCH_USER_FILTER |
USERSYNC_SYNC_GROUP_SEARCH_BASE | LDAP_SEARCH_GROUP_BASE |
USERSYNC_SYNC_LDAP_GROUP_SEARCH_FILTER | LDAP_SEARCH_GROUP_FILTER |
USERSYNC_SYNC_LDAP_OBJECT_CLASS | LDAP_OBJECT_USER_CLASS |
USERSYNC_SYNC_GROUP_OBJECT_CLASS | LDAP_OBJECT_GROUP_CLASS |
USERSYNC_SYNC_LDAP_SSL_ENABLED | PRIVACERA_USERSYNC_SYNC_LDAP_SSL_ENABLED |
USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS | PRIVACERA_USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS |
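For example (hypothetical values), a Ranger UserSync setting such as USERSYNC_SYNC_LDAP_URL: "ldap://dir.example.com:389" would be carried over into vars.privacera-usersync.ldap.yml as:
LDAP_URL: "ldap://dir.example.com:389"
LDAP_BIND_DN: "CN=Bind User,OU=example,DC=ad,DC=example,DC=com"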
Run PM update to deploy Privacera-UserSync:
cd ~/privacera/privacera-manager
./privacera-manager.sh update
For more information, see Privacera UserSync.
LDAP/LDAP-S
LDAP / LDAP-S
This topic covers how you can configure the Privacera Platform to attach and import users and groups defined in an external Active Directory (AD), LDAP, or LDAPS (LDAP over SSL) directory as data access users and groups.
Prerequisites
Before starting these steps, prepare the following. You need to configure various Privacera properties with these values, as detailed in Configuration.
Determine the following LDAP values:
The FQDN and protocol (http or https) of your LDAP server
DN
Complete Bind DN
Bind DN password
Top-level search base
User search base
To configure an SSL-enabled LDAP-S server, Privacera requires an SSL certificate. Set the Privacera property USERSYNC_SYNC_LDAP_SSL_ENABLED: "true". You then have these alternatives:
Allow Privacera Manager to download and create the certificate based on the LDAP-S server URL. Set the Privacera property USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS: "true".
Manually configure a truststore on the Privacera server that contains the certificate of the LDAP-S server. Set the Privacera property USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS: "false".
Configuration
SSH to instance as ${USER}.
Set the following properties. For LDAP-related property details and descriptions, see Access Manager.
USERSYNC_SYNC_LDAP_URL: "<PLEASE_CHANGE>"
USERSYNC_SYNC_LDAP_BIND_DN: "<PLEASE_CHANGE>"
USERSYNC_SYNC_LDAP_BIND_PASSWORD: "<PLEASE_CHANGE>"
USERSYNC_SYNC_LDAP_SEARCH_BASE: "<PLEASE_CHANGE>"
USERSYNC_SYNC_LDAP_USER_SEARCH_BASE: "<PLEASE_CHANGE>"
USERSYNC_SYNC_LDAP_SSL_ENABLED: "true"
USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS: "true"
Run Privacera Manager update.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration Properties
Property | Description | Example |
---|---|---|
USERSYNC_SYNC_LDAP_URL | "ldap://dir.ldap.us:389" (when NonSSL) or "ldaps://dir.ldap.us:636" (when SSL) | |
USERSYNC_SYNC_LDAP_BIND_DN | CN=Bind User,OU=example,DC=ad,DC=example,DC=com | |
USERSYNC_SYNC_LDAP_BIND_PASSWORD | ||
USERSYNC_SYNC_LDAP_SEARCH_BASE | OU=example,DC=ad,DC=example,DC=com | |
USERSYNC_SYNC_LDAP_USER_SEARCH_BASE | ||
USERSYNC_SYNC_LDAP_SSL_ENABLED | Set this to true if SSL is enabled on the LDAP server. | true |
USERSYNC_SYNC_LDAP_SSL_PM_GEN_TS | Set this to true if you want Privacera Manager to generate the truststore certificate. Set this to false if you want to manually provide the truststore certificate. To learn how to upload SSL certificates, [click here](../pm-ig/upload_custom_cert.md). | true |
Azure Active Directory (AAD)
Azure Active Directory - Data Access User Synchronization
This topic covers how you can synchronize users, groups, and service principals from your existing Azure Active Directory (AAD) domain.
Pre-requisites
Ensure the following pre-requisites are met:
Create an Azure AD application.
Get the values for the following Azure properties: Application (client) ID, Client secrets
CLI Configuration
SSH to the instance as ${USER}.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.usersync.azuread.yml config/custom-vars/
vi config/custom-vars/vars.usersync.azuread.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
USERSYNC_AZUREAD_TENANT_ID: "<PLEASE_CHANGE>" USERSYNC_AZUREAD_CLIENT_ID: "<PLEASE_CHANGE>" USERSYNC_AZUREAD_CLIENT_SECRET: "<PLEASE_CHANGE>" USERSYNC_AZUREAD_DOMAINS: "<PLEASE_CHANGE>" USERSYNC_AZUREAD_GROUPS: "<PLEASE_CHANGE>" USERSYNC_ENABLE: "true" USERSYNC_SOURCE: "azuread" USERSYNC_AZUREAD_USE_GROUP_LOOKUP_FIRST: "true" USERSYNC_SYNC_AZUREAD_USERNAME_RETRIVAL_FROM: "userPrincipalName" USERSYNC_SYNC_AZUREAD_EMAIL_RETRIVAL_FROM: "userPrincipalName" USERSYNC_SYNC_AZUREAD_GROUP_RETRIVAL_FROM: "displayName" SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_ENABLED: "false" SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_USERNAME_RETRIVAL_FROM: "appId"
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration Properties
Property Name | Description | Example |
---|---|---|
USERSYNC_AZUREAD_TENANT_ID | To get the value for this property, Go to Azure portal > Azure Active Directory > Properties > Tenant ID | 5a5cxxx-xxxx-xxxx-xxxx-c3172b33xxxx |
USERSYNC_AZUREAD_CLIENT_ID | Get the value by following the Pre-requisites section above. | 8a08xxxx-xxxx-xxxx-xxxx-6c0c95a0xxxx |
USERSYNC_AZUREAD_CLIENT_SECRET | Get the value by following the Pre-requisites section above. | ${CLIENT_SECRET} |
USERSYNC_AZUREAD_DOMAINS | To get the value for this property, Go to Azure portal > Azure Active Directory > Domains | companydomain1.com,companydomain2.com |
USERSYNC_AZUREAD_GROUPS | To get the value for this property, Go to Azure portal > Azure Active Directory > Groups | GROUP1,GROUP2,GROUP3 |
USERSYNC_ENABLE | Set to true to enable usersync. | true |
USERSYNC_SOURCE | Source from which users/groups are synced. Values: unix, ldap, azuread | azuread |
USERSYNC_AZUREAD_USE_GROUP_LOOKUP_FIRST | Set to true if you want to first sync all groups and then all the users within those groups. | true |
USERSYNC_SYNC_AZUREAD_USERNAME_RETRIVAL_FROM | Azure provides the user info in a JSON format. Assign a JSON attribute that is unique. This would be the name of the user in Ranger. | userPrincipalName |
USERSYNC_SYNC_AZUREAD_EMAIL_RETRIVAL_FROM | Azure provides the user info in a JSON format. Set the email from the JSON attribute of the Azure user entity. | userPrincipalName |
USERSYNC_SYNC_AZUREAD_GROUP_RETRIVAL_FROM | Azure provides the user info in a JSON format. Use the JSON attribute to retrieve group information for the user. | displayName |
SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_ENABLED | Set to true to sync Azure service principal to the Ranger user entity | false |
SYNC_AZUREAD_USER_SERVICE_PRINCIPAL_USERNAME_RETRIVAL_FROM | Azure provides the service principal info in a JSON format. Assign a JSON attribute that is unique. This would be the name of the user in Ranger. | appId |
Privacera Plugin
Databricks
Privacera Plugin in Databricks
Databricks
Privacera provides two types of plugin solutions for access control in Databricks clusters. Both plugins are mutually exclusive and cannot be enabled on the same cluster.
Databricks Spark Fine-Grained Access Control (FGAC) Plugin
Recommended for SQL, Python, R language notebooks.
Provides FGAC on databases with row filtering and column masking features.
Uses privacera_hive, privacera_s3, privacera_adls, privacera_files services for resource-based access control, and privacera_tag service for tag-based access control.
Uses the plugin implementation from Privacera.
Databricks Spark Object Level Access Control (OLAC) Plugin
The OLAC plugin was introduced to provide an alternative solution for Scala language clusters, since using the Scala language on Databricks Spark has some security concerns.
Recommended for Scala language notebooks.
Provides OLAC on S3 locations which you are trying to access via Spark.
Uses privacera_s3 service for resource-based access control and privacera_tag service for tag-based access control.
Uses the signed-authorization implementation from Privacera.
Databricks cluster deployment matrix with Privacera plugin
Job/Workflow use-case for automated cluster:
Run-Now will create the new cluster based on the definition mentioned in the job description.
Job Type | Languages | FGAC/DBX version | OLAC/DBX Version |
---|---|---|---|
Notebook | Python/R/SQL | Supported [7.3, 9.1 , 10.4] | |
JAR | Java/Scala | Not supported | Supported[7.3, 9.1 , 10.4] |
spark-submit | Java/Scala/Python | Not supported | Supported[7.3, 9.1 , 10.4] |
Python | Python | Supported [7.3, 9.1 , 10.4] | |
Python wheel | Python | Supported [9.1 , 10.4] | |
Delta Live Tables pipeline | | Not supported | Not supported |
Job on existing cluster:
Run-Now will use the existing cluster which is mentioned in the job description.
Job Type | Languages | FGAC/DBX version | OLAC |
---|---|---|---|
Notebook | Python/R/SQL | supported [7.3, 9.1 , 10.4] | Not supported |
JAR | Java/Scala | Not supported | Not supported |
spark-submit | Java/Scala/Python | Not supported | Not supported |
Python | Python | Not supported | Not supported |
Python wheel | Python | supported [9.1 , 10.4] | Not supported |
Delta Live Tables pipeline | | Not supported | Not supported |
Interactive use-case
Interactive use-case is running a notebook of SQL/Python on an interactive cluster.
Cluster Type | Languages | FGAC | OLAC |
---|---|---|---|
Standard clusters | Scala/Python/R/SQL | Not supported | Supported [7.3,9.1,10.4] |
High Concurrency clusters | Python/R/SQL | Supported [7.3,9.1,10.4] | Supported [7.3,9.1,10.4] |
Single Node | Scala/Python/R/SQL | Not supported | Supported [7.3,9.1,10.4] |
Databricks Spark Fine-Grained Access Control Plugin [FGAC] [Python, SQL]
Configuration
Run the following commands:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.databricks.plugin.yml config/custom-vars/
vi config/custom-vars/vars.databricks.plugin.yml
Edit the following properties to allow Privacera Platform to connect to your Databricks host. For property details and description, refer to the Configuration Properties below.
DATABRICKS_HOST_URL: "<PLEASE_UPDATE>" DATABRICKS_TOKEN: "<PLEASE_UPDATE>" DATABRICKS_WORKSPACES_LIST: - alias: DEFAULT databricks_host_url: "{{DATABRICKS_HOST_URL}}" token: "{{DATABRICKS_TOKEN}}" DATABRICKS_MANAGE_INIT_SCRIPT: "true" DATABRICKS_ENABLE: "true"
You can also add custom properties that are not included by default.
Run the following commands:
cd ~/privacera/privacera-manager ./privacera-manager.sh update
(Optional) By default, policies under the default service name, privacera_hive, are enforced. You can customize a different service name and enforce policies defined in the new name. See Configure Service Name for Databricks Spark Plugin.
Configuration properties
Property Name | Description | Example Values |
---|---|---|
DATABRICKS_HOST_URL | Enter the URL where the Databricks environment is hosted. | For Azure Databricks, DATABRICKS_HOST_URL: "https://xdx-66506xxxxxxxx.2.azuredatabricks.net/?o=665066931xxxxxxx" For AWS Databricks, DATABRICKS_HOST_URL: "https://xxx-7xxxfaxx-xxxx.cloud.databricks.com" |
DATABRICKS_TOKEN | Enter the token. To generate the token: 1. Log in to your Databricks account. 2. Click the user profile icon in the upper right corner of your Databricks workspace. 3. Click User Settings. 4. Click the Generate New Token button. 5. Optionally enter a description (comment) and expiration period. 6. Click the Generate button. 7. Copy the generated token. | DATABRICKS_TOKEN: "xapid40xxxf65xxxxxxe1470eayyyyycdc06" |
DATABRICKS_WORKSPACES_LIST | Add multiple Databricks workspaces to connect to Ranger. | |
DATABRICKS_ENABLE | If set to 'true', Privacera Manager will create the Databricks cluster init script ranger_enable.sh at ~/privacera/privacera-manager/output/databricks/ranger_enable.sh. | "true" "false" |
DATABRICKS_MANAGE_INIT_SCRIPT | If set to 'true', Privacera Manager will upload the init script (ranger_enable.sh) to the identified Databricks host. If set to 'false', upload the following two files to the DBFS location. The files can be located at ~/privacera/privacera-manager/output/databricks. | "true" "false" |
| Use the Java agent to assign a string of extra JVM options to pass to the Spark driver. | -javaagent:/databricks/jars/privacera-agent.jar |
| Property to map the logged-in user to the Ranger user for row-filter policy. It is mapped with the Databricks cluster-level property. | current_user() |
DATABRICKS_SPARK_PRIVACERA_VIEW_LEVEL_MASKING_ROWFILTER_EXTENSION_ENABLE | Property to enable masking, row-filter and data_admin access on views. This is a Privacera Manager (PM) property. It is mapped with the Databricks cluster-level property. | false |
| Configure the Databricks cluster policy. Add the following JSON in the text area: [{"Note":"First spark conf","key":"spark.hadoop.first.spark.test","value":"test1"},{"Note":"Second spark conf","key":"spark.hadoop.first.spark.test","value":"test2"}] | |
DATABRICKS_POST_PLUGIN_COMMAND_LIST | This property is not part of the default YAML file, but can be added if required. Use this property if you want to run a specific set of commands in the Databricks init script. | The following example will be added to the cluster init script to allow Athena JDBC via data access server. DATABRICKS_POST_PLUGIN_COMMAND_LIST: - sudo iptables -I OUTPUT 1 -p tcp -m tcp --dport 8181 -j ACCEPT - sudo curl -k -u user:password {{PORTAL_URL}}/api/dataserver/cert?type=dataserver_jks -o /etc/ssl/certs/dataserver.jks - sudo chmod 755 /etc/ssl/certs/dataserver.jks |
| This property allows you to blacklist APIs to enable security. This is a Privacera Manager (PM) property. It is mapped with the Databricks cluster-level property. | The following example will be added to the cluster init script to allow Athena JDBC via data access server. DATABRICKS_POST_PLUGIN_COMMAND_LIST: - sudo iptables -I OUTPUT 1 -p tcp -m tcp --dport 8181 -j ACCEPT - sudo curl -k -u user:password {{PORTAL_URL}}/api/dataserver/cert?type=dataserver_jks -o /etc/ssl/certs/dataserver.jks - sudo chmod 755 /etc/ssl/certs/dataserver.jks |
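For DATABRICKS_WORKSPACES_LIST, the following is only a sketch of how more than one workspace might be listed, extending the single-workspace structure shown in the configuration above; the second alias, URL, and token are placeholders, not values from this guide:
DATABRICKS_WORKSPACES_LIST:
  - alias: DEFAULT
    databricks_host_url: "{{DATABRICKS_HOST_URL}}"
    token: "{{DATABRICKS_TOKEN}}"
  - alias: WORKSPACE2
    databricks_host_url: "https://xxx-2222222-yyyy.cloud.databricks.com"
    token: "<WORKSPACE2_TOKEN>"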
Managing init script
Automatic upload
If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is 'true', then the init script will be uploaded automatically to your Databricks host. The init script will be uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable.sh, where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.
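To confirm the automatic upload, you can run the same check used in the manual steps below; this assumes the Databricks CLI is configured with a 'privacera' profile as described in the Manual upload section:
dbfs ls dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ --profile privacera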
Manual upload
If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is 'false', then the init script must be uploaded manually to your Databricks host.
To avoid the manual steps below, set DATABRICKS_MANAGE_INIT_SCRIPT=true and follow the instructions outlined in Automatic upload.
Open a terminal and connect to Databricks account using your Databricks login credentials/token.
Connect using login credentials:
If you're using login credentials, then run the following command:
databricks configure --profile privacera
Enter the Databricks URL:
Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
Enter the username and password: Username: email-id@example.com Password:
Connect using Databricks token:
If you don't have a Databricks token, you can generate one. For more information, refer to Generate a personal access token.
If you're using a token, run the following command:
databricks configure --token --profile privacera
Enter the Databricks URL:
Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
Enter the token:
Token:
To check if the connection to your Databricks account is established, run the following command:
dbfs ls dbfs:/ --profile privacera
You should see the list of files in the output, if you are connected to your account.
Upload files manually to Databricks:
Copy the following files to DBFS. They are available on the PM host at ~/privacera/privacera-manager/output/databricks:
ranger_enable.sh
privacera_spark_plugin.conf
privacera_spark_plugin_job.conf
privacera_custom_conf.zip
Run the following commands. You can get the value of <DEPLOYMENT_ENV_NAME> from the file ~/privacera/privacera-manager/config/vars.privacera.yml.
export DEPLOYMENT_ENV_NAME=<DEPLOYMENT_ENV_NAME>
dbfs mkdirs dbfs:/privacera/${DEPLOYMENT_ENV_NAME} --profile privacera
dbfs cp ranger_enable.sh dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
dbfs cp privacera_spark_plugin.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
dbfs cp privacera_spark_plugin_job.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
dbfs cp privacera_custom_conf.zip dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
Verify the files have been uploaded.
dbfs ls dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
The init script will be uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable.sh, where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.
Configure Databricks Cluster
Once the update completes successfully, log on to the Databricks console with your account and open the target cluster, or create a new target cluster.
Open the Cluster dialog and enter Edit mode.
In the Configuration tab, select Advanced Options > Spark.
Add the following content to the Spark Config edit box. For more information on the Spark config properties, click here.
New Properties
Note
From Privacera 5.0.6.1 Release onwards, it is recommended to replace the Old Properties with the New Properties. However, the Old Properties will also continue to work.
For Databricks versions earlier than 7.3, only the Old Properties should be used, since those versions are in extended support.
spark.databricks.cluster.profile serverless
spark.databricks.isv.product privacera
spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
spark.databricks.repl.allowedLanguages sql,python,r
Old Properties
spark.databricks.cluster.profile serverless
spark.databricks.repl.allowedLanguages sql,python,r
spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
spark.databricks.isv.product privacera
spark.databricks.pyspark.enableProcessIsolation true
In the Configuration tab, in Edit mode, open Advanced Options (at the bottom of the dialog) and then set the init script path. For the <DEPLOYMENT_ENV_NAME> variable, enter the deployment name as defined for the DEPLOYMENT_ENV_NAME variable in vars.privacera.yml.
dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable.sh
In the Table Access Control section, clear the following checkboxes: Enable table access control and only allow Python and SQL commands, and Enable credential passthrough for user-level data access and only allow Python and SQL commands.
Save (Confirm) this configuration.
Start (or Restart) the selected Databricks Cluster.
To enable view-level access control (via Data-Admin), and view-level row-level filtering and column masking, add the property DATABRICKS_SPARK_PRIVACERA_VIEW_LEVEL_MASKING_ROWFILTER_EXTENSION_ENABLE: "true" in custom-vars. Search for this property in Spark Properties for more information. To learn how to use the property, see Apply View-level Access Control.
By default, certain Python packages are blocked on the Databricks cluster for security compliance. If you still wish to use these packages, see Whitelist py4j security manager via S3 or DBFS.
If you want to enable JWT-based user authentication for your Databricks clusters, see JWT for Databricks.
If you want PM to add cluster policies in Databricks, see Configure Databricks Cluster Policy.
If you want to add additional Spark properties for your Databricks cluster, see Spark Properties for Databricks Cluster.
Validation
In order to help evaluate the use of Privacera with Databricks, Privacera provides a set of Privacera Manager 'demo' notebooks. These can be downloaded from Privacera S3 repository using either your favorite browser, or a command line 'wget'. Use the notebook/sql sequence that matches your cluster.
Download using your browser (just click on the correct file for your cluster below):
https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPlugin.sql
If AWS S3 is configured from your Databricks cluster: https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginS3.sql
If ADLS Gen2 is configured from your Databricks cluster: https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginADLS.sql
or, if you are working from a Linux command line, use the 'wget' command to download.
wget https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPlugin.sql -O PrivaceraSparkPlugin.sql
wget https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginS3.sql -O PrivaceraSparkPluginS3.sql
wget https://privacera.s3.amazonaws.com/public/pm-demo-data/databricks/PrivaceraSparkPluginADLS.sql -O PrivaceraSparkPluginADLS.sql
Import the Databricks notebook:
Log in to the Databricks Console
Select Workspace > Users > Your User.
From the drop down menu, select Import and choose the file downloaded.
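If you prefer the command line over the UI, the same file can be imported with the Databricks CLI. The following is only a sketch: it assumes the legacy Databricks CLI with the 'privacera' profile configured earlier, and the target workspace path is a hypothetical example.
databricks workspace import --language SQL --format SOURCE --profile privacera PrivaceraSparkPlugin.sql /Users/<your-user>/PrivaceraSparkPlugin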
Follow the suggested steps in the text of the notebook to exercise and validate Privacera with Databricks.
Databricks Spark Object-level Access Control Plugin [OLAC] [Scala]
Prerequisites
Ensure the following prerequisites are met:
Dataserver should be installed and confirmed working:
For AWS, configure AWS S3 Dataserver
For Azure, configure Azure Dataserver
Configure Databricks Spark Plugin.
Configuration
Run the following commands.
cd ~/privacera/privacera-manager/
cp config/sample-vars/vars.databricks.scala.yml config/custom-vars/
vi config/custom-vars/vars.databricks.scala.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
DATASERVER_DATABRICKS_ALLOWED_URLS: "<PLEASE_UPDATE>"
DATASERVER_AWS_STS_ROLE: "<PLEASE_CHANGE>"
Run the following commands.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
| Set the property to enable/disable Databricks Scala. This is found under the Databricks Signed URL Configuration For Scala Clusters section. | |
DATASERVER_DATABRICKS_ALLOWED_URLS | Add a URL or comma-separated URLs. Privacera Dataserver serves only those URLs mentioned in this property. | https://xxx-7xxxfaxx-xxxx.cloud.databricks.com |
DATASERVER_AWS_STS_ROLE | Add the instance profile ARN of the AWS role, which can access Delta Files in Databricks. | arn:aws:iam::111111111111:role/assume-role |
| Configure the Databricks cluster policy. Add the following JSON in the text area: [{"Note":"First spark conf", "key":"spark.hadoop.first.spark.test", "value":"test1"}, {"Note":"Second spark conf", "key":"spark.hadoop.first.spark.test", "value":"test2"}] | |
Managing init script
Automatic Upload
If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is 'true', the init script will be uploaded automatically to your Databricks host. The init script will be uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable_scala.sh, where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.
Manual Upload
If DATABRICKS_ENABLE is 'true' and DATABRICKS_MANAGE_INIT_SCRIPT is 'false', the init script must be uploaded manually to your Databricks host.
Open a terminal and connect to Databricks account using your Databricks login credentials/token.
Connect using login credentials:
If you're using login credentials, then run the following command.
databricks configure --profile privacera
Enter the Databricks URL.
Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
Enter the username and password.
Username: email-id@yourdomain.com Password:
Connect using Databricks token:
If you don't have a Databricks token, you can generate one. For more information, refer to Generate a personal access token.
If you're using a token, run the following command.
databricks configure --token --profile privacera
Enter the Databricks URL.
Databricks Host (should begin with https://): https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/
Enter the token.
Token:
To check if the connection to your Databricks account is established, run the following command.
dbfs ls dbfs:/ --profile privacera
You should see the list of files in the output, if you are connected to your account.
Upload files manually to Databricks.
Copy the following files to DBFS. They are available on the PM host at ~/privacera/privacera-manager/output/databricks:
ranger_enable_scala.sh
privacera_spark_scala_plugin.conf
privacera_spark_scala_plugin_job.conf
Run the following commands. You can get the value of <DEPLOYMENT_ENV_NAME> from the file ~/privacera/privacera-manager/config/vars.privacera.yml.
export DEPLOYMENT_ENV_NAME=<DEPLOYMENT_ENV_NAME>
dbfs mkdirs dbfs:/privacera/${DEPLOYMENT_ENV_NAME} --profile privacera
dbfs cp ranger_enable_scala.sh dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
dbfs cp privacera_spark_scala_plugin.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
dbfs cp privacera_spark_scala_plugin_job.conf dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
Verify the files have been uploaded.
dbfs ls dbfs:/privacera/${DEPLOYMENT_ENV_NAME}/ --profile privacera
The init script is uploaded to dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable_scala.sh, where <DEPLOYMENT_ENV_NAME> is the value of DEPLOYMENT_ENV_NAME mentioned in vars.privacera.yml.
Configure Databricks cluster
Once the update completes successfully, log on to the Databricks console with your account and open the target cluster, or create a new target cluster.
Open the Cluster dialog and enter Edit mode.
In the Configuration tab, in Edit mode, open Advanced Options (at the bottom of the dialog) and then the Spark tab.
Add the following content to the Spark Config edit box. For more information on the Spark config properties, click here.
New Properties
spark.databricks.isv.product privacera
spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
spark.executor.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
spark.databricks.repl.allowedLanguages sql,python,r,scala
spark.databricks.delta.formatCheck.enabled false
Old Properties
spark.databricks.cluster.profile serverless
spark.databricks.delta.formatCheck.enabled false
spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
spark.executor.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
spark.databricks.isv.product privacera
spark.databricks.repl.allowedLanguages sql,python,r,scala
Note
From Privacera 5.0.6.1 Release onwards, it is recommended to replace the Old Properties with the New Properties. However, the Old Properties will also continue to work.
For Databricks versions earlier than 7.3, only the Old Properties should be used, since those versions are in extended support.
(Optional) To use regional endpoint for S3 access, add the following content to the Spark Config edit box.
spark.hadoop.fs.s3a.endpoint https://s3.<region>.amazonaws.com
spark.hadoop.fs.s3.endpoint https://s3.<region>.amazonaws.com
spark.hadoop.fs.s3n.endpoint https://s3.<region>.amazonaws.com
In the Configuration tab, in Edit mode, open Advanced Options (at the bottom of the dialog) and then set the init script path. For the <DEPLOYMENT_ENV_NAME> variable, enter the deployment name as defined for the DEPLOYMENT_ENV_NAME variable in vars.privacera.yml.
dbfs:/privacera/<DEPLOYMENT_ENV_NAME>/ranger_enable_scala.sh
Save (Confirm) this configuration.
Start (or Restart) the selected Databricks Cluster.
Related information
For further reading, see:
If you want to enable JWT-based user authentication for your Databricks clusters, see JWT for Databricks.
If you want PM to add cluster policies in Databricks, see Configure Databricks Cluster Policy.
If you want to add additional Spark properties for your Databricks cluster, see Spark Properties for Databricks Cluster.
Spark standalone
Privacera plugin in Spark standalone
This section covers how you can use Privacera Manager to generate the setup script and Spark custom configuration for SSL/TLS to install the Privacera plugin in an open-source Spark environment.
The steps outlined below are only applicable to the Spark 3.x version.
Prerequisites
Ensure the following prerequisites are met:
A working Spark environment.
Privacera services must be up and running.
Configuration
SSH to the instance as USER.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.spark-standalone.yml config/custom-vars/
vi config/custom-vars/vars.spark-standalone.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
SPARK_STANDALONE_ENABLE: "true"
SPARK_ENV_TYPE: "<PLEASE_CHANGE>"
SPARK_HOME: "<PLEASE_CHANGE>"
SPARK_USER_HOME: "<PLEASE_CHANGE>"
Run the following commands.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
After the update is complete, the setup scripts (privacera_setup.sh, standalone_spark_FGAC.sh, standalone_spark_OLAC.sh) and the Spark custom configuration for SSL (spark_custom_conf.zip) will be generated at ~/privacera/privacera-manager/output/spark-standalone.
You can enable either FGAC or OLAC in your Spark environment.
Enable FGAC
To enable Fine-grained access control (FGAC), do the following:
Copy
standalone_spark_FGAC.sh
andspark_custom_conf.zip
. Both the files should be placed under the same folder.Add permissions to execute the script.
chmod +x standalone_spark_FGAC.sh
Run the script to install the Privacera plugin in your Spark environment.
./standalone_spark_FGAC.sh
Enable OLAC
To enable Object level access control (OLAC), do the following:
Copy
standalone_spark_OLAC.sh
andspark_custom_conf.zip
. Both the files should be placed under the same folder.Add permissions to execute the script.
chmod +x standalone_spark_OLAC.sh
Run the script to install the Privacera plugin in your Spark environment.
./standalone_spark_OLAC.sh
Configuration properties
Property | Description | Example |
---|---|---|
SPARK_STANDALONE_ENABLE | Property to enable generating the setup script and configs for Spark standalone plugin installation. | true |
SPARK_ENV_TYPE | Set the environment type. It can be any user-defined type. For example, if you're working in an environment that runs locally, you can set the type as local; for a production environment, set it as prod. | local |
SPARK_HOME | Home path of your Spark installation. | ~/privacera/spark/spark-3.1.1-bin-hadoop3.2 |
SPARK_USER_HOME | User home directory of your Spark installation. | /home/ec2-user |
| Use the property to enable/disable the fallback behavior to the privacera_files and privacera_hive services. It determines whether the user should be allowed or denied access to the resource files. To enable the fallback, set to true; to disable, set to false. | true |
Validations
To verify the successful installation of Privacera plugin, do the following:
Create an S3 bucket ${S3_BUCKET} for sample testing.
Download sample data using the following link and put it in the ${S3_BUCKET} at location (s3://${S3_BUCKET}/customer_data).
wget https://privacera-demo.s3.amazonaws.com/data/uploads/customer_data_clear/customer_data_without_header.csv
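For example, assuming the AWS CLI is configured with write access to the bucket (the bucket name is a placeholder), the downloaded data can be staged as follows:
aws s3 mb s3://${S3_BUCKET}
aws s3 cp customer_data_without_header.csv s3://${S3_BUCKET}/customer_data/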
(Optional) Add AWS JARS in Spark. Download the JARS according to the version of Spark Hadoop in your environment.
cd <SPARK_HOME>/jars
For Spark-3.1.1 - Hadoop 3.2 version,
wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.2.0/hadoop-aws-3.2.0.jar
wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.375/aws-java-sdk-bundle-1.11.375.jar
Run the following command.
cd <SPARK_HOME>/bin
Run the spark-shell to execute scala commands.
./spark-shell
Validations with JWT Token
Run the following command.
cd <SPARK_HOME>/bin
Set the JWT_TOKEN.
JWT_TOKEN="<JWT_TOKEN>"
Run the following command to start spark-shell with parameters.
./spark-shell --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" --conf "spark.hadoop.privacera.jwt.oauth.enable=true"
Validations with JWT token and public key
Create a local file with the public key, if the JWT token is generated by private/public key combination.
Set the following according to the payload of JWT Token.
JWT_TOKEN="<JWT_TOKEN>"
# The following variables are optional; set them only if the token has them, otherwise leave them empty
JWT_TOKEN_ISSUER="<JWT_TOKEN_ISSUER>"
JWT_TOKEN_PUBLIC_KEY_FILE="<JWT_TOKEN_PUBLIC_KEY_FILE_PATH>"
JWT_TOKEN_USER_KEY="<JWT_TOKEN_USER_KEY>"
JWT_TOKEN_GROUP_KEY="<JWT_TOKEN_GROUP_KEY>"
JWT_TOKEN_PARSER_TYPE="<JWT_TOKEN_PARSER_TYPE>"
Run the following command to start spark-shell with parameters.
./spark-shell --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" --conf "spark.hadoop.privacera.jwt.oauth.enable=true" --conf "spark.hadoop.privacera.jwt.token.publickey=${JWT_TOKEN_PUBLIC_KEY_FILE}" --conf "spark.hadoop.privacera.jwt.token.issuer=${JWT_TOKEN_ISSUER}" --conf "spark.hadoop.privacera.jwt.token.parser.type=${JWT_TOKEN_PARSER_TYPE}" --conf "spark.hadoop.privacera.jwt.token.userKey=${JWT_TOKEN_USER_KEY}" --conf "spark.hadoop.privacera.jwt.token.groupKey=${JWT_TOKEN_GROUP_KEY}"
Use cases
Add a policy in Access Manager with read permission to ${S3_BUCKET}.
val file_path = "s3a://${S3_BUCKET}/customer_data/customer_data_without_header.csv"
val df = spark.read.csv(file_path)
df.show(5)
Add a policy in Access Manager with delete and write permission to ${S3_BUCKET}.
df.write.format("csv").mode("overwrite").save("s3a://${S3_BUCKET}/csv/customer_data.csv")
Spark on EKS
Privacera plugin in Spark on EKS
This section covers how you can use Privacera Manager to generate the setup script and Spark custom configuration for SSL to install the Privacera plugin in Spark on an EKS cluster.
Prerequisites
Ensure the following prerequisites are met:
Running Spark on an EKS cluster.
Privacera services must be up and running.
Configuration
SSH to the instance as USER.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.spark-standalone.yml config/custom-vars/
vi config/custom-vars/vars.spark-standalone.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
SPARK_STANDALONE_ENABLE: "true"
SPARK_ENV_TYPE: "<PLEASE_CHANGE>"
SPARK_HOME: "<PLEASE_CHANGE>"
SPARK_USER_HOME: "<PLEASE_CHANGE>"
Run the following commands:
cd ~/privacera/privacera-manager ./privacera-manager.sh update
After the update is complete, the Spark custom configuration for SSL (spark_custom_conf.zip) will be generated at ~/privacera/privacera-manager/output/spark-standalone.
Create the Spark Docker Image
Run the following command to export PRIVACERA_BASE_DOWNLOAD_URL:
export PRIVACERA_BASE_DOWNLOAD_URL=<PRIVACERA_BASE_DOWNLOAD_URL>
Create a folder.
mkdir -p ~/privacera-spark-plugin cd ~/privacera-spark-plugin
Download and extract package using wget.
wget ${PRIVACERA_BASE_DOWNLOAD_URL}/spark-plugin/k8s-spark-pkg.tar.gz -O k8s-spark-pkg.tar.gz
tar xzf k8s-spark-pkg.tar.gz
rm -r k8s-spark-pkg.tar.gz
Copy the spark_custom_conf.zip file from the Privacera Manager output folder into the files folder.
cp ~/privacera/privacera-manager/output/spark-standalone/spark_custom_conf.zip files/spark_custom_conf.zip
You can build either the OLAC Docker image or the FGAC Docker image.
OLAC
To build the OLAC Docker image, use the following command:
./build_image.sh ${PRIVACERA_BASE_DOWNLOAD_URL} OLAC
FGAC
To build the FGAC Docker image, use the following command:
./build_image.sh ${PRIVACERA_BASE_DOWNLOAD_URL} FGAC
Test the Spark Docker image.
Create an S3 bucket ${S3_BUCKET} for sample testing.
Download sample data using the following link and put it in the ${S3_BUCKET} at location (s3://${S3_BUCKET}/customer_data).
wget https://privacera-demo.s3.amazonaws.com/data/uploads/customer_data_clear/customer_data_without_header.csv
Start Docker in an interactive mode.
IMAGE=privacera-spark-plugin:latest
docker run --rm -i -t ${IMAGE} bash
Start spark-shell inside the Docker container.
JWT_TOKEN="<PLEASE_CHANGE>"
cd /opt/privacera/spark/bin
./spark-shell \
  --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" \
  --conf "spark.hadoop.privacera.jwt.oauth.enable=true"
Run the following command to read the S3 file:
val df= spark.read.csv("s3a://${S3_BUCKET}/customer_data/customer_data_without_header.csv")
Exit the Docker shell.
exit
Publish the Spark Docker Image into your Docker Registry.
For HUB, HUB_USERNAME, and HUB_PASSWORD, use the Docker hub URL and login credentials.
For ENV_TAG, its value can be user-defined depending on your deployment environment, such as development, production or test. For example, ENV_TAG=dev can be used for a development environment.
HUB=<PLEASE_CHANGE>
HUB_USERNAME=<PLEASE_CHANGE>
HUB_PASSWORD=<PLEASE_CHANGE>
ENV_TAG=<PLEASE_CHANGE>
DEST_IMAGE=${HUB}/privacera-spark-plugin:${ENV_TAG}
SOURCE_IMAGE=privacera-spark-plugin:latest
docker login -u ${HUB_USERNAME} -p ${HUB_PASSWORD} ${HUB}
docker tag ${SOURCE_IMAGE} ${DEST_IMAGE}
docker push ${DEST_IMAGE}
Deploy Spark Plugin on EKS cluster.
SSH to the EKS cluster where you want to deploy the Spark plugin.
Run the following command to export PRIVACERA_BASE_DOWNLOAD_URL:
export PRIVACERA_BASE_DOWNLOAD_URL=<PRIVACERA_BASE_DOWNLOAD_URL>
Create a folder.
mkdir ~/privacera-spark-plugin cd ~/privacera-spark-plugin
Download and extract package using wget.
wget ${PRIVACERA_BASE_DOWNLOAD_URL}/plugin/spark/k8s-spark-deploy.tar.gz -O k8s-spark-deploy.tar.gz
tar xzf k8s-spark-deploy.tar.gz
rm -r k8s-spark-deploy.tar.gz
cd k8s-spark-deploy/
Open the penv.sh file and substitute the values of the following properties; refer to the table below:
Property | Description | Example |
---|---|---|
SPARK_NAME_SPACE | Kubernetes namespace | privacera-spark-plugin-test |
SPARK_PLUGIN_ROLE_BINDING | Spark role binding | privacera-sa-spark-plugin-role-binding |
SPARK_PLUGIN_SERVICE_ACCOUNT | Spark service account | privacera-sa-spark-plugin |
SPARK_PLUGN_ROLE | Spark service account role | privacera-sa-spark-plugin-role |
SPARK_PLUGIN_APP_NAME | Spark service account role | privacera-sa-spark-plugin-role |
SPARK_PLUGIN_IMAGE | Docker image with hub | myhub.docker.com/privacera-spark-plugin:prod-olac |
SPARK_DOCKER_PULL_SECRET | Secret for docker-registry | spark-plugin-docker-hub |
Run the following commands to replace the property values in the Kubernetes deployment .yml files:
mkdir -p backup
cp *.yml backup/
./replace.sh
Run the following command to create Kubernetes resources:
kubectl apply -f namespace.yml
kubectl apply -f service-account.yml
kubectl apply -f role.yml
kubectl apply -f role-binding.yml
Run the following command to create a secret for docker-registry:
kubectl create secret docker-registry spark-plugin-docker-hub --docker-server=<PLEASE_CHANGE> --docker-username=<PLEASE_CHANGE> --docker-password='<PLEASE_CHANGE>' --namespace=<PLEASE_CHANGE>
Run the following command to deploy a sample Spark application:
Note
This is a sample file used for deployment. As per your use case, you can create your own Spark deployment file and deploy the Docker image.
kubectl apply -f privacera-spark-examples.yml -n ${SPARK_NAME_SPACE}
This deploys a Spark application with the Privacera plugin in a Kubernetes pod and keeps the pod running, so that you can use it in interactive mode.
Configuration properties
Property | Description | Example |
---|---|---|
SPARK_STANDALONE_ENABLE | Property to enable generating the setup script and configs for Spark standalone plugin installation. | true |
SPARK_ENV_TYPE | Set the environment type. It can be any user-defined type. For example, if you're working in an environment that runs locally, you can set the type as local; for a production environment, set it as prod. | local |
SPARK_HOME | Home path of your Spark installation. | ~/privacera/spark/spark-3.1.1-bin-hadoop3.2 |
SPARK_USER_HOME | User home directory of your Spark installation. | /home/ec2-user |
| Use the property to enable/disable the fallback behavior to the privacera_files and privacera_hive services. It determines whether the user should be allowed or denied access to the resource files. To enable the fallback, set to true; to disable, set to false. | true |
Validation
Get all the resources.
kubectl get all -n ${SPARK_NAME_SPACE}
Copy POD ID that you will need for spark-master connection.
Get the cluster info.
kubectl cluster-info
Copy the Kubernetes control plane URL from the above output; you will need it for the spark-shell command, for example (https://xxxxxxxxxxxxxxxxxxxxxxx.yl4.us-east-1.eks.amazonaws.com).
When using the URL for the EKS_SERVER property in step 4, prefix the property value with k8s://. The following is an example of the property:
EKS_SERVER="k8s://https://xxxxxxxxxxxxxxxxxxxxxxx.yl4.us-east-1.eks.amazonaws.com"
Connect to Kubernetes master node.
kubectl -n ${SPARK_NAME_SPACE} exec -it <POD_ID> -- bash
Set the following properties:
SPARK_NAME_SPACE="<PLEASE_CHANGE>"
SPARK_PLUGIN_SERVICE_ACCOUNT="<PLEASE_CHANGE>"
SPARK_PLUGIN_IMAGE="<PLEASE_CHANGE>"
SPARK_DOCKER_PULL_SECRET="spark-plugin-docker-hub"
EKS_SERVER="<PLEASE_CHANGE>"
JWT_TOKEN="<PLEASE_CHANGE>"
Run the following commands to open spark-shell. The command contains all the setup which is required to open the spark-shell.
cd /opt/privacera/spark/bin
./spark-shell --master ${EKS_SERVER} \
  --deploy-mode client \
  --conf spark.kubernetes.authenticate.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT} \
  --conf spark.kubernetes.namespace=${SPARK_NAME_SPACE} \
  --conf spark.kubernetes.authenticate.submission.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  --conf spark.kubernetes.authenticate.submission.oauthTokenFile=/var/run/secrets/kubernetes.io/serviceaccount/token \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT} \
  --conf spark.kubernetes.container.image=${SPARK_PLUGIN_IMAGE} \
  --conf spark.kubernetes.container.image.pullPolicy=Always \
  --conf spark.kubernetes.container.image.pullSecrets=${SPARK_DOCKER_PULL_SECRET} \
  --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" \
  --conf "spark.hadoop.privacera.jwt.oauth.enable=true" \
  --conf spark.driver.bindAddress='0.0.0.0' \
  --conf spark.driver.host=$SPARK_PLUGIN_POD_IP \
  --conf spark.port.maxRetries=4 \
  --conf spark.kubernetes.driver.pod.name=$SPARK_PLUGIN_POD_NAME
Run the following command using spark-submit with JWT authentication.
./spark-submit \
  --master ${EKS_SERVER} \
  --name spark-cloud-new \
  --deploy-mode cluster \
  --conf spark.kubernetes.authenticate.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT} \
  --conf spark.kubernetes.namespace=${SPARK_NAME_SPACE} \
  --conf spark.kubernetes.authenticate.submission.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  --conf spark.kubernetes.authenticate.submission.oauthTokenFile=/var/run/secrets/kubernetes.io/serviceaccount/token \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=${SPARK_PLUGIN_SERVICE_ACCOUNT} \
  --conf spark.kubernetes.container.image=${SPARK_PLUGIN_IMAGE} \
  --conf spark.kubernetes.container.image.pullPolicy=Always \
  --conf spark.kubernetes.container.image.pullSecrets=${SPARK_DOCKER_PULL_SECRET} \
  --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" \
  --conf spark.driver.bindAddress='0.0.0.0' \
  --conf spark.driver.host=$SPARK_PLUGIN_POD_IP \
  --conf spark.port.maxRetries=4 \
  --conf spark.kubernetes.driver.pod.name=$SPARK_PLUGIN_POD_NAME \
  --class com.privacera.spark.poc.SparkSample \
  <your-code-jar/file>
To check the read access on the S3 file, run the following command in the open spark-shell:
val df = spark.read.csv("s3a://${S3_BUCKET}/customer_data/customer_data_without_header.csv")
df.show()
To check the write access on the S3 file, run the following command in the open spark-shell:
df.write.format("csv").mode("overwrite").save("s3a://${S3_BUCKET}/output/k8s/sample/csv")
Check the Audit logs on the Privacera Portal.
To verify the spark-shell setup, open another SSH connection for Kubernetes cluster and run the following command to check the running pods:
kubectl get pods -n ${SPARK_NAME_SPACE}
You will see the Spark executor pods with the -exec-x suffix, for example, spark-shell-xxxxxxxxxxxxxxxx-exec-1 and spark-shell-xxxxxxxxxxxxxxxx-exec-2.
Portal SSO with PingFederate
Privacera portal leverages PingIdentity’s Platform Portal for authentication via SAML. For this integration, there are configuration steps in both Privacera portal and PingIdentity.
Configuration steps for PingIdentity
Sign in to your PingIdentity account.
Under Your Environments , click Administrators.
Select Connections from the left menu.
In the Applications section, click on the + button to add a new application.
Enter an Application Name (such as Privacera Portal SAML) and provide a description (optionally add an icon). For the Application Type, select SAML Application. Then click Configure.
On the SAML Configuration page, under "Provide Application Metadata", select Manually Enter.
Enter the ACS URLs:
https://<portal_hostname>:<PORT>/saml/SSO
Enter the Entity ID:
privacera-portal
Click the Save button.
On the Overview page for the new application, click on the Attributes edit button. Add the attribute mapping:
user.login: Username
Set as Required.
Note
If the user's login ID is not the same as the username (for example, if the login ID is an email address), this attribute is used as the username in the portal. The username value is the email address with the domain (for example, @company.com) removed: for "john.joe@company.com", the username would be "john.joe". If another attribute can be used as the username, this value should hold that attribute.
You can optionally add additional attribute mappings:
user.email: Email Address
user.firstName: Given Name
user.lastName: Family Name
Click the Save button.
Next in your application, select Configuration and then the edit icon.
Set the SLO Endpoint:
https://<portal_hostname>:<PORT>/login.html
Click the Save button.
In the Configuration section, under Connection Details, click on Download Metadata button.
Once this file is downloaded, rename it to privacera-portal-aad-saml.xml. This file will be used in the Privacera Portal configuration.
Configuration steps in Privacera Portal
Now we will configure Privacera Portal using privacera-manager to use the privacera-portal-aad-saml.xml file created in the above steps.
Run the following commands:
cd ~/privacera/privacera-manager/ cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
Edit the vars.portal.saml.aad.yml file:
vi config/custom-vars/vars.portal.saml.aad.yml
Add the following properties:
SAML_ENTITY_ID: "privacera-portal"
SAML_BASE_URL: "https://{{app_hostname}}:{port}"
PORTAL_UI_SSO_ENABLE: "true"
PORTAL_UI_SSO_URL: "saml/login"
PORTAL_UI_SSO_BUTTON_LABEL: "Single Sign On"
AAD_SSO_ENABLE: "true"
Copy the privacera-portal-aad-saml.xml file to the following folder:
~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom
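For example, assuming the metadata file was saved to your home directory on the Privacera Manager host (adjust the source path to wherever you downloaded it), the copy could look like this:
cp ~/privacera-portal-aad-saml.xml ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom/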
Edit the vars.portal.yml file:
cd ~/privacera/privacera-manager/ vi config/custom-vars/vars.portal.yml
Add the following properties and assign your values.
SAML_EMAIL_ATTRIBUTE: "user.email"
SAML_USERNAME_ATTRIBUTE: "user.login"
SAML_LASTNAME_ATTRIBUTE: "user.lastName"
SAML_FIRSTNAME_ATTRIBUTE: "user.firstName"
Run the following commands to update privacera-manager:
cd ~/privacera/privacera-manager/
./privacera-manager.sh update
You should now be able to use Single Sign-on to Privacera using PingFederate.
Trino Open Source
Privacera Plugin in Trino Open Source
Learn how you can use Privacera Manager to generate the setup script and Trino custom configuration for SSL to install Privacera Plugin in an open-source Trino environment.
Privacera Trino supports Trino Open Source with the following catalogs:
Hive
PostgreSQL DB
Redshift
Prerequisites
A working Trino environment
Privacera services must be up and running.
Configuration
SSH to the instance as USER.
Run the following commands:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.trino.opensource.yml config/custom-vars/
vi config/custom-vars/vars.trino.opensource.yml
Edit the following properties. For property details and descriptions, refer to the Table Properties for Trino Open Source below.
TRINO_STANDALONE_ENABLE: "true"
TRINO_USER_HOME: "<PLEASE_CHANGE>"
TRINO_INSTALL_DIR_NAME: "<PLEASE_CHANGE>"
Run the following commands:
cd ~/privacera/privacera-manager ./privacera-manager.sh update
After the update is complete, the setup script (privacera_trino_setup.sh) and the Trino custom configuration for SSL (privacera_trino_plugin_conf.zip) will be generated at ~/privacera/privacera-manager/output/trino-opensource/.
In your Trino environment, do the following:
Copy privacera_trino_setup.sh and privacera_trino_plugin_conf.zip. Both the files should be placed under the same folder.
Add permissions to execute the script.
chmod +x privacera_trino_setup.sh
Run the script to install the Privacera plugin in your Trino environment.
./privacera_trino_setup.sh
Note
To learn more about Trino, see Trino User Guide.
Table Properties for Trino Open Source
Property | Description | Example |
---|---|---|
TRINO_OPENSOURCE_ENABLE | Property to enable/disable Trino. | true |
TRINO_USER_HOME | Property to set the path to the Trino home directory. | /home/ec2-user |
TRINO_INSTALL_DIR_NAME | Property to set the path to the directory where Trino is installed. | /etc/trino |
TRINO_RANGER_SERVICE_REPO | Property to indicate the Trino Ranger policy repository. | privacera_trino |
TRINO_AUDITS_URL_EXTERNAL | Solr audit URL or audit server URL. | |
TRINO_RANGER_EXTERNAL_URL | Ranger Admin URL. | |
XAAUDIT.SOLR.ENABLE | Enable/disable Solr audits. Set the value to true to enable them; set it to false to disable them. | true |
TRINO_HIVE_POLICY_AUTHZ_ENABLED | Enable/disable Hive policy authorization for the Hive catalog. Set the value to true to enable it; set it to false to disable it. | true |
TRINO_HIVE_POLICY_REPO_CATALOG_MAPPING | Indicates Hive policy repository and Hive catalog mapping. Use the following format: {hive_policy_repo-1}:{comma_separated_hive_catalogs};{hive_policy_repo-2}:{comma_separated_hive_catalogs} | privacera_hive:hive;privacera_hive:hivecatalog1 |
TRINO_RANGER_AUTH_ENABLED | Enable/disable Ranger authorization in Trino. Set the value to true to enable it; set it to false to disable it. | true |
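Putting the required properties together, a minimal vars.trino.opensource.yml could look like the following sketch; the values are examples drawn from the configuration step and the table above and must be adjusted to your installation:
TRINO_STANDALONE_ENABLE: "true"
TRINO_USER_HOME: "/home/ec2-user"
TRINO_INSTALL_DIR_NAME: "/etc/trino"
TRINO_RANGER_SERVICE_REPO: "privacera_trino"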
Migrating from PrestoSQL to Trino
To migrate your existing policies from PrestoSQL to Trino, see Migrating Steps.
Dremio
Introduction
This section covers how you can integrate Dremio with Privacera. You can use Dremio for table-level access control with the native Ranger plugin.
By integrating Dremio with Privacera, you'll be provided with comprehensive data lake security and fine-grained access control across multi-cloud environments. Dremio works directly with data lake storage. Using Dremio's query engine and ability to democratize data access, Privacera implements fine-grained access control policies, then automatically enforces and audits them at enterprise scale.
Dremio is supported with the following data sources:
S3
ADLS
Hive
Redshift
Prerequisites
Ensure the following prerequisites are met:
A Privacera Manager host where Privacera services are running.
A Dremio host where Dremio Enterprise Edition is installed. (The Community Edition is not supported.)
Configuration
To configure Dremio:
Note
There are limitations in the Dremio native Hive plugin because Dremio uses Ranger 1.1.0.
Audit Server basic auth needs to be disabled because it's not supported.
Dremio does not support solr audits in SSL if it is enabled in the audit server.
Run the following commands:
cd ~/privacera/privacera-manager cp config/sample-vars/vars.dremio.yml config/custom-vars/
Update the following properties:
AUDITSERVER_ENABLE: "true"
AUDITSERVER_AUTH_TYPE: "none"
AUDITSERVER_SSL_ENABLE: "false"
Run the following commands to configure the audit server for Dremio native Hive Ranger-based authorization.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.auditserver.yml config/custom-vars/
vi config/custom-vars/vars.auditserver.yml
After the update is completed, the Dremio plugin installation script privacera_dremio.sh and the custom configuration archive privacera_custom_conf.tar.gz are generated at ~/privacera/privacera-manager/output/dremio.
Configure the Privacera plugin depending on how you have installed Dremio on your instance.
For a new or existing data source configured in Dremio Data Lake, ensure Enable external authorization plugin checkbox under Settings > Advanced Options of the data source is selected in the Dremio UI.
Restart the Dremio service.
Kubernetes
Depending on your cloud provider, you can set up Dremio in a Kubernetes container. For more information, see the following links.
After setting up Dremio, perform the following steps to deploy Privacera plugin. The steps assume that your Privacera Manager host instance is separate from your Dremio Kubernetes instance. If they are configured on the single instance, then modify the steps accordingly.
SSH to the instance where Dremio is installed and that contains the Dremio Kubernetes artifacts, and change to the dremio-cloud-tools/charts/dremio_v2/ directory.
Copy the privacera_dremio.sh and privacera_custom_conf.tar.gz files from your Privacera Manager host instance to the dremio_v2 folder in your Dremio Kubernetes instance.
Run the following commands:
mkdir -p privacera_config
mv privacera_dremio.sh privacera_config/
mv privacera_custom_conf.tar.gz privacera_config/
Update configmap.yml to add a new ConfigMap for the Privacera configuration.
vi templates/dremio-configmap.yaml
Add the following configuration at the start of the file:
apiVersion: v1
kind: ConfigMap
metadata:
  name: dremio-privacera-install
data:
  privacera_dremio.sh: |-
{{ .Files.Get "privacera_config/privacera_dremio.sh" | nindent 4 }}
binaryData:
  privacera_custom_conf.tar.gz: {{ .Files.Get "privacera_config/privacera_custom_conf.tar.gz" | b64enc | nindent 4 }}
---
Update dremio-env to add the Privacera JARs and configuration to the Dremio classpath.
vi config/dremio-env
Add the following variable, or update it if it already exists:
DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
Update values.yaml.
vi values.yaml
Add the following configuration for extraInitContainers inside the coordinator section:
extraInitContainers: |
  - name: install-privacera-dremio-plugin
    image: {{.Values.image}}:{{.Values.imageTag}}
    imagePullPolicy: IfNotPresent
    securityContext:
      runAsUser: 0
    volumeMounts:
      - name: dremio-privacera-plugin-volume
        mountPath: /opt/dremio/plugins/authorizer
      - name: dremio-ext-jars-volume
        mountPath: /opt/privacera/dremio-ext-jars
      - name: dremio-privacera-config
        mountPath: /opt/privacera/conf/
      - name: dremio-privacera-install
        mountPath: /opt/privacera/install/
    command:
      - "bash"
      - "-c"
      - "cd /opt/privacera/install/ && cp * /tmp/ && cd /tmp && ./privacera_dremio.sh"
Update or uncomment the extraVolumes section inside the coordinator section and add the following configuration:
extraVolumes:
  - name: dremio-privacera-install
    configMap:
      name: dremio-privacera-install
      defaultMode: 0777
  - name: dremio-privacera-plugin-volume
    emptyDir: {}
  - name: dremio-ext-jars-volume
    emptyDir: {}
  - name: dremio-privacera-config
    emptyDir: {}
Update or uncomment the extraVolumeMounts section inside the coordinator section and add the following configuration:
extraVolumeMounts:
  - name: dremio-ext-jars-volume
    mountPath: /opt/privacera/dremio-ext-jars
  - name: dremio-privacera-plugin-volume
    mountPath: /opt/dremio/plugins/authorizer
  - name: dremio-privacera-config
    mountPath: /opt/privacera/conf
Upgrade your Helm release. Get the release name by running the helm list command; the text under the Name column is your Helm release.
helm upgrade -f values.yaml <release-name>
RPM
To deploy RPM:
SSH to your instance where Dremio RPM is installed.
Copy the privacera_dremio.sh and privacera_custom_conf.tar.gz files from your Privacera Manager host instance to the home folder on your Dremio instance.
Run the following commands:
mkdir -p ~/privacera/install
mv privacera_dremio.sh ~/privacera/install
mv privacera_custom_conf.tar.gz ~/privacera/install
Launch the privacera_dremio.sh script.
cd ~/privacera/install
chmod +x privacera_dremio.sh
sudo ./privacera_dremio.sh
Update dremio-env to add the Privacera JARs and configuration to the Dremio classpath.
vi ${DREMIO_HOME}/conf/dremio-env
Add the following variable, or update it if it already exists:
DREMIO_EXTRA_CLASSPATH=/opt/privacera/conf:/opt/privacera/dremio-ext-jars/*
Restart Dremio.
sudo service dremio restart
AWS EMR
This topic shows how to configure AWS EMR with Privacera using Privacera Manager.
Configuration
SSH to the instance as USER.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.emr.yml config/custom-vars/
vi config/custom-vars/vars.emr.yml
Edit the following properties.
Property | Description | Example |
---|---|---|
EMR_ENABLE | Enable EMR template creation. | true |
EMR_CLUSTER_NAME | Define a unique name for the EMR cluster. | Privacera-EMR |
EMR_CREATE_SG | Set this to true if you don't have existing security groups and want Privacera Manager to take care of adding security group creation steps in the EMR CF template. | false |
EMR_MASTER_SG_ID | If EMR_CREATE_SG is false, set this property. Security Group ID for EMR Master Node Group. | sg-xxxxxxx |
EMR_SLAVE_SG_ID | If EMR_CREATE_SG is false, set this property. Security Group ID for EMR Slave Node Group. | sg-xxxxxxx |
EMR_SERVICE_ACCESS_SG_ID | If EMR_CREATE_SG is false, set this property. Security Group ID for EMR ServiceAccessSecurity. Fill this property only if you are creating EMR in a Private Network. | sg-xxxxxxx |
EMR_SG_VPC_ID | If EMR_CREATE_SG is true, set this property. VPC ID in which you want to create the EMR Cluster. | vpc-xxxxxxxxxxx |
EMR_MASTER_SG_NAME | If EMR_CREATE_SG is true, set this property. Security Group Name for EMR Master Node Group. The security group name will be added to the emr-template.json. | priv-master-sg |
EMR_SLAVE_SG_NAME | If EMR_CREATE_SG is true, set this property. Security Group Name for EMR Slave Node Group. The security group name will be added to the emr-template.json. | priv-slave-sg |
EMR_SERVICE_ACCESS_SG_NAME | If EMR_CREATE_SG is true, set this property. Security Group Name for EMR ServiceAccessSecurity. The security group name will be added to the emr-template.json. Fill this property only if you are creating EMR in a Private Network. | priv-private-sg |
EMR_SUBNET_ID | Subnet ID. | |
EMR_KEYPAIR | An existing EC2 key pair to SSH into the master node of the cluster. | privacera-test-pair |
EMR_EC2_MARKET_TYPE | Set market type as SPOT or ON_DEMAND. | SPOT |
EMR_EC2_INSTANCE_TYPE | Set the instance type. Instances can be of different types such as m5.xlarge, r5.xlarge and so on. | m5.large |
EMR_MASTER_NODE_COUNT | Node count for Master. The number of nodes can be 1, 2 and so on. | 1 |
EMR_CORE_NODE_COUNT | Node count for Core. The number of cores can be 1, 2 and so on. | 1 |
EMR_VERSION | Version of EMR. | emr-x.xx.x |
EMR_EC2_DOMAIN | Domain used by the nodes. It depends on the EMR Region, for example, ".ec2.internal" is for us-east-1. | .ec2.internal |
EMR_USE_STS_REGIONAL_ENDPOINTS | Set the property to enable/disable regional endpoints for S3 requests. Default value is false. | true |
EMR_TERMINATION_PROTECT | Set to enable/disable termination protection. | true |
EMR_LOGS_PATH | S3 location for storing EMR logs. | s3://privacera-logs-bucket/ |
EMR_KERBEROS_ENABLE | Set to true if you want to enable kerberization on EMR. | false |
EMR_KDC_ADMIN_PASSWORD | If EMR_KERBEROS_ENABLE is true, set this property. The password used within the cluster for the kadmin service. | |
EMR_CROSS_REALM_PASSWORD | If EMR_KERBEROS_ENABLE is true, set this property. The cross-realm trust principal password, which must be identical across realms. | |
EMR_SECURITY_CONFIG | Name of the Security Configurations created for EMR. This can be a pre-created configuration, or Privacera Manager can generate a template through which you can create this configuration. | |
EMR_KERB_TICKET_LIFETIME | Set this property if you want Privacera Manager to create the CF template for creating the security configuration and EMR_KERBEROS_ENABLE is true. The period for which a Kerberos ticket issued by the cluster's KDC is valid. Cluster applications and services auto-renew tickets after they expire. | EMR_KERB_TICKET_LIFETIME: 24 |
EMR_KERB_REALM | Set this property if you want Privacera Manager to create the CF template for creating the security configuration and EMR_KERBEROS_ENABLE is true. The Kerberos realm name for the other realm in the trust relationship. | |
EMR_KERB_DOMAIN | Set this property if you want Privacera Manager to create the CF template for creating the security configuration and EMR_KERBEROS_ENABLE is true. The domain name of the other realm in the trust relationship. | |
EMR_KERB_ADMIN_SERVER | Set this property if you want Privacera Manager to create the CF template for creating the security configuration and EMR_KERBEROS_ENABLE is true. The fully qualified domain name (FQDN) and an optional port for the Kerberos admin server in the other realm. If a port is not specified, 749 is used. | |
EMR_KERB_KDC_SERVER | Set this property if you want Privacera Manager to create the CF template for creating the security configuration and EMR_KERBEROS_ENABLE is true. The fully qualified domain name (FQDN) and an optional port for the KDC in the other realm. If a port is not specified, 88 is used. | |
EMR_AWS_ACCT_ID | AWS Account ID where the EMR Cluster resides. | 9999999 |
EMR_DEFAULT_ROLE | Default role attached to the EMR Cluster for performing cluster-related activities. This should be a pre-created role. | EMR_DefaultRole |
EMR_ROLE_FOR_CLUSTER_NODES | The IAM Role that will be attached to each node in the EMR Cluster. This should have only minimal permissions for downloading the privacera_cust_conf.zip and basic EMR capabilities. It can be an existing role; if not, you can use the IAM role CF template to generate it after the Privacera Manager update. | restricted_node_role |
EMR_USE_SINGLE_ROLE_FOR_APPS | If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. Create a single IAM Role that will be used by all EMR applications. | true |
EMR_ROLE_FOR_APPS | If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by all EMR apps. | app_data_access_role |
EMR_ROLE_FOR_SPARK | If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. Create multiple IAM Roles to be used by specific applications. Set EMR_USE_SINGLE_ROLE_FOR_APPS to false. IAM Role name which will be used by the Spark application (Dataserver) for data access. | spark_data_access_role |
EMR_ROLE_FOR_HIVE | If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by the Hive application for data access. | hive_data_access_role |
EMR_ROLE_FOR_PRESTO | If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by the Presto application for data access. | presto_data_access_role |
EMR_HIVE_METASTORE | Metastore type, e.g. "glue", "hive" (for external hive-metastore). | glue |
EMR_HIVE_METASTORE_PATH | S3 location for the hive metastore. | s3://hive-warehouse |
EMR_HIVE_METASTORE_CONNECTION_URL | If EMR_HIVE_METASTORE is hive, set this property. JDBC Connection URL for connecting to hive. | jdbc:mysql://<jdbc-host>:3306/<hive-db-name>?createDatabaseIfNotExist=true |
EMR_HIVE_METASTORE_CONNECTION_DRIVER | If EMR_HIVE_METASTORE is hive, set this property. JDBC Driver Name. | org.mariadb.jdbc.Driver |
EMR_HIVE_METASTORE_CONNECTION_USERNAME | If EMR_HIVE_METASTORE is hive, set this property. JDBC UserName. | hive |
EMR_HIVE_METASTORE_CONNECTION_PASSWORD | If EMR_HIVE_METASTORE is hive, set this property. JDBC Password. | StRong@PassW0rd |
EMR_HIVE_SERVICE_NAME | Custom hive service name for the hive application in EMR. | teamA_policy |
EMR_TRINO_HIVE_SERVICE_NAME | Custom hive service name for the trino application in EMR. | teamB_policy |
EMR_SPARK_HIVE_SERVICE_NAME | Custom hive access service name for spark applications in EMR. | teamC_policy |
EMR_APP_SPARK_OLAC_ENABLE | To install the Spark application with the Privacera plugin, set the property to true. OLAC is known as Object Level Access Control. Note: Recommended when complete access control on the objects in AWS S3 is required. When the property is set to true, s3 and s3n protocols will not be supported on EMR clusters while running Spark queries. | true |
EMR_APP_SPARK_FGAC_ENABLE | To install the Spark application with the Privacera plugin, set the property to true. FGAC is known as Fine Grained Access Control for Table and Column. Note: Recommended for compliance purposes, since the whole cluster will still have direct access to AWS S3 data. | false |
EMR_APP_PRESTO_DB_ENABLE | To install the PrestoDB application with the Privacera plugin, set the property to true. PrestoDB and Trino are mutually exclusive. Only one should be enabled at a time. | false |
EMR_APP_PRESTO_SQL_ENABLE | To install the Trino application with the Privacera plugin, set the property to true. PrestoDB and Trino are mutually exclusive. Only one should be enabled at a time. Note: Trino is supported for EMR versions 6.1.0 and higher. Note: If the EMR version is 6.4.0, setting this flag installs the Trino plugin. | false |
EMR_APP_HIVE_ENABLE | To install the Hive application with the Privacera plugin, set the property to true. | true |
EMR_APP_ZEPPELIN_ENABLE | To install the Zeppelin application, set the property to true. | true |
EMR_APP_LIVY_ENABLE | To install the Livy application, set the property to true. | true |
EMR_CUST_CONF_ZIP_PATH | A path where the privacera_cust_conf.zip file will be placed should be added. Privacera Manager will generate a privacera_cust_conf.zip under the ~/privacera/privacera-manager/output/emr folder. This privacera_cust_conf.zip needs to be placed at an S3 or any HTTPS location from which the EMR cluster can download it. | s3://privacera-artifacts/ |
EMR_SPARK_ENABLE_VIEW_LEVEL_ACCESS_CONTROL | Set the property to true to enable view-level column masking and row filter for SparkSQL. The property can be used only when you set EMR_APP_SPARK_FGAC_ENABLE to true. To learn how to use view-level access control in Spark, click here. | false |
EMR_RANGER_IS_FALLBACK_SUPPORTED | Use the property to enable/disable the fallback behavior to the privacera_files and privacera_hive services. It determines whether the user should be allowed or denied access to the resource files. To enable the fallback, set to true; to disable, set to false. | true |
EMR_SPARK_DELTA_LAKE_ENABLE | Set this property to true to enable Delta Lake on EMR Spark. | true |
EMR_SPARK_DELTA_LAKE_CORE_JAR_DOWNLOAD_URL | Download URL of the Delta Lake core JAR. The Delta Lake core JAR has a dependency on the Spark version. You have to find the appropriate version for your EMR. See Delta Lake compatibility with Apache Spark. Get the appropriate Delta Lake core JAR download link and update the property. See Delta Core. For example, for Spark version 3.1.x, the download URL is https://repo1.maven.org/maven2/io/delta/delta-core_2.12/1.0.1/delta-core_2.12-1.0.1.jar. | https://repo1.maven.org/maven2/io/delta/delta-core_2.12/1.0.1/delta-core_2.12-1.0.1.jar |
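As an illustration only, a minimal vars.emr.yml for a non-Kerberized cluster with existing security groups and the Spark OLAC and Hive plugins enabled might look like the following sketch; every value is an example drawn from the table above (the subnet ID is a placeholder) and must be replaced with your own:
EMR_ENABLE: "true"
EMR_CLUSTER_NAME: "Privacera-EMR"
EMR_CREATE_SG: "false"
EMR_MASTER_SG_ID: "sg-xxxxxxx"
EMR_SLAVE_SG_ID: "sg-xxxxxxx"
EMR_SUBNET_ID: "subnet-xxxxxxx"
EMR_KEYPAIR: "privacera-test-pair"
EMR_EC2_INSTANCE_TYPE: "m5.large"
EMR_VERSION: "emr-x.xx.x"
EMR_HIVE_METASTORE: "glue"
EMR_APP_SPARK_OLAC_ENABLE: "true"
EMR_APP_HIVE_ENABLE: "true"
EMR_CUST_CONF_ZIP_PATH: "s3://privacera-artifacts/"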
Run the following commands.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
After the update is finished, all the CloudFormation JSON template files and privacera_cust_conf.zip will be available at ~/privacera/privacera-manager/output/emr.
Configure and run the following on the AWS instance where Privacera is installed.
(Optional) Create IAM roles using the emr-roles-creation-template.json template. Run the following command.
aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-role-creation --template-body file://emr-roles-creation-template.json --capabilities CAPABILITY_NAMED_IAM
Note
This will create IAM roles with minimal permissions. You can add bucket permissions into respective IAM roles as per your requirements.
(Optional) Create Security Configurations using the emr-security-config-template.json template. Run the following command.
aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-security-config-creation --template-body file://emr-security-config-template.json
Confirm the privacera_cust_conf.zip file has been copied to the location specified in EMR_CUST_CONF_ZIP_PATH.
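For example, assuming the default Privacera Manager output location and the illustrative bucket shown for EMR_CUST_CONF_ZIP_PATH above, the copy could be done with the AWS CLI:
aws s3 cp ~/privacera/privacera-manager/output/emr/privacera_cust_conf.zip s3://privacera-artifacts/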
Create EMR using the emr-template.json template. Run the following command.
aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-creation --template-body file://emr-template.json
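Stack creation runs asynchronously. If you want to watch progress from the same shell, a standard CloudFormation status query such as the following (using the stack name from the command above) should work:
aws --region <AWS-REGION> cloudformation describe-stacks --stack-name privacera-emr-creation --query "Stacks[0].StackStatus" --output text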
Note
If you are upgrading EMR to version 6.4 or higher from EMR version 6.3 or lower in order to use the Trino plugin, you must re-create the EMR security configuration based on the new template generated via Privacera Manager, since the security configuration now has the trino user added.
Note
For PrestoDB, secrets encryption of the Solr authentication password is not supported. However, the properties file where the password resides is accessible only to the presto service user, which limits the exposure.
If your cluster was running while External Hive Metastore was down, and you are unable to connect to it, restart the following three servers:
sudo systemctl restart hive-hcatalog-server
sudo systemctl restart hive-server2
sudo systemctl restart presto-server
AWS EMR with Native Apache Ranger
AWS EMR provides native Apache Ranger integration with the open source Apache Ranger plugins for Apache Spark and Hive. Connecting EMR’s native Ranger integration with Privacera’s Ranger-based data access governance provides the following key advantages:
Companies will have the ability to sync their existing policies with their EMR solution.
Extend Apache Ranger’s open source capabilities to take advantage of Privacera’s centralized enterprise-ready solution.
Note
Supported EMR version: 5.32 and above in EMR 5.x series.
Prerequisites
Two AWS secrets are required to store the Ranger admin and Ranger plugin certificates:
ranger-admin-pub-cert
ranger-plugin-private-keypair
To create the two secrets in AWS Secret Manager, do the following:
Login to AWS console and navigate to Secrets Manager and then click Store a new secret option.
Select secret type as Other type of secrets and then go to the Plaintext tab. Keep the Default value unchanged. The actual value for this secret will be obtained after the installation is done.
Select the encryption key as per your requirement.
Click Next.
Under Secret name, type a name for the secret in the text field. For example: ranger-admin-pub-cert, ranger-plugin-private-keypair.
Click Next. The Configure automatic rotation page is displayed.
Click Next.
On the Review page, you can check your secret settings and then click Store to save your changes.
The Secret is stored successfully.
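If you prefer the AWS CLI over the console, the same two placeholder secrets could be created as shown below. This is a sketch only; the dummy value is replaced with the real certificate contents after installation, as described later in this section.
aws secretsmanager create-secret --name ranger-admin-pub-cert --secret-string "placeholder"
aws secretsmanager create-secret --name ranger-plugin-private-keypair --secret-string "placeholder"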
Configuration
SSH to the instance as USER.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.emr.native.ranger.yml config/custom-vars/
vi config/custom-vars/vars.emr.native.ranger.yml
Edit the following properties.
Property
Description
Example
EMR_NATIVE_ENABLE
Property to enable EMR native Ranger integration.
EMR_NATIVE_ENABLE: "true"
Properties for EMR Specifications
EMR_NATIVE_CLUSTER_NAME
Name of the EMR Cluster.
EMR_NATIVE_CLUSTER_NAME: "Privacera-EMR-Native-Ranger"
EMR_NATIVE_AWS_REGION
AWS Region where the cluster will reside.
EMR_NATIVE_AWS_REGION: "{{AWS_REGION}}"
EMR_NATIVE_AWS_ACCT_ID
AWS Account ID where the EMR Cluster and its resources will reside.
EMR_NATIVE_AWS_ACCT_ID: "587946681758"
EMR_NATIVE_SUBNET_ID
Subnet ID where the EMR Cluster nodes will reside.
EMR_NATIVE_SUBNET_ID: ""
EMR_NATIVE_KEYPAIR
An existing EC2 key pair used to SSH into the cluster nodes.
EMR_NATIVE_KEYPAIR: "privacera-test-pair"
EMR_NATIVE_EC2_MARKET_TYPE
Market Type for the EMR Cluster nodes. For example, SPOT or ON_DEMAND.
EMR_NATIVE_EC2_MARKET_TYPE: "SPOT"
EMR_NATIVE_EC2_INSTANCE_TYPE
Instance Type for the EMR Cluster nodes.
EMR_NATIVE_EC2_INSTANCE_TYPE: "m5.2xlarge"
EMR_NATIVE_MASTER_NODE_COUNT
Node count for Master.
EMR_NATIVE_MASTER_NODE_COUNT: "1"
EMR_NATIVE_CORE_NODE_COUNT
Node count for Core.
EMR_NATIVE_CORE_NODE_COUNT: "1"
EMR_NATIVE_VERSION
EMR native Ranger integration is supported from EMR 5.32 and above.
EMR_NATIVE_VERSION: "emr-5.32.0"
EMR_NATIVE_TERMINATION_PROTECT
To enable termination protection.
EMR_NATIVE_TERMINATION_PROTECT: "true"
EMR_NATIVE_LOGS_PATH
S3 location for EMR logs storage.
EMR_NATIVE_LOGS_PATH: "s3://privacera-emr/logs"
Properties to configure EMR Security Group
EMR_NATIVE_CREATE_SG
Set this to true if you do not have existing security groups and want Privacera Manager to add the security group creation steps to the EMR CloudFormation template.
EMR_NATIVE_CREATE_SG: "false"
If EMR_NATIVE_CREATE_SG is false, fill the following properties with existing security group IDs:
EMR_NATIVE_MASTER_SG_ID
Security Group ID for EMR Master Node Group.
EMR_NATIVE_MASTER_SG_ID: "sg-xxxxxxx"
EMR_NATIVE_SLAVE_SG_ID
Security Group ID for EMR Slave Node Group.
EMR_NATIVE_SLAVE_SG_ID: "sg-xxxxxxx"
EMR_NATIVE_SERVICE_ACCESS_SG_ID
Security Group ID for EMR ServiceAccessSecurity. Fill this property only if you are creating EMR in a private network.
EMR_NATIVE_SERVICE_ACCESS_SG_ID: "sg-xxxxxxx"
If EMR_NATIVE_CREATE_SG is true, fill the following properties with the security group names for the new groups that will be added in emr-template.json:
EMR_NATIVE_SG_VPC_ID
VPC ID in which you want to create the EMR Cluster.
EMR_NATIVE_SG_VPC_ID: "vpc-xxxxxxxxxxx"
EMR_NATIVE_MASTER_SG_NAME
Security Group Name for EMR Master Node Group.
EMR_NATIVE_MASTER_SG_NAME: "priv-master-sg"
EMR_NATIVE_SLAVE_SG_NAME
Security Group Name for EMR Slave Node Group.
EMR_NATIVE_SLAVE_SG_NAME: "priv-slave-sg"
EMR_NATIVE_SERVICE_ACCESS_SG_NAME
Security Group Name for EMR ServiceAccessSecurity. Fill this property only if you are creating EMR in a private network.
EMR_NATIVE_SERVICE_ACCESS_SG_NAME: "priv-private-sg"
EMR_NATIVE_SECURITY_CONFIG
Name of the security configurations created for EMR. This can be an existing configuration, or Privacera Manager can generate a template through which new configurations can be created. The new template will be available at ~/privacera/privacera-manager/output/emr/emr-native-sec-config-template.json after you run the Privacera Manager update command.
EMR_NATIVE_SECURITY_CONFIG: ""
Properties for EMR Hive Metastore
EMR_NATIVE_HIVE_METASTORE
Metastore type. For example, internal or hive (for an external Hive metastore).
EMR_NATIVE_HIVE_METASTORE: "hive"
EMR_NATIVE_HIVE_METASTORE_WAREHOUSE_PATH
S3 location for Hive metastore warehouse
EMR_NATIVE_HIVE_METASTORE_WAREHOUSE_PATH: "s3://hive-warehouse"
Fill the following properties if EMR_NATIVE_HIVE_METASTORE is hive:
EMR_NATIVE_METASTORE_CONNECTION_URL
JDBC Connection URL for connecting to Hive Metastore.
EMR_NATIVE_METASTORE_CONNECTION_URL:
jdbc:mysql://<jdbc-host>:3306/<hive-db-name>?createDatabaseIfNotExist=true
EMR_NATIVE_METASTORE_CONNECTION_DRIVER
JDBC Driver Name
EMR_NATIVE_METASTORE_CONNECTION_DRIVER: "org.mariadb.jdbc.Driver"
EMR_NATIVE_METASTORE_CONNECTION_USERNAME
JDBC UserName
EMR_NATIVE_METASTORE_CONNECTION_USERNAME: "hive"
EMR_NATIVE_METASTORE_CONNECTION_PASSWORD
JDBC Password
EMR_NATIVE_METASTORE_CONNECTION_PASSWORD: "StRong@PassWord"
Properties of Kerberos Server
EMR_NATIVE_KDC_ADMIN_PASSWORD
The password used within the cluster for the kadmin service.
EMR_NATIVE_KDC_ADMIN_PASSWORD: ""
EMR_NATIVE_CROSS_REALM_PASSWORD
The cross-realm trust principal password, which must be identical across realms.
EMR_NATIVE_CROSS_REALM_PASSWORD: ""
EMR_NATIVE_KERB_TICKET_LIFETIME
The period for which a Kerberos ticket issued by the cluster’s KDC is valid. Cluster applications and services auto-renew tickets after they expire.
EMR_NATIVE_KERB_TICKET_LIFETIME: 24
EMR_NATIVE_KERB_REALM
The Kerberos realm name for the other realm in the trust relationship.
EMR_NATIVE_KERB_REALM: ""
EMR_NATIVE_KERB_DOMAIN
The domain name of the other realm in the trust relationship.
EMR_NATIVE_KERB_DOMAIN: ""
EMR_NATIVE_KERB_ADMIN_SERVER
The fully qualified domain name (FQDN) and optional port for the Kerberos admin server in the other realm. If a port is not specified, 749 is used.
EMR_NATIVE_KERB_ADMIN_SERVER: ""
EMR_NATIVE_KERB_KDC_SERVER
The fully qualified domain name (FQDN) and optional port for the KDC in the other realm. If a port is not specified, 88 is used.
EMR_NATIVE_KERB_KDC_SERVER: ""
Properties of Certificates Secrets
EMR_NATIVE_RANGER_PLUGIN_SECRET_ARN
Full ARN of AWS secret [stored in AWS Secrets Manager] for Ranger plugin key-pair. This is the secret created in the Prerequisites step above.
EMR_NATIVE_RANGER_PLUGIN_SECRET_ARN: "arn:aws:secretsmanager:us-east-1:99999999999:secret:ranger-plugin-key-pair-ixZbO2"
EMR_NATIVE_RANGER_ADMIN_SECRET_ARN
Full ARN of AWS secret [stored in AWS Secrets Manager] for Ranger admin public certificate. This is the secret created in the Prerequisites step above.
EMR_NATIVE_RANGER_ADMIN_SECRET_ARN: "arn:aws:secretsmanager:us-east-1:99999999999:secret:ranger-admin-public-cert-ixfCO5"
Properties of EMR application
EMR_NATIVE_APP_SPARK_ENABLE
Installs Spark application with EMR native Ranger plugin, if set to true.
EMR_NATIVE_APP_SPARK_ENABLE: "true"
EMR_NATIVE_APP_HIVE_ENABLE
Installs Hive application with EMR native Ranger plugin, if set to true.
EMR_NATIVE_APP_HIVE_ENABLE: "true"
EMR_NATIVE_APP_ZEPPELIN_ENABLE
Installs Zeppelin application, if set to true.
EMR_NATIVE_APP_ZEPPELIN_ENABLE: "true"
EMR_NATIVE_APP_LIVY_ENABLE
Installs Livy application, if set to true.
EMR_NATIVE_APP_LIVY_ENABLE: "true"
Properties of IAM Role Configuration
EMR_NATIVE_DEFAULT_ROLE
Default role attached to EMR cluster for performing cluster related activities. This should be an existing role.
EMR_NATIVE_DEFAULT_ROLE: "EMR_DefaultRole"
EMR_NATIVE_INSTANCE_ROLE
The IAM Role which will be attached to each node in the EMR Cluster. This should have only minimal permissions for basic EMR functionalities.
EMR_NATIVE_INSTANCE_ROLE: "restricted_instance_role"
EMR_NATIVE_DATA_ACCESS_ROLE
This role provides credentials for trusted execution engines, such as Apache Hive and the AWS EMR Record Server, to access AWS S3 data. Use this role only to access AWS S3 data, including any KMS keys if you are using S3 SSE-KMS.
EMR_NATIVE_DATA_ACCESS_ROLE: "emr_native_data_access_role"
EMR_NATIVE_USER_ACCESS_ROLE
This role provides users who are not trusted execution engines with credentials to interact with AWS services, if needed. Do not use this IAM role to allow access to AWS S3 data, unless it is data that should be accessible by all users.
EMR_NATIVE_USER_ACCESS_ROLE: "emr_native_user_access_role"
Properties to send EMR Ranger Engines Audits to Solr
EMR_NATIVE_ENABLE_SOLR_AUDITS
Enable audits to Solr.
EMR_NATIVE_ENABLE_SOLR_AUDITS: "true"
AUDITSERVER_AUTH_TYPE
The EMR native Ranger audit framework does not support basic authentication, so it needs to be disabled. If this property already exists in vars.auditserver.yml, change it there.
AUDITSERVER_AUTH_TYPE: "none"
AUDITSERVER_SSL_ENABLE
In case of self-signed SSL certificates, EMR native Ranger does not support SSL for Solr audits. Hence, AuditServer SSL should be disabled.
AUDITSERVER_SSL_ENABLE: "false"
EMR_NATIVE_CLOUDWATCH_GROUPNAME
Add a CloudWatch LogGroup to push Ranger Audits. This should be an existing Group.
EMR_NATIVE_CLOUDWATCH_GROUPNAME: "emr_privacera_native_logs"
Note
You can also add custom properties that are not included by default. See EMR.
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Once the update is done, all the CloudFormation JSON template files will be available at the ~/privacera/privacera-manager/output/emr-native-ranger path.
Run the following command in the AWS instance where Privacera is installed.
cd ~/privacera/privacera-manager/output/emr-native-ranger
Create the certificates that need to be added to AWS Secrets Manager.
You will get multiple prompts to enter the keystore password. Use the property value of RANGER_PLUGIN_SSL_KEYSTORE_PASSWORD set in ~/privacera/privacera-manager/config/custom-vars/vars.ssl.yml for each prompt. Run the following command.
./emr-native-create-certs.sh
This will create the following two files. You need to update the secrets that were created in the Prerequisites section above with the contents of these files:
ranger-admin-pub-cert.pem
ranger-plugin-keypair.pem
Display the contents of the ranger-admin-pub-cert.pem file.
cat ranger-admin-pub-cert.pem
Select the file contents and then right-click in the terminal to copy the contents.
Login to AWS console and navigate to Secrets Manager and then click ranger-admin-pub-cert.
Navigate to Secret value section and then go to Retrieve Secret Value > Edit > Plaintext.
Replace the secrets with the new value, which you copied in step 2.
Similarly, follow the steps b-e above to display the file contents of ranger-plugin-keypair.pem and use the contents to replace the value of the ranger-plugin-private-keypair secret in AWS Secrets Manager.
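If you prefer the AWS CLI for this step, the same update can be sketched as follows, using the secret names created in the Prerequisites section and running from the directory that contains the .pem files:
aws secretsmanager put-secret-value --secret-id ranger-admin-pub-cert --secret-string file://ranger-admin-pub-cert.pem
aws secretsmanager put-secret-value --secret-id ranger-plugin-private-keypair --secret-string file://ranger-plugin-keypair.pem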
(Optional) Create IAM roles using the emr-native-role-creation-template.json template.
aws --region <AWS_REGION> cloudformation create-stack --stack-name privacera-emr-native-role-creation --template-body file://emr-native-role-creation-template.json --capabilities CAPABILITY_NAMED_IAM
Note
To give the Apache Hive and Apache Spark services access to data, navigate to IAM Management in your AWS Console and add the required S3 policies to the EMR_NATIVE_DATA_ACCESS_ROLE.
(Optional) Create Security Configurations using the emr-native-sec-config-template.json template.
aws --region <AWS_REGION> cloudformation create-stack --stack-name privacera-emr-native-security-config-creation --template-body file://emr-native-sec-config-template.json
Create EMR using the emr-native-template.json template.
aws --region <AWS_REGION> cloudformation create-stack --stack-name privacera-emr-native-creation --template-body file://emr-native-template.json
GCP Dataproc
Privacera plugin in Dataproc
This section covers how you can use Privacera Manager to generate the setup script and Dataproc custom configuration to install Privacera Plugin in the GCP Dataproc environment.
Prerequisites
Ensure the following prerequisites are met:
A working Dataproc environment.
Privacera services must be up and running.
Configuration
SSH to the instance where Privacera is installed.
Run the following command:
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.dataproc.yml config/custom-vars/
vi config/custom-vars/vars.dataproc.yml
Edit the following properties:
Property
Description
Example
DATAPROC_ENABLE
Enable Dataproc template creation.
true
DATAPROC_MANAGE_INIT_SCRIPT
Set this property to upload the init script to GCP Cloud Storage.
If the value is set to true, then Privacera will upload the init script to the GCP bucket. If the value is set to false, then manually upload the init script to a GCP bucket.
false
DATAPROC_PRIVACERA_GS_BUCKET
Enter the GCP bucket name where the init script will be uploaded.
gs://privacera-bucket
DATAPROC_RANGER_IS_FALLBACK_SUPPORTED
Use the property to enable/disable the fallback behavior to the privacera_files and privacera_hive services. It determines whether access to the resource files should be allowed or denied for the user.
To enable the fallback, set to true; to disable, set to false.
true
Run the update.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
After the update is complete, the setup script setup_dataproc.sh and the Dataproc custom configuration privacera_cust_conf.zip will be generated at the path ~/privacera/privacera-manager/output/dataproc.
If DATAPROC_MANAGE_INIT_SCRIPT is set to false, then copy setup_dataproc.sh and privacera_cust_conf.zip. Both files should be placed under the same folder.
cd ~/privacera/privacera-manager/output/dataproc
GS_BUCKET=<PLEASE_CHANGE>
gsutil cp setup_dataproc.sh gs://${GS_BUCKET}/privacera/dataproc/init/
gsutil cp privacera_cust_conf.zip gs://${GS_BUCKET}/privacera/dataproc/init/
SSH to the instance where the master node of the Dataproc is installed. Then, enter the GCP bucket name and run the setup script.
sudo su -
mkdir -p /opt/privacera/downloads
cd /opt/privacera/downloads
GS_BUCKET=privacera-dev
gsutil cp gs://${GS_BUCKET}/privacera/dataproc/init/setup_dataproc.sh .
chmod +x setup_dataproc.sh
./setup_dataproc.sh
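The procedure above runs the setup script on the master node of an existing cluster. For clusters created after the script has been uploaded, the same script can usually be supplied as a Dataproc initialization action instead; this is an assumption rather than a documented Privacera requirement, and the cluster name, region, and bucket below are placeholders:
gcloud dataproc clusters create <cluster-name> \
  --region <region> \
  --initialization-actions gs://privacera-bucket/privacera/dataproc/init/setup_dataproc.sh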
Starburst Enterprise
Starburst Enterprise with Privacera
Using Privacera in Starburst Enterprise LTS, you can enforce system-wide access control. The following information provides an expedient way of configuring Starburst Enterprise on port 8443 for TLS/HTTPS so that username/password authentication is possible. Self-signed certificates work well for testing purposes, but should not be used for production deployments.
Prerequisites
The following items need to be enabled/shared prior to deploying a Starburst Docker image:
A licensed version of Starburst
Docker-ce 18+ must be installed
JDK 11 (to generate the Java keystore)
Privacera Manager version 4.7 or higher
JDBC URL to connect to the Starburst Enterprise instance to access the catalogs and schemas
CA-signed SSL certificate for production deployment.
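Because self-signed certificates are acceptable for testing, a Java keystore for the 8443 TLS listener could be generated with JDK 11's keytool. This is a sketch with placeholder alias, hostname, and password; adapt it to your own certificate strategy:
keytool -genkeypair -alias starburst -keyalg RSA -keysize 2048 -validity 365 \
  -dname "CN=<starburst-hostname>" -keystore starburst.jks -storepass <keystore-password>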
Configuring Privacera Plugin with Starburst Enterprise
Summary of steps:
Generate an access-control file for Starburst.
Generate an access-control file for Hive catalogs [optional].
Generate a Ranger Audit XML file.
Generate a Ranger SSL XML file required for TLS secure Privacera installations.
To configure Privacera plugin:
To enable Privacera for authorization, you need to update the etc/config.properties with one of the following entries:
# privacera auth for hive and system access control
access-control.config-files=/etc/starburst/access-control-privacera.properties,/etc/starburst/access-control-priv-hive.properties
Or
# privacera auth for only system access control
access-control.config-files=/etc/starburst/access-control-privacera.properties
Edit etc/access-control-privacera.properties. The following is an example of the properties. You need to configure the properties in the file so that it points to the instance where Privacera is installed. Replace <PRIVACERA_HOST_INSTANCE_IP> with the IP address of the Privacera host.
access-control.name=privacera-starburst
ranger.policy-rest-url=http://<PRIVACERA_HOST_INSTANCE_IP>:6080
ranger.service-name=privacera_starburstenterprise
ranger.username=admin
ranger.password=welcome1
ranger.policy-refresh-interval=3s
ranger.config-resources=/etc/starburst/ranger-hive-audit.xml
ranger.policy-cache-dir=/etc/starburst/tmp/ranger
To install this file into the Docker container, you can add this option to your container creation script:
-v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-privacera.properties:$STARBURST_TGT/access-control-privacera.properties \
Edit etc/access-control-priv-hive.properties. The following is an example of the properties. You need to configure the properties in the file so that it points to the instance where Privacera is installed. Replace <PRIVACERA_HOST_INSTANCE_IP> with the IP address of the Privacera host. Similarly, you need to configure the properties of the comma-separated files such as Hive, Glue, Delta, and so on. This file is optional if you are not configuring Hive catalogs with privacera_hive policies.
access-control.name=privacera
ranger.policy-rest-url=http://<PRIVACERA_HOST_INSTANCE_IP>:6080
ranger.service-name=privacera_hive
privacera.catalogs=hive,glue
ranger.username=admin
ranger.password=welcome1
ranger.policy-refresh-interval=3s
ranger.config-resources=/etc/starburst/ranger-hive-audit.xml
ranger.policy-cache-dir=/etc/starburst/tmp/ranger
privacera.fallback-access-control=allow-all
To install this file into the Docker container, you can add this option to your container creation script:
-v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-priv-hive.properties:$STARBURST_TGT/access-control-priv-hive.properties \
Edit etc/ranger-hive-audit.xml. This file describes the method of auditing the access from Starburst to Privacera Ranger and Solr. The example below is for unsecured Privacera Ranger deployments only. Replace <PRIVACERA_HOST_INSTANCE_IP> with the IP address of the Privacera host.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property><name>ranger.plugin.hive.service.name</name><value>privacera_hive</value></property>
  <property><name>ranger.plugin.hive.policy.pollIntervalMs</name><value>5000</value></property>
  <property><name>ranger.service.store.rest.url</name><value>http://<PRIVACERA_HOST_INSTANCE_IP>:6080</value></property>
  <property><name>ranger.plugin.hive.policy.rest.url</name><value>http://<PRIVACERA_HOST_INSTANCE_IP>:6080</value></property>
  <property><name>xasecure.audit.destination.solr</name><value>true</value></property>
  <property><name>xasecure.audit.destination.solr.batch.filespool.dir</name><value>/opt/presto/logs/audits/solr/</value></property>
  <property><name>xasecure.audit.destination.solr.urls</name><value>http://<PRIVACERA_HOST_INSTANCE_IP>:8983/solr/ranger_audits</value></property>
  <property><name>xasecure.audit.is.enabled</name><value>true</value></property>
</configuration>
To install this file into the Docker container, you can add this option to your container creation script:
-v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-hive-audit.xml:$STARBURST_TGT/ranger-hive-audit.xml \
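For context, the three -v options shown in this procedure typically appear together in a single docker run invocation. The following sketch reuses the DOCKER_HOME, STARBURST_VERSION, and STARBURST_TGT variables referenced above; the image name is a placeholder for your licensed Starburst Enterprise image:
docker run -d --name starburst-enterprise -p 8443:8443 \
  -v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-privacera.properties:$STARBURST_TGT/access-control-privacera.properties \
  -v $DOCKER_HOME/$STARBURST_VERSION/etc/access-control-priv-hive.properties:$STARBURST_TGT/access-control-priv-hive.properties \
  -v $DOCKER_HOME/$STARBURST_VERSION/etc/ranger-hive-audit.xml:$STARBURST_TGT/ranger-hive-audit.xml \
  <starburst-enterprise-image>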
Privacera services (Data Assets)
Privacera services
This topic covers how you can enable/disable Data Sets menu on Privacera Portal.
Data Sets allows you to create logical data assets from various data sources such as Snowflake, PostgreSQL, and so on, and share the data assets with users, groups, or roles. You can assign an owner to a data asset who has the privileges to control access to the data within the data asset.
CLI configuration
Run the following command.
cd privacera/privacera-manager/
cp config/sample-vars/vars.privacera-services.yml config/custom-vars/
vi config/custom-vars/vars.privacera-services.yml
Enable/Disable the property.
PRIVACERA_SERVICES_ENABLE: "true"
Run the following command.
cd privacera/privacera-manager/
./privacera-manager.sh update
Audit Fluentd
Prerequisites
Ensure the following prerequisites are met:
AuditServer must be up and running. For more information, refer to AuditServer.
If you're configuring Fluentd for an Azure environment and want to configure a User Managed Service Identity (MSI), assign the following two IAM roles on the Azure storage account (where the audits will be stored) to the managed identity; a CLI sketch follows the note below.
Owner or Contributor
Storage Blob Data Owner or Storage Blob Data Contributor
Note
If your Azure environment is Docker-based, then configure MSI on a virtual machine, whereas for a Kubernetes-based environment, configure MSI on a virtual machine scale set (VMSS).
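As a sketch of the role assignments using the Azure CLI (the principal ID, subscription, resource group, and storage account are placeholders; the same can be done in the Azure portal):
az role assignment create --assignee <managed-identity-principal-id> --role "Contributor" \
  --scope /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>
az role assignment create --assignee <managed-identity-principal-id> --role "Storage Blob Data Contributor" \
  --scope /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>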
This topic covers how you can store the audits from AuditServer locally or in cloud storage, for example, AWS S3, Azure Blob, or Azure ADLS Gen2. You can also send application logs to the same location as the audit logs.
Procedure
SSH to the instance where Privacera is installed.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.audit-fluentd.yml config/custom-vars/
vi config/custom-vars/vars.audit-fluentd.yml
Modify the properties below. For property details and description, refer to the Configuration Properties below.
You can also add custom properties that are not included by default. See Audit Fluentd.
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
| Set the audit destination where the audits will be saved. If the value is set to S3, the audits get stored in the AWS S3 server. For S3, the default time interval to publish the audits is 3600s (1hr). Local storage should be used only for development and testing purposes. All the audit received are stored in the same container/pod. Value: | s3 |
| Specifies whether application logs and PolicySync logs are sent to Fluentd. The default value is | |
When the destination is local | | |
| This is the time interval after which the audits will be pushed to the local destination. | 3600s |
When the destination is S3 | | |
| Set the bucket name, if you set the audit destination above to S3. Leave unchanged, if you set the audit destination to local. | bucket_1 |
| Set the bucket region, if you set the audit destination above to S3. Leave unchanged, if you set the audit destination to local. | us-east-1 |
| This is the time interval after which the audits will be pushed to the S3 destination. | 3600s |
| Set the access and secret key, if you set the audit destination above to S3. Leave unchanged, if you set the audit destination to local and are using AWS IAM Instance Role. | AUDIT_FLUENTD_S3_ACCESS_KEY: "AKIAIOSFODNN7EXAMPLE" AUDIT_FLUENTD_S3_SECRET_KEY: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" |
| Property to encrypt an S3 bucket. You can use the property if you have set the audit destination above to S3. You can assign one of the following values as the encryption type: NONE, SSE-S3, SSE-KMS, or SSE-C. SSE-S3 and SSE-KMS are encryptions managed by AWS; you need to enable server-side encryption for the S3 bucket. For more information on how to enable the SSE-S3 or SSE-KMS encryption types, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/default-bucket-encryption.html. SSE-C is the custom encryption type, where the encryption key and its MD5 hash have to be generated separately. | NONE |
| If you have set | |
| If you have set To get the MD5 hash for the encryption key, run the following command: echo -n "<generated-key>"| openssl dgst -md5 -binary | openssl enc -base64 | |
When the destination is Azure Blob or Azure ADLS | | |
| Set the storage account and the container, if you set the audit destination above to Azure Blob or Azure ADLS. To know how to get the ADLS properties, see Get ADLS properties. Leave unchanged, if you set the audit destination to local. Note: Currently, it supports Azure Blob storage only. | AUDIT_FLUENTD_AZURE_STORAGE_ACCOUNT: "storage_account_1" AUDIT_FLUENTD_AZURE_CONTAINER: "container_1" |
| This is the time interval after which the audits will be pushed to the Azure ADLS/Blob destination. | 3600s |
| Select an authentication type from the dropdown list. | |
| Configure this property, if you have selected Set the storage account key and the SAS token, if you set the audit destination above to Azure Blob. Leave unchanged, if you're using Azure's Managed Identity Service. | |
| Set the storage account key and the SAS token, if you set the audit destination above to Azure ADLS. Configure this property, if you have selected Leave unchanged, if you're using Azure's Managed Identity Service. | |
| Configure this property, if you have selected |
Grafana
How to configure Grafana with Privacera
Privacera allows you to use Grafana as a metrics and monitoring system. Grafana dashboards are pre-built in Privacera for services such as Dataserver, PolicySync, and Usersync to monitor the health of the services. Grafana uses the time-series data from the Privacera services and turns it into graphs and visualizations.
Grafana uses Graphite queries to pull the time-series data and creates charts and graphs based on this data.
Supported services
The following services are supported on Grafana:
Dataserver
PolicySync
Usersync
Configuration steps
To enable Grafana, run the following command. This will enable both Grafana and Graphite.
cd ~/privacera/privacera-manager/
cp config/sample-vars/vars.grafana.yml config/custom-vars/
Run the update.
cd ~/privacera/privacera-manager/
./privacera-manager.sh update
Note
After configuring Grafana, if the data does not appear on the dashboard, see Grafana service.
Ranger Tagsync
This topic shows how you can configure Ranger TagSync to synchronize the Ranger tag store with Atlas.
Configuration
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.ranger-tagsync.yml config/custom-vars/
vi config/custom-vars/vars.ranger-tagsync.yml
Edit the following properties.
Property
Description
Example
RANGER_TAGSYNC_ENABLE
Property to enable/disable the Ranger TagSync.
true
TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_BOOTSTRAP_SERVERS
Kafka bootstrap server where Atlas publishes the entities. Tagsync listens and pushes the mapping of Atlas entities and tags to Ranger.
kafka:9092
TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_ZOOKEEPER_CONNECT
Zookeeper URL for Kafka.
zoo-1:2181
TAGSYNC_ATLAS_CLUSTER_NAME
Atlas cluster name.
privacera
TAGSYNC_TAGSYNC_ATLAS_TO_RANGER_SERVICE_MAPPING
(Optional) To map from Atlas Hive cluster-name to Ranger service-name, the following format is used:
clusterName,componentType,serviceName;clusterName2,componentType2,serviceName2
Note: There are no spaces in the above format.
For Hive, the notifications from Atlas include the name of the entities in the following format:
dbName@clusterName dbName.tblName@clusterName dbName.tblName.colName@clusterName
Ranger Tagsync needs to derive the name of the Hive service (in Ranger) from the above entity names. By default, Ranger computes the Hive service name as: clusterName + "_hive".
If the name of the Hive service (in Ranger) is different in your environment, use following property to enable Ranger Tagsync to derive the correct Hive service name.
TAGSYNC_ATLAS_TO_RANGER_SERVICE_MAPPING = clusterName,hive,rangerServiceName
{{TAGSYNC_ATLAS_CLUSTER_NAME}},hive,privacera_hive;{{TAGSYNC_ATLAS_CLUSTER_NAME}},s3,privacera_s3
TAGSYNC_TAGSYNC_ATLAS_DEFAULT_CLUSTER_NAME
(Optional) Default cluster name configured for Atlas.
{{TAGSYNC_ATLAS_CLUSTER_NAME}}
TAGSYNC_TAG_SOURCE_ATLAS_KAFKA_ENTITIES_GROUP_ID
(Optional) Consumer Group Name to be used to consume Kafka events.
privacera_ranger_entities_consumer
Note
You can also add custom properties that are not included by default. See Ranger TagSync.
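As an illustration of the mapping format described above, the entry could be appended to the custom vars file as follows (run from ~/privacera/privacera-manager; this reuses the example value from the table, so adjust the cluster and service names to your environment):
cat >> config/custom-vars/vars.ranger-tagsync.yml <<'EOF'
TAGSYNC_TAGSYNC_ATLAS_TO_RANGER_SERVICE_MAPPING: "{{TAGSYNC_ATLAS_CLUSTER_NAME}},hive,privacera_hive;{{TAGSYNC_ATLAS_CLUSTER_NAME}},s3,privacera_s3"
EOF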
Run the following command.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Discovery
Discovery in Kubernetes
This section provides setup instructions for Privacera Discovery for a Kubernetes based deployment.
Prerequisites
Ensure the following prerequisite is met:
Privacera services must be deployed using Kubernetes.
Embedded Spark must be used.
CLI configuration
SSH to the instance where Privacera is installed.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.discovery.kubernetes.yml config/custom-vars/
vi config/custom-vars/vars.discovery.kubernetes.yml
Set value for the following. For property details and description, refer to the Configuration Properties below.
DISCOVERY_K8S_SPARK_MASTER: "${PLEASE_CHANGE}"
Configuration properties
To get the value of the variable, do the following:
Get the URL for the Kubernetes master by executing the kubectl cluster-info command.
Copy the Kubernetes control plane URL and paste it.
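For example, the control plane URL is printed on the first line of the command's output and can then be pasted into the property; the URL below is illustrative only:
kubectl cluster-info
# Kubernetes control plane is running at https://<api-server-host>:443
# then, in config/custom-vars/vars.discovery.kubernetes.yml:
# DISCOVERY_K8S_SPARK_MASTER: "https://<api-server-host>:443"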
Discovery on Databricks
Discovery on Databricks
This topic covers the installation of Privacera Discovery on Databricks.
Configuration
SSH to the instance as USER.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.discovery.databricks.yml config/custom-vars/
vi config/custom-vars/vars.discovery.databricks.yml
If the Databricks plugin is not enabled, add the following details in the custom-vars/vars.discovery.databricks.yml file. To configure the Databricks plugin, see Configuration in Databricks Spark Fine-Grained Access Control Plugin (FGAC) (Python, SQL).
DATABRICKS_HOST_URL: "<PLEASE_UPDATE>"
DATABRICKS_TOKEN: "<PLEASE_UPDATE>"
DATABRICKS_WORKSPACES_LIST:
  - alias: DEFAULT
    databricks_host_url: "{{DATABRICKS_HOST_URL}}"
    token: "{{DATABRICKS_TOKEN}}"
Edit the following properties. For property details and description, refer to the Configuration Properties below.
AWS
DATABRICKS_DRIVER_INSTANCE_TYPE: "m5.xlarge"
DATABRICKS_INSTANCE_TYPE: "m5.xlarge"
DATABRICKS_DISCOVERY_MANAGE_INIT_SCRIPT: "true"
DATABRICKS_DISCOVERY_SPARK_VERSION: "7.3.x-scala2.12"
DATABRICKS_DISCOVERY_INSTANCE_PROFILE: "arn:aws:iam::<ACCOUNT_ID>:instance-profile/<DATABRICKS_CLUSTER_IAM_ROLE>"
DISCOVERY_AWS_CLOUD_ASSUME_ROLE: "true"
DISCOVERY_AWS_CLOUD_ASSUME_ROLE_ARN: "arn:aws:iam::<ACCOUNT_ID>:role/<DISCOVERY_IAM_ROLE>"
Azure
DATABRICKS_DRIVER_INSTANCE_TYPE: "Standard_DS3_v2"
DATABRICKS_INSTANCE_TYPE: "Standard_DS3_v2"
DATABRICKS_DISCOVERY_MANAGE_INIT_SCRIPT: "true"
DATABRICKS_DISCOVERY_SPARK_VERSION: "7.3.x-scala2.12"
Note
PRIVACERA_DISCOVERY_DATABRICKS_DOWNLOAD_URL is no longer in use. The Discovery Databricks packages will be downloaded from PRIVACERA_BASE_DOWNLOAD_URL.
Configuration properties
Property | Description | Example |
---|---|---|
DATABRICKS_DRIVER_INSTANCE_TYPE | For AWS, the driver instance type can be "m5.xlarge" or "m5.2xlarge". For Azure, the driver instance type can be "Standard_DS3_v2". | m5.xlarge |
DATABRICKS_INSTANCE_TYPE | For AWS, the instance type can be "m5.xlarge" or "m5.2xlarge". For Azure, the instance type can be "Standard_DS3_v2". | m5.xlarge |
SETUP_DATABRICKS_JAR | ||
USE_DATABRICKS_SPARK | ||
DATABRICKS_ELASTIC_DISK | ||
DATABRICKS_DISCOVERY_MANAGE_INIT_SCRIPT | Set to true if you want to create databricks init script. | false |
DATABRICKS_DISCOVERY_WORKERS | ||
DATABRICKS_DISCOVERY_JOB_NAME | ||
DATABRICKS_DISCOVERY_SPARK_VERSION | Databricks Spark runtime version for the Discovery cluster. | 7.3.x-scala2.12 |
DATABRICKS_DISCOVERY_INSTANCE_PROFILE | Instance profile used as the instance role for the Databricks instance nodes where Discovery will be running. | arn:aws:iam::1234564835:instance-profile/privacera_databricks_cluster_iam_role |
DISCOVERY_AWS_CLOUD_ASSUME_ROLE | Property to grant Discovery access to AWS services to perform the scanning operation. | true |
DISCOVERY_AWS_CLOUD_ASSUME_ROLE_ARN | ARN of the AWS IAM Role | arn:aws:iam::12345671758:role/DiscoveryCrossAccAssumeRole_k |
Discovery in AWS
Discovery
This topic allows you to set up the AWS configuration for installing Privacera Discovery in a Docker and Kubernetes (EKS) environment.
IAM policies
To use the Privacera Discovery service, ensure the following IAM policies are attached to the Privacera_PM_Role role to access the AWS services.
Policy to create AWS resources: This policy is required only during installation or when Discovery is updated through Privacera Manager. It gives Privacera Manager permissions to create AWS resources like DynamoDB, Kinesis, SQS, and S3 using Terraform.
${AWS_REGION}: AWS region where the resources will get created.
{ "Version":"2012-10-17", "Statement":[ { "Sid":"CreateDynamodb", "Effect":"Allow", "Action":[ "dynamodb:CreateTable", "dynamodb:DescribeTable", "dynamodb:ListTables", "dynamodb:TagResource", "dynamodb:UntagResource", "dynamodb:UpdateTable", "dynamodb:UpdateTableReplicaAutoScaling", "dynamodb:UpdateTimeToLive", "dynamodb:DescribeTimeToLive", "dynamodb:ListTagsOfResource", "dynamodb:DescribeContinuousBackups" ], "Resource":"arn:aws:dynamodb:${AWS_REGION}:*:table/privacera*" }, { "Sid":"CreateKinesis", "Effect":"Allow", "Action":[ "kinesis:CreateStream", "kinesis:ListStreams", "kinesis:UpdateShardCount" ], "Resource":"arn:aws:kinesis:${AWS_REGION}:*:stream/privacera*" }, { "Sid":"CreateS3Bucket", "Effect":"Allow", "Action":[ "s3:CreateBucket", "s3:ListAllMyBuckets", "s3:GetBucketLocation" ], "Resource":[ "arn:aws:s3:::*" ] }, { "Sid":"CreateSQSMessages", "Effect":"Allow", "Action":[ "sqs:CreateQueue", "sqs:ListQueues" ], "Resource":[ "arn:aws:sqs:${AWS_REGION}:${ACCOUNNT_ID}:privacera*" ] } ] }
CLI configuration
SSH to the instance where Privacera is installed.
Configure your environment.
Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)
For a Docker environment, you can skip this step.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.discovery.aws.yml config/custom-vars/
vi config/custom-vars/vars.discovery.aws.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
DISCOVERY_BUCKET_NAME: "<PLEASE_CHANGE>"
To configure a bucket, add the property as follows, where bucket-1 is the name of the bucket:
DISCOVERY_BUCKET_NAME: "bucket-1"
To configure a bucket containing a folder, add the property as follows:
DISCOVERY_BUCKET_NAME: "bucket-1/folder1"
Uncomment/add the following variable to enable autoscaling of the executor pods:
DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_ENABLED: "true"
(Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.
For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
DISCOVERY_BUCKET_NAME | Set the bucket name where Discovery will store its metadata files | container1 |
[Properties of Topic and Table names](../pm-ig/customize_topic_and_tables_names.md) | Topic and Table names are assigned by default in Privacera Discovery. To customize any topic or table name, refer to the link. |
Enable realtime scan
An AWS SQS queue is required, if you want to enable realtime scan on the S3 bucket.
After running the PM update command, an SQS queue will be created for you automatically with the name, privacera_bucket_sqs_{{DEPLOYMENT_ENV_NAME}}
, where {{DEPLOYMENT_ENV_NAME}}
is the environment name you set in the vars.privacera.yml
file. This queue name will appear in the list of queues of your AWS SQS account.
If you have an SQS queue which you want to use, add the DISCOVERY_BUCKET_SQS_NAME
property in the vars.discovery.aws.yml
file and assign your SQS queue name.
If you want to enable realtime scan on the bucket, click here.
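For example, to point Discovery at an existing queue instead of the auto-created one, the property could be appended as follows (run from ~/privacera/privacera-manager; the queue name is a placeholder for your own SQS queue):
cat >> config/custom-vars/vars.discovery.aws.yml <<'EOF'
DISCOVERY_BUCKET_SQS_NAME: "<your-existing-sqs-queue-name>"
EOF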
Discovery in Azure
Azure Discovery
This topic allows you to set up the Azure configuration for installing Privacera Discovery.
Prerequisites
Ensure the following prerequisites are met:
Azure storage account
Create an Azure storage account. For more information, refer to Microsoft's documentation Create a storage account.
Create a private-access container. For more information, refer to Microsoft's documentation Create a container.
Get the access key. For more information, refer to Microsoft's documentation View account access keys .
Azure Cosmos DB account
Create an Azure Cosmos DB. For more information, refer to Microsoft's documentation Cosmos DB.
Get the URI from the Overview section.
Get the Primary Key from the Settings > Keys section.
Set the consistency to Strong in the Settings > Default Consistency section.
For Terraform
Assign permissions to create Azure resources using managed-identity. For more information, refer to Create Azure Resources .
CLI configuration
SSH to the instance where Privacera is installed.
Configure your environment.
Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)
For a Docker environment, you can skip this step.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.kafka.yml config/custom-vars
vi config/custom-vars/vars.kafka.yml
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.discovery.azure.yml config/custom-vars
vi config/custom-vars/vars.discovery.azure.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
DISCOVERY_FS_PREFIX: "<PLEASE_CHANGE>"
DISCOVERY_AZURE_STORAGE_ACCOUNT_NAME: "<PLEASE_CHANGE>"
DISCOVERY_COSMOSDB_URL: "<PLEASE_CHANGE>"
DISCOVERY_COSMOSDB_KEY: "<PLEASE_CHANGE>"
DISCOVERY_AZURE_STORAGE_ACCOUNT_KEY: "<PLEASE_CHANGE>"
CREATE_AZURE_RESOURCES: "false"
DISCOVERY_AZURE_RESOURCE_GROUP: "<PLEASE_CHANGE>"
DISCOVERY_AZURE_COSMOS_DB_ACCOUNT: "<PLEASE_CHANGE>"
DISCOVERY_AZURE_LOCATION: "<PLEASE_CHANGE>"
(Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.
For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.
To configure real-time scan for audits, refer to Pkafka.
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
DISCOVERY_ENABLE | In the **Basic** tab, enable/disable Privacera Discovery. | |
DISCOVERY_REALTIME_ENABLE | In the **Basic** tab, enable/disable real-time scan in Privacera Discovery. For real-time scan to work, ensure the following: | |
DISCOVERY_FS_PREFIX | Enter the container name. Get it from the Prerequisites section. | container1 |
DISCOVERY_AZURE_STORAGE_ACCOUNT_NAME | Enter the name of the Azure Storage account. Get it from the Prerequisites section. | azurestorage |
DISCOVERY_COSMOSDB_URL DISCOVERY_COSMOSDB_KEY | Enter the Cosmos DB URL and Primary Key. Get it from the Prerequisites section. | DISCOVERY_COSMOSDB_URL: "https://url1.documents.azure.com:443/" DISCOVERY_COSMOSDB_KEY: "xavosdocof" |
DISCOVERY_AZURE_STORAGE_ACCOUNT_KEY | Enter the Access Key of the storage account. Get it from the Prerequisites section. | GMi0xftgifp== |
[Properties of Topic and Table names](../pm-ig/customize_topic_and_tables_names.md) | Topic and Table names are assigned by default in Privacera Discovery. To customize any topic or table name, refer to the link. | |
PKAFKA_EVENT_HUB | In the **Advanced > Pkafka Configuration** section, enter the Event Hub name. Get it from the Prerequisites section. | eventhub1 |
PKAFKA_EVENT_HUB_NAMESPACE | In the **Advanced > Pkafka Configuration** section, enter the name of the Event Hub namespace. Get it from the Prerequisites section. | eventhubnamespace1 |
PKAFKA_EVENT_HUB_CONSUMER_GROUP | In the **Advanced > Pkafka Configuration** section, enter the name of the Consumer Group. Get it from the Prerequisites section. | congroup1 |
PKAFKA_EVENT_HUB_CONNECTION_STRING | In the **Advanced > Pkafka Configuration** section, enter the connection string. Get it from the Prerequisites section. | Endpoint=sb://eventhub1.servicebus.windows.net/; SharedAccessKeyName=RootManageSharedAccessKey; SharedAccessKey=sAmPLEP/8PytEsT= |
CREATE_AZURE_RESOURCES | For terraform usage, assign the value as true. Its default value is false. | true |
DISCOVERY_AZURE_RESOURCE_GROUP | Get the value from the Prerequisite section. | resource1 |
DISCOVERY_AZURE_COSMOS_DB_ACCOUNT | Get the value from the Prerequisite section. | database1 |
Discovery in GCP
Discovery
This topic allows you to set up the GCP configuration for installing Privacera Discovery in a Docker and Kubernetes environment.
Prerequisites
Ensure the following prerequisites are met:
Create a service account and add the following roles. For more information, refer to Creating a new service account.
Editor
Owner
Private Logs Viewer
Kubernetes Engine Admin (Required only for a Kubernetes environment)
Create a Bigtable instance and get the Bigtable Instance ID. For more information, refer to Creating a Cloud Bigtable instance.
CLI configuration
SSH to the instance where Privacera is installed.
Configure your environment.
Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)
For a Docker environment, you can skip this step.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.discovery.gcp.yml config/custom-vars/
vi config/custom-vars/vars.discovery.gcp.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
BIGTABLE_INSTANCE_ID: "<PLEASE_CHANGE>"
DISCOVERY_BUCKET_NAME: "<PLEASE_CHANGE>"
(Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.
For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.
For real-time scanning, run the following.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.pkafka.gcp.yml config/custom-vars/
Note
Recommended: Use Google Sink based approach to enable real-time scan of applications on different projects, click here.
Optional: Use Google Logging API based approach to enable real-time scan of applications on different projects, click here.
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
BIGTABLE_INSTANCE_ID | Get the value by navigating to **Navigation Menu->Databases->BigTable->Check the instance id column**. | BIGTABLE_INSTANCE_ID: "table_1" |
DISCOVERY_BUCKET_NAME | Give the name of the bucket where Discovery will store its metadata files. | DISCOVERY_BUCKET_NAME: "bucket_1" |
Pkafka
This topic allows you to enable Pkafka for real-time audits in Privacera Discovery.
Prerequisites
Ensure the following prerequisites are met:
Create an Event Hub namespace with a region similar to the region of a Storage Account you want to monitor. For more information, refer to Microsoft's documentation Create an Event Hubs namespace.
Create Event Hub in the Event Hub namespace. For more information, refer to Microsoft's documentation Create an event hub.
Create a consumer group in the Event Hub.
Azure Portal > Event Hubs namespace > Event Hub > Consumer Groups > +Consumer Group. The Consumer Groups tab will be under Entities of the Event Hub page.
Get the connection string of the Event Hubs namespace. For more information, refer to Microsoft's documentation Get connection string from the portal.
Create an Event Subscription for the Event Hubs namespace with the Event Type as Blob Created and Blob Deleted. For more information, refer to Microsoft's documentation Create an Event Grid subscription.
Note
When you create an event grid subscription, clear the checkbox Enable subject filtering.
CLI configuration
SSH to the instance where Privacera is installed.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.pkafka.azure.yml config/custom-vars/
vi config/custom-vars/vars.pkafka.azure.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
PKAFKA_EVENT_HUB: "<PLEASE_CHANGE>"
PKAFKA_EVENT_HUB_NAMESPACE: "<PLEASE_CHANGE>"
PKAFKA_EVENT_HUB_CONSUMER_GROUP: "<PLEASE_CHANGE>"
PKAFKA_EVENT_HUB_CONNECTION_STRING: "<PLEASE_CHANGE>"
DISCOVERY_REALTIME_ENABLE: "true"
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
| Enter the Event Hub name. Get it from the Prerequisites section above. | eventhub1 |
| Enter the name of the Event Hub namespace. Get it from the Prerequisites section above. | eventhubnamespace1 |
| Enter the name of the Consumer Group. Get it from the Prerequisites section above. | congroup1 |
| Enter the connection string. Get it from the Prerequisites section above. | Endpoint=sb://eventhub1.servicebus.windows.net/; SharedAccessKeyName=RootManageSharedAccessKey; SharedAccessKey=sAmPLEP/8PytEsT= |
| Add this property to enable/disable real-time scan. By default, it is set to false. Note: This is a custom property, and has to be added separately to the YAML file. For real-time scan to work, ensure the following: | true |
Portal SSO with PingFederate
Privacera portal leverages PingIdentity’s Platform Portal for authentication via SAML. For this integration, there are configuration steps in both Privacera portal and PingIdentity.
Configuration steps for PingIdentity
Sign in to your PingIdentity account.
Under Your Environments, click Administrators.
Select Connections from the left menu.
In the Applications section, click on the + button to add a new application.
Enter an Application Name (such as Privacera Portal SAML) and provide a description (optionally add an icon). For the Application Type, select SAML Application. Then click Configure.
On the SAML Configuration page, under "Provide Application Metadata", select Manually Enter.
Enter the ACS URLs:
https://<portal_hostname>:<PORT>/saml/SSO
Enter the Entity ID:
privacera-portal
Click the Save button.
On the Overview page for the new application, click on the Attributes edit button. Add the attribute mapping:
user.login: Username
Set as Required.
Note
If the user's login ID is not the same as the username, for example if the login ID is an email address, this attribute will be considered as the username in the portal. In that case, the username value would be the email with the domain name (for example, @company.com) removed; for "john.joe@company.com", the username would be "john.joe". If there is another attribute that can be used as the username, then this value will hold that attribute.
You can optionally add additional attribute mappings:
user.email: Email Address user.firstName: Given Name user.lastName: Family Name
Click the Save button.
Next in your application, select Configuration and then the edit icon.
Set the SLO Endpoint:
https://<portal_hostname>:<PORT>/login.html
Click the Save button.
In the Configuration section, under Connection Details, click on Download Metadata button.
Once this file is downloaded, rename it to:
privacera-portal-aad-saml.xml
This file will be used in the Privacera Portal configuration.
Configuration steps in Privacera Portal
Now we will configure Privacera Portal using privacera-manager to use the privacera-portal-aad-saml.xml file created in the above steps.
Run the following commands:
cd ~/privacera/privacera-manager/
cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
Edit the vars.portal.saml.aad.yml file:
vi config/custom-vars/vars.portal.saml.aad.yml
Add the following properties:
SAML_ENTITY_ID: "privacera-portal"
SAML_BASE_URL: "https://{{app_hostname}}:{port}"
PORTAL_UI_SSO_ENABLE: "true"
PORTAL_UI_SSO_URL: "saml/login"
PORTAL_UI_SSO_BUTTON_LABEL: "Single Sign On"
AAD_SSO_ENABLE: "true"
Copy the privacera-portal-aad-saml.xml file to the following folder:
~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom
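For example, assuming the metadata file was downloaded to the current directory:
cp privacera-portal-aad-saml.xml ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom/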
Edit the vars.portal.yml file:
cd ~/privacera/privacera-manager/
vi config/custom-vars/vars.portal.yml
Add the following properties and assign your values.
SAML_EMAIL_ATTRIBUTE: "user.email"
SAML_USERNAME_ATTRIBUTE: "user.login"
SAML_LASTNAME_ATTRIBUTE: "user.lastName"
SAML_FIRSTNAME_ATTRIBUTE: "user.firstName"
Run the following to update privacera-manager:
cd ~/privacera/privacera-manager/
./privacera-manager.sh update
You should now be able to use Single Sign-on to Privacera using PingFederate.
Encryption & Masking
Privacera Encryption Gateway (PEG) and Cryptography with Ranger KMS
This topic covers how you can set up and use Privacera Cryptography and Privacera Encryption Gateway (PEG) using Ranger KMS.
CLI configuration
SSH to the instance where Privacera is installed.
Create a 'crypto' configuration file, and set the value of the Ranger KMS Master Key Password.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.crypto.yml config/custom-vars/
vi config/custom-vars/vars.crypto.yml
Assign a password to the RANGER_KMS_MASTER_KEY_PASSWORD such as "Str0ngP@ssw0rd".
RANGER_KMS_MASTER_KEY_PASSWORD: "<PLEASE_CHANGE>"
Run the following command.
cp config/sample-vars/vars.peg.yml config/custom-vars/
(Optional) If you want to customize PEG configuration further, you can add custom PEG properties. For more information, refer to PEG Custom Properties.
For example, by default, the username and password for the PEG service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.
Run Privacera Manager to update the Privacera Platform configuration:
cd ~/privacera/privacera-manager
./privacera-manager.sh update
If this is a Kubernetes deployment, update all Privacera services:
./privacera-manager.sh update
AWS S3 bucket encryption
You can set up server-side encryption for AWS S3 bucket to encrypt the resources in the bucket. Supported encryption types are Amazon S3 (SSE-S3), AWS Key Management Service (SSE-KMS), and Customer-Provided Keys (SSE-C). Encryption key is mandatory for the encryption type SSE-C and optional for SSE-KMS. No encryption key is required for SSE-S3. For more information, see Protecting data using server-side encryption in the AWS documentation.
Configure bucket encryption in dataserver
SSH to the EC2 instance where Privacera Dataserver is installed.
Enable use of bucket encryption configuration in Privacera Dataserver.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.dataserver.aws.yml config/custom-vars/
vi config/custom-vars/vars.dataserver.aws.yml
Add the new property.
DATA_SERVER_AWS_S3_ENCRYPTION_ENABLE: "true"
DATA_SERVER_AWS_S3_ENCRYPTION_MAPPING:
  - "bucketA|<encryption-type>|<base64encodedssekey>"
  - "bucketB*,BucketC|<encryption-type>|<base64encodedssekey>"
Property
Description
DATA_SERVER_AWS_S3_ENCRYPTION_ENABLE
Property to enable or disable the AWS S3 bucket encryption support.
DATA_SERVER_AWS_S3_ENCRYPTION_MAPPING
Property to set the mapping of S3 buckets, encryption SSE type, and SSE key (base64 encoded). For example, "bucketC*,BucketD|SSE-KMS|<base64 encoded sse key>".
The base64-encoded encryption key should be set in the following cases: 1) the encryption type is set to SSE-KMS and customer managed CMKs are used for encryption; 2) the encryption type is set to SSE-C.
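For example, the third field of a mapping entry can be produced with the same openssl command used in the SSE-C section below; this sketch uses the sample key from that section and the illustrative bucket name bucketA:
# Base64-encode the SSE key and print a ready-to-use mapping entry
SSE_KEY_B64=$(echo -n "E1AC89EFB167B29ECC15FF75CC5C2C3A" | openssl enc -base64)
echo "bucketA|SSE-C|${SSE_KEY_B64}"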
Server-Side encryption with Amazon S3-Managed Keys (SSE-S3)
Supported S3 APIs for SSE-S3 Encryption:
PUT Object
PUT Object - Copy
POST Object
Initiate Multipart Upload
Bucket policy
{"Version":"2012-10-17","Id":"PutObjectPolicy","Statement":[{"Sid":"DenyIncorrectEncryptionHeader","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-s3-encrypted-bucket}}/*","Condition":{"StringNotEquals":{"s3:x-amz-server-side-encryption":"AES256"}}},{"Sid":"DenyUnencryptedObjectUploads","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-s3-encrypted-bucket}}/*","Condition":{"Null":{"s3:x-amz-server-side-encryption":"true"}}}]}
Upload a test file.
aws s3 cp myfile.txt s3://{{sse-s3-encrypted-bucket}}/
Server-Side encryption with CMKs stored in AWS Key Management Service (SSE-KMS)
Supported APIs for SSE-KMS Encryption:
PUT Object
PUT Object - Copy
POST Object
Initiate Multipart Upload
Your IAM role should have kms:Decrypt permission when you upload or download an Amazon S3 object encrypted with an AWS KMS CMK. This is in addition to the kms:ReEncrypt, kms:GenerateDataKey, and kms:DescribeKey permissions.
AWS Managed CMKs (SSE-KMS)
Bucket Policy
{"Version":"2012-10-17","Id":"PutObjectPolicy","Statement":[{"Sid":"DenyIncorrectEncryptionHeader","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"StringNotEquals":{"s3:x-amz-server-side-encryption":"aws:kms"}}},{"Sid":"DenyUnencryptedObjectUploads","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"Null":{"s3:x-amz-server-side-encryption":"true"}}}]}
Upload a test file.
aws s3 cp myfile.txt s3://{{sse-kms-encrypted-bucket}}/
Customer Managed CMKs (SSE-KMS)
Bucket Policy
{"Version":"2012-10-17","Id":"PutObjectPolicy","Statement":[{"Sid":"DenyIncorrectEncryptionHeader","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"StringNotEquals":{"s3:x-amz-server-side-encryption":"aws:kms"}}},{"Sid":"RequireKMSEncryption","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"StringNotLikeIfExists":{"s3:x-amz-server-side-encryption-aws-kms-key-id":"{{aws-kms-key}}"}}},{"Sid":"DenyUnencryptedObjectUploads","Effect":"Deny","Principal":"*","Action":"s3:PutObject","Resource":"arn:aws:s3:::{{sse-kms-encrypted-bucket}}/*","Condition":{"Null":{"s3:x-amz-server-side-encryption":"true"}}}]}
Upload a test file.
aws s3 cp privacera_aws.sh s3://{{sse-kms-encrypted-bucket}}/
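To confirm that the object was encrypted with your customer-managed CMK, you can inspect its metadata; a minimal sketch, assuming the same bucket and key as above:
# The response should include "ServerSideEncryption": "aws:kms" and the "SSEKMSKeyId" of the CMK.
aws s3api head-object --bucket {{sse-kms-encrypted-bucket}} --key privacera_aws.sh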
Server-Side encryption with Customer-Provided Keys (SSE-C)
Supported APIs for SSE-C Encryption:
PUT Object
PUT Object - Copy
POST Object
Initiate Multipart Upload
Upload Part
Upload Part - Copy
Complete Multipart Upload
Get Object
Head Object
Update the privacera_aws_config.json file with bucket and SSE-C encryption key.
Run AWS S3 upload.
aws s3 cp myfile.txt s3://{{sse-c-encrypted-bucket}}/
Run head-object.
aws s3api head-object --bucket {{sse-c-encrypted-bucket}} --key myfile.txt
Sample keys:
Key | Value |
---|---|
256-bit AES key | E1AC89EFB167B29ECC15FF75CC5C2C3A |
Base64-encoded encryption key (sseKey) | echo -n "E1AC89EFB167B29ECC15FF75CC5C2C3A" | openssl enc -base64 |
Base64-encoded 128-bit MD5 digest of the encryption key | echo -n "E1AC89EFB167B29ECC15FF75CC5C2C3A" | openssl dgst -md5 -binary | openssl enc -base64 |
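The sample values in the table can be reproduced with a short shell snippet; a minimal sketch, assuming the sample key above (substitute your own key material):
SSE_C_KEY="E1AC89EFB167B29ECC15FF75CC5C2C3A"
# Base64-encoded encryption key (sseKey).
echo -n "${SSE_C_KEY}" | openssl enc -base64
# Base64-encoded 128-bit MD5 digest of the encryption key.
echo -n "${SSE_C_KEY}" | openssl dgst -md5 -binary | openssl enc -base64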
Ranger KMS
Integrate with Azure key vault
This topic shows how to configure the Ranger Key Management Service (KMS) with Azure Key Vault to enable the use of data encryption. The master key for the encryption is created within the KMS and stored in Azure Key Vault. This section describes how to set up the connection from Ranger KMS to Azure Key Vault so that the master key is stored in the Azure Key Vault instead of the Ranger database.
Note: You can manually move the Ranger KMS master key from the Ranger database to the Azure Key Vault. For more information, refer to Migrate Ranger KMS Master Key.
Prerequisites
If the authentication is done without SSL enabled, get the Key Vault URL, ClientId and Client Secret by following the steps in this topic, Connect with a Client ID and Client Secret.
If the authentication is done with SSL enabled, get the Key Vault URL, ClientId and Certificate by following the steps in this topic, Connect with a Client ID and Certificate.
Configure Privacera Cryptography with Ranger KMS. For more information, refer to Privacera Cryptography with Ranger KMS.
CLI configuration
SSH to the instance where Privacera is installed.
Run the following commands.
cd ~/privacera/privacera-manager cp config/sample-vars/vars.crypto.azurekeyvault.yml config/custom-vars/ vi config/custom-vars/vars.crypto.azurekeyvault.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
AZURE_KEYVAULT_SSL_ENABLED: "<PLEASE_CHANGE>" AZURE_KEYVAULT_CLIENT_ID: "<PLEASE_CHANGE>" AZURE_KEYVAULT_CLIENT_SECRET: "<PLEASE_CHANGE>" AZURE_KEYVAULT_CERT_FILE: "<PLEASE_CHANGE>" AZURE_KEYVAULT_CERTIFICATE_PASSWORD: "<PLEASE_CHANGE>" AZURE_KEYVAULT_MASTERKEY_NAME: "<PLEASE_CHANGE>" AZURE_KEYVAULT_MASTER_KEY_TYPE: "<PLEASE_CHANGE>" AZURE_KEYVAULT_ZONE_KEY_ENCRYPTION_ALGO: "<PLEASE_CHANGE>" AZURE_KEYVAULT_URL: "<PLEASE_CHANGE>"
Run the following commands.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
AZURE_KEYVAULT_SSL_ENABLED | Activate Azure Key Vault. | true |
AZURE_KEYVAULT_CLIENT_ID | Get the ID by following the Prerequisites section above. | 50fd7ca6-xxxx-xxxx-a13f-1xxxxxxxx |
AZURE_KEYVAULT_CLIENT_SECRET | Get the client secret by following the Prerequisites section above. | <AzureKeyVaultPassword> |
AZURE_KEYVAULT_CERT_FILE | Get the file by following the Prerequisites section above. Ensure the file is copied to the config/ssl folder, and give it a name. | azure-key-vault.pem |
AZURE_KEYVAULT_CERTIFICATE_PASSWORD | Get the value by following the Prerequisites section above. | certPass |
AZURE_KEYVAULT_MASTERKEY_NAME | Enter the name of the master key. A key with this name will be created in Azure Key Vault. | RangerMasterKey |
AZURE_KEYVAULT_MASTER_KEY_TYPE | Enter the type of master key. Values: RSA, RSA_HSM, EC, EC_HSM, OCT | RSA |
AZURE_KEYVAULT_ZONE_KEY_ENCRYPTION_ALGO | Enter an encryption algorithm for the master key. Values: RSA_OAEP, RSA_OAEP_256, RSA1_5 | RSA_OAEP |
AZURE_KEYVAULT_URL | Get the URL by following the Prerequisites section above. | https://keyvault.vault.azure.net/ |
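If the client certificate you obtained from Azure is in PFX/PKCS#12 format, it can be converted to the PEM file expected by AZURE_KEYVAULT_CERT_FILE; a minimal sketch, assuming a hypothetical azure-key-vault.pfx export and that openssl is available on the Privacera Manager host:
# Convert the PFX export to PEM and place it where Privacera Manager expects it.
openssl pkcs12 -in azure-key-vault.pfx -nodes -out azure-key-vault.pem
cp azure-key-vault.pem ~/privacera/privacera-manager/config/ssl/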
AuthZ / AuthN
LDAP / LDAP-S for Privacera portal access
LDAP / LDAP-S for Privacera Portal access
This configuration sequence configures Privacera Portal to reference an external LDAP or LDAP-over-SSL directory for Privacera Portal user login authentication.
Prerequisites
Before starting these steps, prepare the following. You need to configure various Privacera properties with these values, as detailed in Configuration.
Determine the following LDAP values:
The FQDN and protocol (ldap or ldaps) of your LDAP server
Complete Bind DN
Bind DN password
Top-level search base
User search base
Group search base
Username attribute
DN attribute
To configure an SSL-enabled LDAP server, Privacera requires an SSL certificate. Set the Privacera property PORTAL_LDAP_SSL_ENABLED: "true", and then choose one of the following alternatives:
Allow Privacera Manager to download and create the certificate based on the LDAP server URL. Set the Privacera property PORTAL_LDAP_SSL_PM_GEN_TS: "true".
Manually configure a truststore on the Privacera server that contains the certificate of the LDAP server. Set the Privacera property PORTAL_LDAP_SSL_PM_GEN_TS: "false".
CLI configuration
SSH to the instance where Privacera is installed.
Run the commands below.
cd ~/privacera/privacera-manager cp config/sample-vars/vars.portal.ldaps.yml config/custom-vars/ vi config/custom-vars/vars.portal.ldaps.yml
Uncomment the properties and edit the configurations as required. For property details and description, refer to the Configuration Properties below.
PORTAL_LDAP_ENABLE: "true" PORTAL_LDAP_URL: "<PLEASE_CHANGE>" PORTAL_LDAP_BIND_DN: "<PLEASE_CHANGE>" PORTAL_LDAP_BIND_PASSWORD: "<PLEASE_CHANGE>" PORTAL_LDAP_SEARCH_BASE: "<PLEASE_CHANGE>" PORTAL_LDAP_USER_SEARCH_BASE: "<PLEASE_CHANGE>" PORTAL_LDAP_GROUP_SEARCH_BASE: "<PLEASE_CHANGE>" PORTAL_LDAP_USERNAME_ATTRIBUTE: "<PLEASE_CHANGE>" PORTAL_LDAP_DN_ATTRIBUTE: "<PLEASE_CHANGE>" PORTAL_LDAP_BIND_ANONYMOUSLY: "false" PORTAL_LDAP_SSL_ENABLED: "true" PORTAL_LDAP_SSL_PM_GEN_TS: "true"
Run Privacera Manager update.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
PORTAL_LDAP_URL | Add the value as "LDAP_HOST:LDAP_PORT". | xxx.example.com:983 |
PORTAL_LDAP_BIND_DN | Complete Bind DN. | CN=Bind User,OU=example,DC=ad,DC=example,DC=com |
PORTAL_LDAP_BIND_PASSWORD | Add the password for the LDAP Bind DN. | |
PORTAL_LDAP_SEARCH_BASE | Top-level search base. | ou=example,dc=ad,dc=example,dc=com |
PORTAL_LDAP_USER_SEARCH_BASE | User search base. | ou=example,dc=ad,dc=example,dc=com |
PORTAL_LDAP_GROUP_SEARCH_BASE | Group search base. | OU=example_services,OU=example,DC=ad,DC=example,DC=com |
PORTAL_LDAP_USERNAME_ATTRIBUTE | Username attribute. | sAMAccountName |
PORTAL_LDAP_DN_ATTRIBUTE | DN attribute. | dc |
PORTAL_LDAP_SSL_ENABLED | For SSL enabled LDAP server, set this value to true. | true |
PORTAL_LDAP_SSL_PM_GEN_TS | Set this to true if you want Privacera Manager to generate the truststore for your ldaps server. Set this to false if you want to manually provide the truststore certificate. To learn how to upload SSL certificates, [click here](../pm-ig/upload_custom_cert.md). | true |
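Before running the update, it can help to verify the bind DN, password, and search bases directly against the directory; a minimal sketch using ldapsearch, assuming the example values from the table above and a hypothetical user jdoe (you will be prompted for the bind password):
ldapsearch -H ldaps://xxx.example.com:983 \
  -D "CN=Bind User,OU=example,DC=ad,DC=example,DC=com" -W \
  -b "ou=example,dc=ad,dc=example,dc=com" \
  "(sAMAccountName=jdoe)"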
Map LDAP roles with the existing Privacera roles
You can associate LDAP user roles with Privacera roles using Privacera LDAP Role Mapping. This allows you to use Privacera Portal access control with LDAP user roles.
Log in to Privacera Portal using padmin user credentials or as a user with Privacera ROLE_SYSADMIN role.
Go to Settings > System Configurations.
Select Custom Properties checkbox.
Click on Add Property and enter the new property, auth.ldap.enabled=true.
Click Save.
Go to Settings > LDAP Role Mapping.
Add the appropriate role mappings.
When you log back in as an LDAP user, you will see the new user role. LDAP users can log in only after the LDAP setup with Privacera Manager is completed.
Portal SSO with AAD using SAML
Privacera supports SAML, which allows you to authenticate users using single sign-on (SSO) technology. It provides a way to grant access to Privacera services.
Using the Azure Active Directory (AAD) SAML Toolkit, you can set up single sign-on (SSO) in Privacera Manager for Active Directory users. After setting up the SSO, you will be provided with an SSO button on the login page of Privacera Portal.
Prerequisites
To configure SSO with Azure Active Directory, you need to configure and enable SSL for the Privacera Portal. See Enable CA Signed Certificates or Enable Self Signed Certificates.
Configuring SAML in Azure AD
The following steps describe how to configure SAML in the Azure AD application:
Log in to Azure portal.
On the left navigation pane, select the Azure Active Directory service.
Navigate to Enterprise Applications and then select All Applications.
To add a new application, select New application.
Note
If you have an existing Azure AD SAML Toolkit application, select it, and then go to step 8 to continue with the rest of the configuration.
In the Add from the gallery section, type Azure AD SAML Toolkit in the search box. Do the following:
Select Azure AD SAML Toolkit from the results panel and then add the app.
On the Azure AD SAML Toolkit application integration page, go to the Manage section and select Single sign-on.
On the Select a single sign-on method page, select SAML.
Click the pen icon for Basic SAML Configuration to edit the settings.
On the Basic SAML Configuration page, enter the values for the following fields, and then click Save. You can assign a unique name for the Entity ID.
Entity ID = privacera-portal
Reply URL = https://${APP_HOSTNAME}:6868/saml/SSO
Sign-on URL = https://${APP_HOSTNAME}:6868/login.html
In the SAML Signing Certificate section, find Federation Metadata XML and select Download to download the certificate and save it on your virtual machine.
On the Set up Azure AD SAML Toolkit section, copy the Azure AD Identifier URL.
In the Manage section, select Users and groups.
In the Users and groups dialog, select the user or user group who should be allowed to log in with SSO, then click Select.
CLI configuration
SSH to the instance where Privacera is installed.
Run the following command:
cd ~/privacera/privacera-manager/ cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
Edit the vars.portal.saml.aad.yml file.
vi config/custom-vars/vars.portal.saml.aad.yml
Modify the SAML_ENTITY_ID. Assign the value of the Entity ID that you set in the section above. For property details and description, refer to the Configuration Properties below.
SAML_ENTITY_ID: "privacera-portal" SAML_BASE_URL: "https://{{app_hostname}}:6868" PORTAL_UI_SSO_ENABLE: "true" PORTAL_UI_SSO_URL: "saml/login" PORTAL_UI_SSO_BUTTON_LABEL: "Azure AD Login" AAD_SSO_ENABLE: "true"
Rename the downloaded Federation Metadata XML file as privacera-portal-aad-saml.xml.
Copy this file to the ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom folder.
Run the following command:
cd ~/privacera/privacera-manager/ ./privacera-manager.sh update
If you are configuring SSL in an Azure Kubernetes environment, run the following command.
./privacera-manager.sh restart portal
Configuration properties
Property | Description | Example |
---|---|---|
AAD_SSO_ENABLE | Enabled by default. | |
SAML_ENTITY_ID | Get the value from the Prerequisites section. | privacera-portal |
SAML_BASE_URL | Base URL of the Privacera Portal used for the SAML endpoints. | https://{{app_hostname}}:6868 |
PORTAL_UI_SSO_BUTTON_LABEL | Label displayed on the SSO button on the Portal login page. | Azure AD Login |
PORTAL_UI_SSO_URL | Path used for SSO login. | saml/login |
SAML_GLOBAL_LOGOUT | Enabled by default. The global logout for SAML is enabled. Once a logout is initiated, all the sessions you've accessed from the browser would be terminated from the Identity Provider (IDP). | |
META_DATA_XML | Browse and select the Federation Metadata XML, which you downloaded in the Prerequisites section. |
Validation
Go to the login page of the Privacera Portal. You will see the Azure AD Login button.
Configure SAML assertion attributes
By default, the following assertion attributes are configured with pre-defined values:
Email
Username
Firstname
Lastname
You can customize the values for the assertion attributes. To do that, do the following:
Run the following commands.
cd ~/privacera/privacera-manager/ cp config/sample-vars/vars.portal.yml config/custom-vars/ vi config/custom-vars/vars.portal.yml
Add the following properties and assign your values. For more information on custom properties and their values, click here.
SAML_EMAIL_ATTRIBUTE: "" SAML_USERNAME_ATTRIBUTE: "" SAML_LASTNAME_ATTRIBUTE: "" SAML_FIRSTNAME_ATTRIBUTE: ""
After adding the properties to the YAML file configured in the Configuration section above, run the Privacera Manager update.
cd ~/privacera/privacera-manager/ ./privacera-manager.sh update
Portal SSO with Okta using SAML
Okta is a third-party identity provider, offering single sign-on (SSO) authentication and identity validation services for a large number of Software-as-a-Service providers. Privacera works with Okta's SAML (Security Assertion Markup Language) interface to provide SSO/Okta login authentication to the Privacera portal. For more information, see CLI configuration.
Integration with Okta begins with configuration steps in the Okta administrator console. These steps also generate a Privacera portal account-specific identity_provider_metadata.xml
file and an Identity Provider URL
that are used in the Privacera CLI configuration steps.
Prerequisites
To configure SSO with Okta , you need to configure and enable SSL for the Privacera Portal. See Enable CA Signed Certificates or Enable Self Signed Certificates.
Note
To use Okta SSO with Privacera portal, you must have already established an Okta SSO service account. The following procedures require Okta SSO administrative login credentials.
Generate an Okta Identity Provider Metadata File and URL
Log in to your Okta account as the Okta SSO account administrator.
Select Applications from the left navigation panel, then click Applications subcategory.
From the Applications page, click Create App Integration.
Note
In addition to creating new applications you can also edit existing apps with new configuration values.
Select SAML 2.0, then click Next.
In General Settings, provide a short descriptive app name in the App name text box. For example, enter Privacera Portal SAML.
Click Next.
In the SAML Settings configuration page, enter the values as shown in the following table:
Field | Value |
---|---|
Single sign on URL | http://portal_hostname:6868/saml/SSO |
Audience URI (SP Entity ID) | privacera_portal |
Default RelayState | Identifies a specific application resource in an IdP-initiated SSO scenario. In most cases this field is left blank. |
Name ID format | Unspecified |
Application username | Okta username |
UserID | user.login |
Email | user.email |
Firstname | user.firstName |
LastName | user.lastName |
Note
If the user's login ID is not the same as the username (for example, if the login ID is an email address), this attribute is treated as the username in the portal. The username value is the email address with the domain name removed; for "john.joe@company.com", the username would be "john.joe". If another attribute should be used as the username, this value should hold that attribute.
Click Next.
Select the Feedback tab and click I'm an Okta customer adding an internal app.
Click Finish.
From the General tab, scroll down to the App Embed Link section. Copy the Embed Link (Identity Provider URL) for PrivaceraCloud.
IdP provider metadata
In this topic, you will learn how to generate and save IdP provider metadata in XML format.
Go to Sign On tab.
Under Settings, select the Identity Provider Metadata link located at the bottom of the Sign on methods area. The configuration file will open in a separate window.
In the SAML Signing Certificates section, click the Generate new certificate button.
In the list, click the Actions dropdown and select View IdP metadata.
The XML file will be opened in a new tab.
Note
Make sure that the certificate you are downloading has an active status.
Save the file in XML format.
IdP-initiated SSO
From Applications, log in to the Okta Home Page Dashboard as a user by selecting the Okta Dashboard icon.
Log in to the Privacera Portal by selecting the newly added app icon.
CLI configuration
SSH to the instance where Privacera is installed.
Run the following command:
cd ~/privacera/privacera-manager/ cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
Edit the vars.portal.saml.aad.yml file.
vi config/custom-vars/vars.portal.saml.aad.yml
Modify the SAML_ENTITY_ID. Assign the value of the Entity ID that you set in the section above. For property details and description, refer to the Configuration Properties below.
SAML_ENTITY_ID: "privacera-portal" SAML_BASE_URL: "https://{{app_hostname}}:6868" PORTAL_UI_SSO_ENABLE: "true" PORTAL_UI_SSO_URL: "saml/login" PORTAL_UI_SSO_BUTTON_LABEL: "Azure AD Login" AAD_SSO_ENABLE: "true"
Rename the downloaded Federation Metadata XML file as privacera-portal-aad-saml.xml.
Copy this file to the ~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom folder.
Run the following command:
cd ~/privacera/privacera-manager/ ./privacera-manager.sh update
If you are configuring SSL in an Azure Kubernetes environment, run the following command.
./privacera-manager.sh restart portal
Configuration properties
Property | Description | Example |
---|---|---|
AAD_SSO_ENABLE | Enabled by default. | |
SAML_ENTITY_ID | Get the value from the Prerequisites section. | privacera-portal |
SAML_BASE_URL | Base URL of the Privacera Portal used for the SAML endpoints. | https://{{app_hostname}}:6868 |
PORTAL_UI_SSO_BUTTON_LABEL | Label displayed on the SSO button on the Portal login page. | Okta Login |
PORTAL_UI_SSO_URL | Path used for SSO login. | saml/login |
SAML_GLOBAL_LOGOUT | Enabled by default. The global logout for SAML is enabled. Once a logout is initiated, all the sessions you've accessed from the browser would be terminated from the Identity Provider (IDP). | |
META_DATA_XML | Browse and select the Federation Metadata XML, which you downloaded in the Prerequisites section. |
Validation
Go to the login page of the Privacera Portal. You will see the Okta Login button.
Configure SAML assertion attributes
By default, the following assertion attributes are configured with pre-defined values:
Email
Username
Firstname
Lastname
You can customize the values for the assertion attributes. To do that, do the following:
Run the following commands.
cd ~/privacera/privacera-manager/ cp config/sample-vars/vars.portal.yml config/custom-vars/ vi config/custom-vars/vars.portal.yml
Add the following properties and assign your values. For more information on custom properties and their values, click here.
SAML_EMAIL_ATTRIBUTE: "" SAML_USERNAME_ATTRIBUTE: "" SAML_LASTNAME_ATTRIBUTE: "" SAML_FIRSTNAME_ATTRIBUTE: ""
After adding the properties to the YAML file configured in the Configuration section above, run the Privacera Manager update.
cd ~/privacera/privacera-manager/ ./privacera-manager.sh update
Portal SSO with Okta using OAuth
This topic covers how you can integrate Okta SSO with Privacera Portal using Privacera Manager. Privacera Portal supports Okta as a login provider using OpenID, OAuth, or SAML. For more information about SAML configuration, see Portal SSO with Okta using SAML.
Prerequisites
Before you begin, ensure the following prerequisites are met:
Set up Okta authorization and obtain values for the following, to be used in the Configuration section below.
authorization_endpoint
token_endpoint
Client ID
Client Secret
User Info URI
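If your Okta org uses the default authorization server, the endpoint values listed above can usually be read from its OpenID Connect discovery document; a minimal sketch, assuming the example Okta domain used later in this topic and that curl and jq are installed:
curl -s https://dev-396511.okta.com/oauth2/default/.well-known/openid-configuration \
  | jq -r '.authorization_endpoint, .token_endpoint, .userinfo_endpoint'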
CLI configuration
SSH to the instance where Privacera is installed.
Run the following commands.
cd ~/privacera/privacera-manager cp config/sample-vars/vars.okta.yml config/custom-vars/ vi config/custom-vars/vars.okta.yml
Edit the values for the following. For property details and description, refer to the Configuration Properties below.
OAUTH_CLIENT_CLIENTSECRET : "<PLEASE_CHANGE>" OAUTH_CLIENT_CLIENTID : "<PLEASE_CHANGE>" OAUTH_CLIENT_TOKEN_URI : "<PLEASE_CHANGE>" OAUTH_CLIENT_AUTH_URI : "<PLEASE_CHANGE>" OAUTH_RESOURCE_USER_INFO_URI : "<PLEASE_CHANGE>" PORTAL_UI_SSO_ENABLE: "true"
Run the following commands.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
OAUTH_CLIENT_CLIENTSECRET | Get it from the Prerequisites section above. | 4hb88P9UZmxxxxxxxxm1WtqsaQRv1FZDZiaOT0Gm |
OAUTH_CLIENT_CLIENTID | Get it from the Prerequisites section above. | 0oa63edjkaoNHGYTS357 |
OAUTH_CLIENT_TOKEN_URI | Get it from the Prerequisites section above. | https://dev-396511.okta.com/oauth2/default/v1/token |
OAUTH_CLIENT_AUTH_URI | Get it from the Prerequisites section above. | https://dev-396511.okta.com/oauth2/default/v1/authorize |
OAUTH_RESOURCE_USER_INFO_URI | Get it from the Prerequisites section above. | https://dev-396511.okta.com/oauth2/default/v1/userinfo |
PORTAL_UI_SSO_ENABLE | Property to enable/disable Okta SSO. | true |
Validation
Login to Privacera Portal using Okta SSO Login
Log in to Privacera Portal.
Click SSO Login button.
The Okta login page is displayed.
Enter the Okta user login credentials. The Privacera Portal page is displayed.
Login to Privacera Portal using Privacera user credentials
Log in to Privacera Portal.
Enter the user credentials (padmin).
Click Login button. The Privacera Portal page is displayed.
Portal SSO with PingFederate
Privacera Portal leverages Ping Identity for authentication via SAML. For this integration, there are configuration steps in both Privacera Portal and Ping Identity.
Configuration steps for PingIdentity
Sign in to your PingIdentity account.
Under Your Environments, click Administrators.
Select Connections from the left menu.
In the Applications section, click on the + button to add a new application.
Enter an Application Name (such as Privacera Portal SAML) and provide a description (optionally add an icon). For the Application Type, select SAML Application. Then click Configure.
On the SAML Configuration page, under "Provide Application Metadata", select Manually Enter.
Enter the ACS URLs:
https://<portal_hostname>:<PORT>/saml/SSO
Enter the Entity ID:
privacera-portal
Click the Save button.
On the Overview page for the new application, click on the Attributes edit button. Add the attribute mapping:
user.login: Username
Set as Required.
Note
If the user's login ID is not the same as the username (for example, if the login ID is an email address), this attribute is treated as the username in the portal. The username value is the email address with the domain name removed; for "john.joe@company.com", the username would be "john.joe". If another attribute should be used as the username, this value should hold that attribute.
You can optionally add additional attribute mappings:
user.email: Email Address user.firstName: Given Name user.lastName: Family Name
Click the Save button.
Next in your application, select Configuration and then the edit icon.
Set the SLO Endpoint:
https://<portal_hostname>:<PORT>/login.html
Click the Save button.
In the Configuration section, under Connection Details, click on Download Metadata button.
Once this file is downloaded, rename it to:
privacera-portal-aad-saml.xml
This file will be used in the Privacera Portal configuration.
Configuration steps in Privacera Portal
Now we will configure Privacera Portal using privacera-manager to use the privacera-portal-aad-saml.xml file created in the above steps.
Run the following commands:
cd ~/privacera/privacera-manager/ cp config/sample-vars/vars.portal.saml.aad.yml config/custom-vars/
Edit the vars.portal.saml.aad.yml file:
vi config/custom-vars/vars.portal.saml.aad.yml
Add the following properties:
SAML_ENTITY_ID: "privacera-portal" SAML_BASE_URL: "https://{{app_hostname}}:{port}" PORTAL_UI_SSO_ENABLE: "true" PORTAL_UI_SSO_URL: "saml/login" PORTAL_UI_SSO_BUTTON_LABEL: "Single Sign On" AAD_SSO_ENABLE: "true"
Copy the privacera-portal-aad-saml.xml file to the following folder:
~/privacera/privacera-manager/ansible/privacera-docker/roles/templates/custom
Edit the vars.portal.yml file:
cd ~/privacera/privacera-manager/ vi config/custom-vars/vars.portal.yml
Add the following properties and assign your values.
SAML_EMAIL_ATTRIBUTE: "user.email" SAML_USERNAME_ATTRIBUTE: "user.login" SAML_LASTNAME_ATTRIBUTE: "user.lastName" SAML_FIRSTNAME_ATTRIBUTE: "user.firstName"
Run the following to update
privacera-manager
:cd ~/privacera/privacera-manager/ ./privacera-manager.sh update
You should now be able to use Single Sign-on to Privacera using PingFederate.
JSON Web Tokens (JWT)
This topic shows how to authenticate Privacera services using JSON web tokens (JWT).
Supported services:
Open Spark plugin (OLAC/FGAC)
Dataserver API to generate signature for spark OLAC plugin
Prerequisites
Ensure the following prerequisites are met:
Get the identity provider URL that is allowed in the issuer claim of a JWT.
Get the public key from the provider that Privacera services can use to validate JWT.
Configuration
SSH to the instance as USER.
Copy the public key to the ~/privacera/privacera-manager/config/custom-properties folder. If you are configuring more than one JWT, copy all the public keys associated with the JWT tokens to the same path.
Run the following commands.
cd ~/privacera/privacera-manager/config cp sample-vars/vars.jwt-auth.yaml custom-vars vi custom-vars/vars.jwt-auth.yaml
Edit the properties.
Table 5. JWT Properties
Property | Description | Example |
---|---|---|
JWT_OAUTH_ENABLE | Property to enable JWT authentication in Privacera services. | TRUE |
JWT_CONFIGURATION_LIST | Property to set multiple JWT configurations. Each configuration supports the keys described below; see the example that follows. | |
issuer: URL of the identity provider.
subject: Subject of the JWT (the user).
secret: Secret, if the JWT token has been signed or encrypted using a secret.
publickey: Name of the public key file that you copied in step 2 above.
userKey: Define a unique user key.
groupKey: Define a unique group key.
parserType: Assign one of the following values.
PING_IDENTITY: When scope/group is an array.
KEYCLOAK: When scope/group is space-separated.
JWT_CONFIGURATION_LIST:
  - index: 0
    issuer: "https://your-idp-domain.com/websec"
    subject: "api-token"
    secret: "tprivacera-api"
    publickey: "jwttoken.pub"
    userKey: "client_id"
    groupKey: "scope"
    parserType: "KEYCLOAK"
  - index: 1
    issuer: "https://your-idp-domain.com/websec2"
    publickey: "jwttoken2.pub"
    parserType: "PING_IDENTITY"
  - index: 2
    issuer: "https://your-idp-domain.com/websec3"
    publickey: "jwttoken3.pub"
Run the update.
cd ~/privacera/privacera-manager/ ./privacera-manager.sh update
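To sanity-check that a token's claims (for example, iss, the userKey, and the groupKey) match the configuration above, you can decode its payload locally; a minimal sketch for a Linux shell, assuming a standard three-part JWT:
JWT_TOKEN="<PLEASE_CHANGE>"
# Extract the payload segment and convert base64url characters to standard base64.
PAYLOAD=$(echo "${JWT_TOKEN}" | cut -d '.' -f 2 | tr '_-' '/+')
# Pad to a multiple of 4 characters so base64 -d accepts it, then decode.
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="${PAYLOAD}="; done
echo "${PAYLOAD}" | base64 -d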
JWT for Databricks
Configure
To configure JWT for Databricks, do the following:
Enable JWT. To enable JWT, refer Configuration.
(Optional) Create a JWT, if you do not have one. Skip this step, if you already have an existing token.
To create a token, see JWT and use the following details. For more details, refer to the JWT docs.
Algorithm=RS256
When JWT_PARSER_TYPE is KEYCLOAK (scope/group is space-separated)
{ "scope": "jwt:role1 jwt:role2", "client_id": "privacera-test-jwt-user", "iss": "privacera","exp": <PLEASE_UPDATE> }
When JWT_PARSER_TYPE is PING_IDENTITY (scope/group is an array)
{ "scope": [ "jwt:role1", "jwt:role2" ], "client_id": "privacera-test-jwt-user", "iss": "privacera", "exp": <PLEASE_UPDATE> }
Paste the public/private key in the input box.
Copy the generated JWT Token.
Log in to the Databricks portal and write the JWT token to a file on the cluster, as shown below, so that the Privacera plugin can read it and perform access control based on the token user.
%python
JWT_TOKEN="<PLEASE_UPDATE>"
TOKEN_LOCAL_FILE="/tmp/ptoken.dat"
f = open(TOKEN_LOCAL_FILE, "w")
f.write(JWT_TOKEN)
f.close()
Use case
Reading files from the cloud using JWT token
From your notebook, read files stored with your cloud provider. Replace <path-to-your-cloud-files> with the location of your cloud files.
%python
spark.read.csv("<path-to-your-cloud-files>").show()
Check the audits. To learn how to check the audits, click here.
You should see the JWT user (privacera-test-jwt-user) that was specified in the payload while creating the JWT.
To give permissions on a resource, create a group in Privacera Portal that matches the scope in the JWT payload and grant access to that group; it is not necessary to create a user.
The Privacera plugin extracts the JWT payload and passes the group during the access check. In other words, it takes the user-group mapping from the JWT payload itself, so user-group mapping does not need to be configured in Privacera.
JWT for EMR FGAC Spark
Configure AWS EMR with Privacera
First, enable JWT; see Configuration above.
Open the vars.emr.yml file.
cd ~/privacera/privacera-manager
vi config/custom-vars/vars.emr.yml
Add following property to enable JWT for EMR.
EMR_JWT_OAUTH_ENABLE: "true"
Run the update.
cd ~/privacera/privacera-manager/ ./privacera-manager.sh update
Create a JWT, see Step 2 above.
SSH to the EMR master node.
Configure the Spark application as follows:
JWT_TOKEN=eyJhbGciOiJSU-XXXXXX–X2BAIGWTbywHkfTxxw
spark-sql --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" --conf "spark.hadoop.privacera.jwt.oauth.enable=true"
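The same two properties can also be passed to a batch job; a minimal sketch using spark-submit, where my_job.py is a hypothetical application script:
spark-submit \
  --conf "spark.hadoop.privacera.jwt.token.str=${JWT_TOKEN}" \
  --conf "spark.hadoop.privacera.jwt.oauth.enable=true" \
  my_job.py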
Security
Enable self signed certificates with Privacera Platform
This topic provides instructions for use of Self-Signed Certificates with Privacera services including Privacera Portal, Apache Ranger, Apache Ranger KMS, and Privacera Encryption Gateway. It establishes a secure connection between internal Privacera components (Dataserver, Ranger KMS, Discovery, PolicySync, and UserSync) and SSL-enabled servers.
Note
Support Chain SSL - Preview Functionality
Previously, Privacera services used only one SSL certificate of the LDAP server even if a chain of certificates was available. Now, as preview functionality, all certificates available in the certificate chain are imported into the truststore. This applies to Privacera UserSync, Ranger UserSync, and Portal SSL certificates.
CLI configuration
SSH to the instance where Privacera is installed.
Run the following command.
cd ~/privacera/privacera-manager cp config/sample-vars/vars.ssl.yml config/custom-vars/ vi config/custom-vars/vars.ssl.yml
Set the passwords for the following configuration. The passwords must be at least six characters long and should include alphabetic, numeric, and symbol characters.
SSL_DEFAULT_PASSWORD: "<PLEASE_CHANGE>" RANGER_PLUGIN_SSL_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>" RANGER_PLUGIN_SSL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
Note
You can enable/disable SSL for specific Privacera services. For more information, refer to Configure SSL for Privacera Services.
Run Privacera Manager update.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
For Kubernetes based deployments, restart services:
cd ~/privacera/privacera-manager ./privacera-manager.sh restart
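After the update (or restart), you can confirm that a service is presenting the expected certificate; a minimal sketch using openssl against the Portal, where portal.example.com is a placeholder hostname and 6868 is the Portal port used elsewhere in this guide:
openssl s_client -connect portal.example.com:6868 -servername portal.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates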
Enable CA signed certificates with Privacera Platform
This topic provides instructions for use of CA Signed Certificates with Privacera services including Privacera Portal, Apache Ranger, Apache Ranger KMS, and Privacera Encryption Gateway. It establishes a secure connection between internal Privacera components (Dataserver, Ranger KMS, Discovery, PolicySync, and UserSync) and SSL-enabled servers.
Certificate Authority (CA) or third-party generated certificates must be created for the specific hostname subdomain.
Privacera supports signed certificates as 'pem' files.
CLI configuration
SSH to the instance where Privacera is installed.
Copy the public (ssl_cert_full_chain.pem) and private key (ssl_cert_private_key.pem) files to the ~/privacera/privacera-manager/config/ssl/ location.
Create and open the vars.ssl.yml file.
cd ~/privacera/privacera-manager cp config/sample-vars/vars.ssl.yml config/custom-vars/ vi config/custom-vars/vars.ssl.yml
Set values for the following properties:
SSL_SELF_SIGNED: "false"
SSL_DEFAULT_PASSWORD: Use a strong password with upper and lower case letters, symbols, and numbers.
Uncomment Property/Value pairs and set the appropriate value for:
#PRIVACERA_PORTAL_KEYSTORE_ALIAS
#PRIVACERA_PORTAL_KEYSTORE_PASSWORD
#PRIVACERA_PORTAL_TRUSTSTORE_PASSWORD
#RANGER_ADMIN_KEYSTORE_ALIAS
#RANGER_ADMIN_KEYSTORE_PASSWORD
#RANGER_ADMIN_TRUSTSTORE_PASSWORD
#DATASERVER_SSL_TRUSTSTORE_PASSWORD
#USERSYNC_AUTH_SSL_TRUSTSTORE_PASSWORD
If KMS is enabled, uncomment and set the following:
#RANGER_KMS_KEYSTORE_ALIAS
#RANGER_KMS_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
#RANGER_KMS_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
If PEG is enabled, uncomment and set the following:
#PEG_KEYSTORE_ALIAS
#PEG_KEYSTORE_PASSWORD
#PEG_TRUSTSTORE_PASSWORD

SSL_SELF_SIGNED: "false"
SSL_DEFAULT_PASSWORD: "<PLEASE_CHANGE>"
#SSL_SIGNED_PEM_FULL_CHAIN: "ssl_cert_full_chain.pem"
#SSL_SIGNED_PEM_PRIVATE_KEY: "ssl_cert_private_key.pem"
SSL_SIGNED_CERT_FORMAT: "pem"
#PRIVACERA_PORTAL_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
#PRIVACERA_PORTAL_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
#PRIVACERA_PORTAL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
#RANGER_ADMIN_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
#RANGER_ADMIN_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
#RANGER_ADMIN_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
#DATASERVER_SSL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
#USERSYNC_AUTH_SSL_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
# Below is needed only if you have KMS enabled
#RANGER_KMS_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
#RANGER_KMS_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
#RANGER_KMS_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
# Below is needed only if you have PEG enabled
#PEG_KEYSTORE_ALIAS: "<PLEASE_CHANGE>"
#PEG_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
#PEG_TRUSTSTORE_PASSWORD: "<PLEASE_CHANGE>"
Add domain names for the Privacera services. See Add Domain Names for Privacera Service URLs.
Run the following commands.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
For Kubernetes based deployments, restart services:
cd ~/privacera/privacera-manager ./privacera-manager.sh restart
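Before or after the update, you can verify that the private key actually matches the full-chain certificate you copied into config/ssl; a minimal sketch, assuming an RSA key pair:
cd ~/privacera/privacera-manager/config/ssl
# The two MD5 digests should be identical if the key and certificate belong together.
openssl x509 -noout -modulus -in ssl_cert_full_chain.pem | openssl md5
openssl rsa -noout -modulus -in ssl_cert_private_key.pem | openssl md5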
Add domain names for Privacera service URLs
Note
If you have Nginx ingress enabled in your environment, the configuration described below is not required. For more information on Nginx ingress, see Externalize Access to Privacera Services - Nginx Ingress.
You can expose Privacera services such as Portal, Ranger, AuditServer, DataServer, and PEG for external access and configure domain names that point to them. You can use a DNS service to host the DNS records needed for them.
Configuration
Create a vars.service_hostname.yml file.
vi config/custom-vars/vars.service_hostname.yml
Depending on the services you want to expose, add the properties in the file. Replace
<PLEASE_CHANGE>
with a hostname.
PORTAL_HOST_NAME: "<PLEASE_CHANGE>"
DATASERVER_HOST_NAME: "<PLEASE_CHANGE>"
RANGER_HOST_NAME: "<PLEASE_CHANGE>"
PEG_HOST_NAME: "<PLEASE_CHANGE>"
AUDITSERVER_HOST_NAME: "<PLEASE_CHANGE>"
Create CNAME records to point them to the service load balancer URLs. If you are installing Privacera and its services for the first time, you must complete the installation and then return to this step to create CNAME records.
Run the following command to get the service URL. Replace
<name_space>
with your Kubernetes namespace.
kubectl get svc -n <name_space>
To create CNAME records using the service URLs, do the following:
For AWS, refer to Creating records by using the Amazon Route 53 console.
For Azure, refer to Create the CNAME record.
For GCP, refer to Adding a CNAME record.
Run the update.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
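Once the CNAME records exist, you can check that each hostname resolves to the corresponding load balancer; a minimal sketch using dig, with portal.example.com as a placeholder for the value you set in PORTAL_HOST_NAME:
# Should print the load balancer URL returned by kubectl get svc.
dig +short portal.example.com CNAME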
Enable password encryption for Privacera services
This topic covers how you can enable encryption of secrets for Privacera services such as Privacera Portal, Privacera Dataserver, Privacera Ranger, Ranger Usersync, Privacera Discovery, Ranger KMS, Crypto, PEG, and Privacera PolicySync. The passwords will be stored safely in keystores, instead of being exposed in plaintext.
By default, all the sensitive data of the Privacera services are encrypted.
CLI configuration
SSH to the instance where Privacera is installed.
Run the following command.
cd ~/privacera/privacera-manager cp config/sample-vars/vars.encrypt.secrets.yml config/custom-vars/ vi config/custom-vars/vars.encrypt.secrets.yml
In this file set values for the following:
Enter a password for the keystore that will hold all the secrets, for example, Str0ngP@ssw0rd.
GLOBAL_DEFAULT_SECRETS_KEYSTORE_PASSWORD: "<PLEASE_CHANGE>"
If you want to encrypt the data of a particular Privacera service, enter the names of the properties to encrypt.
Examples
To encrypt properties used by Privacera Portal:
PORTAL_ADD_ENCRYPT_PROPS_LIST:
  - PRIVACERA_PORTAL_DATASOURCE_URL
  - PRIVACERA_PORTAL_DATASOURCE_USERNAME
To encrypt properties used by Dataserver:
DATASERVER_ADD_ENCRYPT_PROPS_LIST:
  - DATASERVER_MAC_ALGORITHM
To encrypt properties used by Encryption:
# Additional properties to be encrypted for Crypto
CRYPTO_ENCRYPT_PROPS_LIST:
  -
Run the following command.
./privacera-manager.sh update
For a Kubernetes configuration, you also need to run the following command:
./privacera-manager.sh restart
To check the keystores generated for the respective services, run the following command.
ls ~/privacera/privacera-manager/config/keystores
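To inspect which aliases a generated keystore contains, you can list it with keytool; a minimal sketch, assuming a JCEKS store type (the actual store type and file name may differ in your deployment) and the keystore password you set above:
keytool -list -storetype jceks \
  -keystore ~/privacera/privacera-manager/config/keystores/<keystore-file> \
  -storepass "<GLOBAL_DEFAULT_SECRETS_KEYSTORE_PASSWORD>"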