TagSync using Apache Ranger
Privacera Discovery allows you to classify your data using tags. Tags can be used in access policies to manage access to sensitive data.
Apache Ranger requires the tagged information while applying a policy. This topic describes how you can propagate the tag details from Discovery to Apache Ranger.
Enable TagSync
You need to enable TagSync in the Privacera Portal by configuring the following properties in the Application Properties UI. See General Process for more information.
ranger.writer.enable=true
send.inherited.table.tags.to.ranger=true
Properties to add based on service type
In addition to the properties above, add the following properties for your service type in the Application Properties UI. These properties help you verify TagSync in Apache Ranger using the Ranger utility script.
For example:
service_name=privacera_s3
cluster_name=privacera
The value of service_name depends on the application that you want to apply TagSync to. The following is a list of services and values for each application:
S3
service_name=privacera_s3
cluster_name=privacera
Redshift
service_name=privacera_redshift
cluster_name=privacera
PostgreSQL
service_name=privacera_postgres
cluster_name=privacera
Snowflake
service_name=privacera_snowflake
cluster_name=privacera
DynamoDB
service_name=privacera_dynamodb
cluster_name=privacera
MSSQL/Synapse
service_name=privacera_mssql
cluster_name=privacera
MySQL/MariaDB/AuroraDB/Databricks Spark SQL
service_name=privacera_hive
cluster_name=privacera
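The list above amounts to a lookup from application to service_name. The following is a minimal Python sketch of that lookup, assuming the cluster name is privacera as in the examples. The function name tagsync_properties and the lowercase application keys are illustrative and are not part of Privacera.

```python
# Illustrative lookup built from the list above; the keys are hypothetical labels.
SERVICE_NAMES = {
    "s3": "privacera_s3",
    "redshift": "privacera_redshift",
    "postgresql": "privacera_postgres",
    "snowflake": "privacera_snowflake",
    "dynamodb": "privacera_dynamodb",
    "mssql": "privacera_mssql",
    "synapse": "privacera_mssql",
    "mysql": "privacera_hive",
    "mariadb": "privacera_hive",
    "auroradb": "privacera_hive",
    "databricks spark sql": "privacera_hive",
}


def tagsync_properties(application, cluster_name="privacera"):
    """Return the two property lines for the given application."""
    service_name = SERVICE_NAMES[application.strip().lower()]
    return [f"service_name={service_name}", f"cluster_name={cluster_name}"]


if __name__ == "__main__":
    print("\n".join(tagsync_properties("Snowflake")))
```

An unknown application raises KeyError, which is a deliberate choice in this sketch so that a typo does not silently produce a wrong service name.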
TagSync validation scenarios
TagSync can be validated in the following scenarios:
Note
Allowed and rejected tags will not be synced to Apache Ranger.
Auto scanning
On the Classifications page, files are classified with system classified tags. After classification, all system-classified and manually accepted tags are synced to Apache Ranger.
Parent-Child Level TagSync in Apache Ranger:
Parent and child tags are synced according to the following criteria, depending on whether the resource belongs to a database application or a file system:
Database applications
If the resource is a database, then the database gets classified as:
Database, tag1, tag2, etc.
In Ranger, child entries are created as below:
(Database): tag1, tag2, etc.
If the resource is a table, the classification is shown as below:
(Database, table), tag1, tag2, etc.
In Ranger, the child-level entry appears as below:
(Database, table): tag1, tag2, etc.
If the resource is a column, on the UI the classification is as shown below:
(Database, table, column), tag1, tag2, etc.
In Ranger, only column level tags will be synced:
(Database, table, column): tag1, tag2, etc.
File System
For a folder or file, all the tag levels are allowed.
For a field, only the same tag level is allowed.
Meta tagging
Meta tags are applied at the table or file level. They are also synced to Apache Ranger at the table or file level. Only system classified and manually classified tags are synced to Apache Ranger.
Post-processing tags
System classified and manually classified tags that are applied using post processing rules are synced to Apache Ranger.
Re-evaluate
In the case of re-evaluation, system classified and manually classified datazone tags are synced to Apache Ranger. Resources that are deleted through datazone policies will be removed from Apache Ranger as well.
Add or edit tags
You can add or edit tags manually on the original classified resources from the following pages:
Classifications: From the navigation menu, select Data Inventory > Classifications.
Resource Detail: From the navigation menu, select Data Inventory > Classifications. Select a resource and click Resource Detail.
Data Explorer: From the navigation menu, select Data Inventory > Data Explorer.
Data Zone Dashboard: From the navigation menu, select Compliance Workflow > Data Zone Dashboard.
When a user adds tags manually from the pages listed above, the tag status is set by default to “Accepted : Manually classified” and it will be synced to Apache Ranger.
Add a resource
You can manually add tags to unclassified resources. When you add such resources and add a tag to them, the tag status is set by default to “Accepted : Manually classified” and it will be synced to Apache Ranger.
To add a resource, select Data Inventory > Classifications from the navigation menu and click Add Resource.
Tag status changes
Tag status changes affect TagSync. Only system classified and manually accepted tags are synced to Apache Ranger. The following are a few scenarios for tag status changes:
If the status of a tag is changed from system classified to rejected or allowed, then the tag will be removed from Apache Ranger.
If the status of the tag is changed from manually accepted to allowed or rejected, then the tag will be removed from Apache Ranger.
If the tag status is reset to system classified from rejected or allowed, then the tag will be synced to Apache Ranger.
If the tag status is changed to manually classified from rejected or allowed, then the tag will be synced to Apache Ranger.
If the tag status is changed from system classified to manually classified, then the synced tags in Apache Ranger will remain unchanged.
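The scenarios above can be summarized as a small decision function. This is a sketch only: the lowercase status strings are illustrative labels, and it assumes that "manually accepted" and "manually classified" refer to the same status, as the "Accepted : Manually classified" label suggests.

```python
# Statuses named in this topic; the lowercase strings are illustrative labels.
SYNCED = {"system classified", "manually classified"}
NOT_SYNCED = {"allowed", "rejected"}


def is_synced(status):
    """True if a tag with this status is expected in Apache Ranger."""
    if status in SYNCED:
        return True
    if status in NOT_SYNCED:
        return False
    raise ValueError(f"unknown tag status: {status!r}")


def ranger_action(old_status, new_status):
    """Describe the expected effect of a status change on Apache Ranger."""
    before, after = is_synced(old_status), is_synced(new_status)
    if before and not after:
        return "remove"
    if after and not before:
        return "sync"
    return "unchanged"


if __name__ == "__main__":
    print(ranger_action("system classified", "rejected"))
    print(ranger_action("allowed", "manually classified"))
```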
Remove tags
You can manually remove added tags if you have rejected them. If you remove a tag from a resource using the Add/Edit option, then the tag will be removed from Apache Ranger as soon as you reject it.
Remove resources
If a resource is added manually and has only manually classified tags, then after you reject the last tag the resource will be removed from Apache Ranger.
If a resource has system classified tags and you reject the last tag, the resource will be removed from Apache Ranger because its last synced tag has been removed.
Rescan of same file
If you rescan a resource that is already synced with Apache Ranger and no changes were made to rules or datazone policies, then TagSync will remain unchanged.
If post-processing rules are disabled, then rescanning a file will remove post-processing tags.
If a datazone tag is disabled or a resource removed from a datazone, then the datazone tag will be removed from Apache Ranger upon rescan.
If a meta tag rule or a meta tag is disabled, then the meta tag will be removed from Apache Ranger upon rescan.
If a tag status change is applied before a file is rescanned, TagSync follows the changed status after the rescan.
Validate TagSync in Apache Ranger
You can view tags that are getting pushed to Apache Ranger using curl commands as well as using the Ranger tag utility script.
Validate TagSync using curl command
curl -i -L -k -u admin:${PRIVACERA_PASSWORD} -H "Content-type: application/json" -X GET https://${PRIVACERA_HOST}:6182/service/tags/resources/service/privacera_postgres
The above curl command returns the list of resources that are synced to Apache Ranger, but the response is not in a readable format. Therefore, it is recommended to use the Ranger tag utility to check TagSync.
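For readers who prefer to script the check, the same GET request can be assembled in Python with only the standard library. This is a sketch: the function name build_resources_request is hypothetical, and the request is built but not sent.

```python
import base64
import urllib.request


def build_resources_request(host, service_name, username, password, port=6182):
    """Build (but do not send) the GET request used in the curl command above."""
    url = f"https://{host}:{port}/service/tags/resources/service/{service_name}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    # Same basic authentication and content type as the curl -u and -H options.
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/json")
    return request


if __name__ == "__main__":
    req = build_resources_request("ranger.example.com", "privacera_postgres", "admin", "secret")
    print(req.get_method(), req.full_url)
```

To send the request, pass it to urllib.request.urlopen. The curl -k option skips certificate verification; the equivalent in Python would require an ssl context configured to do the same, which is only appropriate for test instances.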
Validate TagSync using the Ranger Tag Utility
The following is a Python script created to communicate with all Ranger API methods. This will return the response in a readable format:
Run the following command to download required files:
wget https://privacera.s3.amazonaws.com/public/pm-demo-data/ranger_tag_utility.py -O ranger_tag_utility.py
Download the file on your local system and execute the following command to view the TagSync response.
SSL instance
python3 ranger_tag_utility.py --operation list_tags --host ${PRIVACERA_HOST} --port 6182 --username ${RANGER_USERNAME} --password ${RANGER_PASSWORD} --servicename privacera_redshift --ssl True --verifyssl False
Non-SSL instance
python3 ranger_tag_utility.py --operation list_tags --host ${PRIVACERA_HOST} --port 6080 --username ${RANGER_USERNAME} --password ${RANGER_PASSWORD} --servicename privacera_maprfs --ssl False --verifyssl False
(Optional) Change the service name as per the application.
Output
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/test.avro'] => tags :: ['SSN', 'PERSON_NAME', 'AU_BAN', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/test.snappy.avro'] => tags :: ['US_ADDRESS', 'SSN', 'US_PHONE_NUMBER', 'AU_BAN', 'PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/test1.avro'] => tags :: ['SSN', 'US_PHONE_NUMBER', 'PERSON_NAME', 'US_ADDRESS', 'AU_BAN', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/twitter.avro'] => tags :: ['PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/twitter.snappy.avro'] => tags :: ['PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']
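If you want to check the utility output programmatically, for example to confirm that a specific tag reached a specific path, the output can be parsed into a dictionary. The following is a sketch that assumes the output keeps the "Received Tag Data for path : [...] => tags :: [...]" shape shown above; the function name parse_tag_output is hypothetical.

```python
import ast
import re

# Matches one entry of the utility output shown above.
LINE_PATTERN = re.compile(
    r"Received Tag Data for path : (\[.*?\]) => tags :: (\[.*?\])"
)


def parse_tag_output(text):
    """Map each resource path to the list of tags reported for it."""
    result = {}
    for paths_literal, tags_literal in LINE_PATTERN.findall(text):
        # Both captured groups are Python-style list literals of strings.
        paths = ast.literal_eval(paths_literal)
        tags = ast.literal_eval(tags_literal)
        for path in paths:
            result[path] = tags
    return result


if __name__ == "__main__":
    sample = (
        "Received Tag Data for path : "
        "['/testdir/sample_files/file_format/avro/twitter.avro'] "
        "=> tags :: ['PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']"
    )
    for path, tags in parse_tag_output(sample).items():
        print(path, tags)
```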
Adding Tags with Ranger REST API
Prerequisite: Make sure the repo is created on Ranger for tags and Hive has the same tag service selected.
To add a tag using the REST API in Ranger, use the following steps:
Create privacera_tags in the Ranger Tag Based Policy.
Associate the privacera_tags to Hive service.
Create a JSON file where you can add tags.
vi atlas_tag_test.json
Edit the JSON file shown below based on your specific table/tag information.
{
  "op": "add_or_update",
  "serviceName": "dublin_hive",
  "tagVersion": 0,
  "tagDefinitions": {
    "0": { "name": "TEST_TAG", "source": "Atlas", "attributeDefs": [], "id": 0, "isEnabled": true }
  },
  "tags": {
    "0": { "type": "TEST_TAG", "owner": 0, "attributes": {}, "id": 0, "isEnabled": true }
  },
  "serviceResources": [
    {
      "serviceName": "dublin_hive",
      "resourceElements": {
        "database": { "values": [ "db_name" ], "isExcludes": false, "isRecursive": false },
        "column": { "values": [ "column_name" ], "isExcludes": false, "isRecursive": false },
        "table": { "values": [ "table_name" ], "isExcludes": false, "isRecursive": false }
      },
      "id": 0,
      "isEnabled": true
    }
  ],
  "resourceToTagIds": { "0": [ 0 ] }
}
Update the following variables:
serviceName
tagDefinitions['0'].name
tags['0'].type
serviceResources[0].serviceName
serviceResources[0].resourceElements['database'].values[0]
serviceResources[0].resourceElements['column'].values[0]
serviceResources[0].resourceElements['table'].values[0]
curl -i -L -k -u admin:${RANGER_ADMIN_PASSWORD} \
  -H "Content-type: application/json" \
  -d @atlas_tag_test.json \
  -X PUT http://<RANGER_HOST>:6080/service/tags/importservicetags
Wait a couple of minutes, then run the following query against the tagged table:
select * from <database_name>.<table_name>
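Instead of editing the JSON file by hand, the payload can be generated from the values you would otherwise update. The following is a sketch under the assumption that one tag is attached to one resource, as in the file above; the function name build_tag_payload is hypothetical, and the field layout simply mirrors that file.

```python
import json


def build_tag_payload(service_name, tag_name, resource_elements):
    """Build an importservicetags payload for one tag on one resource.

    resource_elements maps an element name (for example "database") to its value.
    """
    elements = {
        name: {"values": [value], "isExcludes": False, "isRecursive": False}
        for name, value in resource_elements.items()
    }
    return {
        "op": "add_or_update",
        "serviceName": service_name,
        "tagVersion": 0,
        "tagDefinitions": {
            "0": {"name": tag_name, "source": "Atlas", "attributeDefs": [],
                  "id": 0, "isEnabled": True}
        },
        "tags": {
            "0": {"type": tag_name, "owner": 0, "attributes": {},
                  "id": 0, "isEnabled": True}
        },
        "serviceResources": [
            {"serviceName": service_name, "resourceElements": elements,
             "id": 0, "isEnabled": True}
        ],
        "resourceToTagIds": {"0": [0]},
    }


if __name__ == "__main__":
    payload = build_tag_payload(
        "dublin_hive", "TEST_TAG",
        {"database": "db_name", "column": "column_name", "table": "table_name"},
    )
    print(json.dumps(payload, indent=2))
```

Redirecting the printed output to atlas_tag_test.json produces a file that can be passed to the curl command with -d. Because the resource elements are passed as a dictionary, the same function could be used for other services by supplying that service's element names.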
Hive
Create privacera_tags in the Ranger Tag Based Policy.
Associate the privacera_tags to Hive service.
Create a JSON file where you can add tags.
vi hive_tag.json
Edit the JSON file shown below based on your specific table/tag information.
{
  "op": "add_or_update",
  "serviceName": "${Hive_Service_Name}",
  "tagVersion": 0,
  "tagDefinitions": {
    "0": { "name": "${Tag_Name}", "source": "Atlas", "attributeDefs": [], "id": 0, "isEnabled": true }
  },
  "tags": {
    "0": { "type": "${Tag_Type}", "owner": 0, "attributes": {}, "id": 0, "isEnabled": true }
  },
  "serviceResources": [
    {
      "serviceName": "${Hive_Service_Name}",
      "resourceElements": {
        "database": { "values": [ "${Database}" ], "isExcludes": false, "isRecursive": false },
        "table": { "values": [ "${Table}" ], "isExcludes": false, "isRecursive": false },
        "column": { "values": [ "${Column}" ], "isExcludes": false, "isRecursive": false }
      },
      "id": 0,
      "isEnabled": true
    }
  ],
  "resourceToTagIds": { "0": [ 0 ] }
}
Sample hive_tag.json
{
  "op": "add_or_update",
  "serviceName": "privacera_hive",
  "tagVersion": 0,
  "tagDefinitions": {
    "0": { "name": "SSN", "source": "Atlas", "attributeDefs": [], "id": 0, "isEnabled": true }
  },
  "tags": {
    "0": { "type": "SSN", "owner": 0, "attributes": {}, "id": 0, "isEnabled": true }
  },
  "serviceResources": [
    {
      "serviceName": "privacera_hive",
      "resourceElements": {
        "database": { "values": [ "finance" ], "isExcludes": false, "isRecursive": false },
        "table": { "values": [ "ssn_finance_us" ], "isExcludes": false, "isRecursive": false },
        "column": { "values": [ "SocialSecurity" ], "isExcludes": false, "isRecursive": false }
      },
      "id": 0,
      "isEnabled": true
    }
  ],
  "resourceToTagIds": { "0": [ 0 ] }
}
Push the tag to Ranger.
Add Tag
curl -i -L -k -u admin:<RANGER_ADMIN_PASSWORD> -H "Content-type: application/json" -d @hive_tag.json -X PUT http://<RANGER_HOST>:6080/service/tags/importservicetags
Get Tagged Resource
curl -i -L -k -u admin:<RANGER_ADMIN_PASSWORD> -H "Content-type: application/json" -X GET http://<RANGER_HOST>:6080/service/tags/resources
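The ${...} placeholders in the Hive template happen to match the syntax of Python's string.Template, so the template can be filled in without manual editing. The sketch below uses an abridged template for brevity (a real file would hold the full JSON shown earlier), and the names TEMPLATE, SAMPLE_VALUES, and fill_template are hypothetical.

```python
import json
from string import Template

# Abridged template for illustration; a real file would hold the full JSON payload.
TEMPLATE = Template(
    '{"serviceName": "${Hive_Service_Name}", '
    '"tagDefinitions": {"0": {"name": "${Tag_Name}"}}, '
    '"tags": {"0": {"type": "${Tag_Type}"}}, '
    '"resourceElements": {"database": {"values": ["${Database}"]}, '
    '"table": {"values": ["${Table}"]}, '
    '"column": {"values": ["${Column}"]}}}'
)

# Values taken from the sample hive_tag.json.
SAMPLE_VALUES = {
    "Hive_Service_Name": "privacera_hive",
    "Tag_Name": "SSN",
    "Tag_Type": "SSN",
    "Database": "finance",
    "Table": "ssn_finance_us",
    "Column": "SocialSecurity",
}


def fill_template(template, values):
    """Substitute every ${...} placeholder and confirm the result is valid JSON."""
    text = template.substitute(values)  # raises KeyError if a placeholder is missing
    return json.loads(text)


if __name__ == "__main__":
    print(json.dumps(fill_template(TEMPLATE, SAMPLE_VALUES), indent=2))
```

Plain substitution does not escape values, so a value containing a double quote or backslash would produce invalid JSON; json.loads in fill_template catches that case before the file is pushed to Ranger.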
S3
Create privacera_tags in the Ranger Tag Based Policy.
Associate the privacera_tags to S3 Service.
Create a JSON file where you can add tags.
vi s3_tag.json
{"op":"add_or_update","serviceName":"${S3_Service_Name}","tagVersion":0,"tagDefinitions":{"0":{"name":"${Tag_Name}","source":"Atlas","attributeDefs":[],"id":0,"isEnabled":true}},"tags":{"0":{"type":"${Tag_Type}","owner":0,"attributes":{},"id":0,"isEnabled":true}},"serviceResources":[{"serviceName":"${S3_Service_Name}","resourceElements":{"bucketname":{"values":["${Bucket_Name}"],"isExcludes":false,"isRecursive":false},"objectpath":{"values":["${Resource_Path_Name}"],"isExcludes":false,"isRecursive":false}},"id":0,"isEnabled":true}],"resourceToTagIds":{"0":[0]}}
Sample JSON:
{"op":"add_or_update","serviceName":"privacera_s3","tagVersion":0,"tagDefinitions":{"0":{"name":"SSN","source":"Atlas","attributeDefs":[],"id":0,"isEnabled":true}},"tags":{"0":{"type":"SSN","owner":0,"attributes":{},"id":0,"isEnabled":true}},"serviceResources":[{"serviceName":"privacera_s3","resourceElements":{"bucketname":{"values":["pscanzone"],"isExcludes":false,"isRecursive":false},"objectpath":{"values":["finance/finance_us.csv"],"isExcludes":false,"isRecursive":false}},"id":0,"isEnabled":true}],"resourceToTagIds":{"0":[0]}}
Push the tag to Ranger.
curl -i -L -k -u admin:welcome1 -H "Content-type: application/json" -d @s3_tag.json -X PUT http://${RANGER_HOST}.privacera.com:6080/service/tags/importservicetags
Response:
HTTP/1.1 204 No Content
Set-Cookie: RANGERADMINSESSIONID=517FD2032481415D188C6925FA96E7E3; Path=/; HttpOnly
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Strict-Transport-Security: max-age=31536000; includeSubDomains
Content-Security-Policy: default-src 'none'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; connect-src 'self'; img-src 'self'; style-src 'self' 'unsafe-inline';font-src 'self'
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: 0
X-Content-Type-Options: nosniff
Content-Type: application/json
Date: Sun, 08 Mar 2020 18:55:44 GMT
Server: Apache Ranger
Get the list of tagged resources.
curl -i -L -k -u admin:welcome1 -H "Content-type: application/json" -X GET http://${RANGER_HOST}.privacera.com:6080/service/tags/resources
Response:
[{"id":5,"guid":"6b9234f1-69d9-40b0-9865-fe5bec45b469","isEnabled":true,"createdBy":"Admin","updatedBy":"Admin","createTime":1581570409000,"updateTime":1581570409000,"version":2,"serviceName":"privacera_hive","resourceElements":{"database":{"values":["sales"],"isExcludes":false,"isRecursive":false},"column":{"values":["name"],"isExcludes":false,"isRecursive":false},"table":{"values":["sales_data"],"isExcludes":false,"isRecursive":false}},"resourceSignature":"82a4eb3e2148ee77686538a653dc6d8e027e9b3443b5b09494af6a38db815a64"},{"id":7,"guid":"76ef1384-8432-4ed5-9778-c305bfb6d4c0","isEnabled":true,"createdBy":"Admin","updatedBy":"Admin","createTime":1583715849000,"updateTime":1583715849000,"version":2,"serviceName":"privacera_s3","resourceElements":{"bucketname":{"values":["pscanzone"],"isExcludes":false,"isRecursive":false},"objectpath":{"values":["finance/finance_us.csv"],"isExcludes":false,"isRecursive":false}},"resourceSignature":"02d7ffe3fc9065ed63c935faec14268cc6f3823aa68b2b81a030e5c93cb60843"}]
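A response like the one above is easier to review when each resource is reduced to one line. The following sketch does that with the standard json module. It assumes the response is a JSON array of objects with id, serviceName, and resourceElements fields, as in the sample; the function name summarize_resources is hypothetical.

```python
import json


def summarize_resources(response_text):
    """Return one readable line per tagged resource in the JSON response."""
    lines = []
    for resource in json.loads(response_text):
        elements = resource.get("resourceElements", {})
        # Element names are sorted so the output order is stable.
        parts = [
            f"{name}={','.join(element.get('values', []))}"
            for name, element in sorted(elements.items())
        ]
        lines.append(
            f"id={resource.get('id')} service={resource.get('serviceName')} "
            + " ".join(parts)
        )
    return lines


if __name__ == "__main__":
    sample = (
        '[{"id":7,"serviceName":"privacera_s3","resourceElements":'
        '{"bucketname":{"values":["pscanzone"],"isExcludes":false,"isRecursive":false},'
        '"objectpath":{"values":["finance/finance_us.csv"],"isExcludes":false,'
        '"isRecursive":false}}}]'
    )
    print("\n".join(summarize_resources(sample)))
```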
Test the Tag-Based Policies for S3 with the sample given above:
Create user <kate> in EC2 and grant the read, metaread, write, and metawrite permissions on the S3 bucket ${Bucket_Name} in the privacera_s3 service.
Create a deny tag-based policy for user <kate> with tag = SSN, Component = S3, permissions = read, write.
Try to access ${Bucket_Name} as user <kate>.
A denied access entry with the SSN tag appears in the audits.
REST API endpoints for working with tags
Add Tag
curl -i -L -k -u admin:welcome1 \
  -H "Content-type: application/json" \
  -d @atlas_tag_test.json \
  -X PUT http://<RANGER_HOST>:6080/service/tags/importservicetags
Get Tagged Resource
curl -i -L -k -u admin:welcome1 \
  -H "Content-type: application/json" \
  -X GET http://<RANGER_HOST>:6080/service/tags/resources
Delete Tagged Resource
curl -i -L -k -u admin:welcome1 \
  -H "Content-type: application/json" \
  -X DELETE http://<RANGER_HOST>:6080/service/tags/resource/<resourceId>
Get ALL Tags
curl -i -L -k -u admin:welcome1 \
  -H "Content-type: application/json" \
  -X GET http://<RANGER_HOST>:6080/service/tags/tags
Get Tag by ID
curl -i -L -k -u admin:welcome1 \
  -H "Content-type: application/json" \
  -X GET http://<RANGER_HOST>:6080/service/tags/tag/<id>
List All Tagged Resources
curl -i -L -k -u admin:welcome1 -H "Content-type: application/json" -X GET http://<RANGER_HOST>:6080/service/tags/resources
List Tag-Resource Mapping
curl -i -L -k -u admin:welcome1 -H "Content-type: application/json" -X GET http://<RANGER_HOST>:6080/service/tags/tagresourcemaps
Get Tagged Resources By ResourceID
curl -i -L -k -u admin:welcome1 -H "Content-type: application/json" -X GET http://<RANGER_HOST>:6080/service/tags/resource/<resourceId>
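When scripting against these endpoints, it can help to keep the method and path for each operation in one place. The sketch below covers the import and read operations listed above; the operation keys and the endpoint function are hypothetical names, and the default port and scheme follow the non-SSL examples.

```python
# Method and path for each operation, relative to the Ranger base URL.
ENDPOINTS = {
    "add_tag": ("PUT", "/service/tags/importservicetags"),
    "get_tagged_resources": ("GET", "/service/tags/resources"),
    "get_all_tags": ("GET", "/service/tags/tags"),
    "get_tag_by_id": ("GET", "/service/tags/tag/{id}"),
    "list_tag_resource_maps": ("GET", "/service/tags/tagresourcemaps"),
    "get_resource_by_id": ("GET", "/service/tags/resource/{id}"),
}


def endpoint(operation, host, port=6080, scheme="http", **params):
    """Return the (method, url) pair for an operation, filling path parameters."""
    method, path = ENDPOINTS[operation]
    return method, f"{scheme}://{host}:{port}{path.format(**params)}"


if __name__ == "__main__":
    print(endpoint("get_tag_by_id", "ranger.example.com", id=5))
    print(endpoint("add_tag", "ranger.example.com"))
```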