TagSync using Apache Ranger
Privacera Discovery allows you to classify your data using tags. Tags can be used in access policies to manage access to sensitive data.
Apache Ranger requires the tagged information while applying a policy. This topic describes how you can propagate the tag details from Discovery to Apache Ranger.
Enable TagSync
You need to enable TagSync in the Privacera Portal by configuring the following properties in the Application Properties UI. See General Process for more information.
ranger.writer.enable=true
send.inherited.table.tags.to.ranger=true
Properties to add based on service type
In addition to the properties above, add the properties for your service type in the Application Properties UI. These properties let you verify TagSync in Apache Ranger with the Ranger utility script.
For example:
service_name=privacera_s3
cluster_name=privacera
The value of service_name depends on the application that you want to apply TagSync to. The following is a list of services and values for each application:
S3
service_name=privacera_s3
cluster_name=privacera
Redshift
service_name=privacera_redshift
cluster_name=privacera
PostgreSQL
service_name=privacera_postgres
cluster_name=privacera
Snowflake
service_name=privacera_snowflake
cluster_name=privacera
DynamoDB
service_name=privacera_dynamodb
cluster_name=privacera
MSSQL/Synapse
service_name=privacera_mssql
cluster_name=privacera
MySQL/MariaDB/AuroraDB/Databricks Spark SQL
service_name=privacera_hive
cluster_name=privacera
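The list above can also be read as a lookup table. The following Python sketch is illustrative only: the dictionary keys and the helper name tagsync_properties are hypothetical and not part of Privacera, while the service_name values and the cluster_name value are copied from the list above.

```python
# service_name values copied from the list above.
# The dictionary keys are hypothetical labels chosen for this example.
SERVICE_NAMES = {
    "s3": "privacera_s3",
    "redshift": "privacera_redshift",
    "postgresql": "privacera_postgres",
    "snowflake": "privacera_snowflake",
    "dynamodb": "privacera_dynamodb",
    "mssql": "privacera_mssql",
    "synapse": "privacera_mssql",
    "mysql": "privacera_hive",
    "mariadb": "privacera_hive",
    "auroradb": "privacera_hive",
    "databricks_spark_sql": "privacera_hive",
}


def tagsync_properties(application: str, cluster_name: str = "privacera") -> str:
    """Return the two property lines for the given application label."""
    service_name = SERVICE_NAMES[application.lower()]
    return f"service_name={service_name}\ncluster_name={cluster_name}"


print(tagsync_properties("snowflake"))
```

An unknown application label raises a KeyError, which is deliberate here: it is safer to fail than to guess a service name.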
TagSync validation scenarios
TagSync can be validated in the following scenarios:
Note
Allowed and rejected tags will not be synced to Apache Ranger.
Auto scanning
On the Classifications page, files are classified with system classified tags. After classification, all system-classified and manually accepted tags are synced to Apache Ranger.
Parent-Child Level TagSync in Apache Ranger:
The criteria for syncing parent and child tags depend on whether the resource belongs to a database application or a file system:
Database applications
If the resource is a database, then the database gets classified as:
Database, tag1, tag2, etc.
In Ranger, child entries are created as below:
(Database): tag1, tag2, etc.
If the resource is a table, the classification is as shown below:
(Database, table), tag1, tag2, etc.
In Ranger, the child-level entry appears as follows:
(Database, table): tag1, tag2, etc.
If the resource is a column, on the UI the classification is as shown below:
(Database, table, column), tag1, tag2, etc.
In Ranger, only column level tags will be synced:
(Database, table, column): tag1, tag2, etc.
File System
For a folder or file, all the tag levels are allowed.
For a field, only the same tag level is allowed.
Meta tagging
Meta tags are applied at the table or file level. They are also synced to Apache Ranger at the table or file level. Only system classified and manually classified tags are synced to Apache Ranger.
Post-processing tags
System classified and manually classified tags that are applied using post processing rules are synced to Apache Ranger.
Re-evaluate
In the case of re-evaluation, system classified and manually classified datazone tags are synced to Apache Ranger. Resources that are deleted through datazone policies will be removed from Apache Ranger as well.
Add or edit tags
You can add or edit tags manually on the originally classified resources from the following pages:
Classifications: From the navigation menu, select Data Inventory > Classifications.
Resource Detail: From the navigation menu, select Data Inventory > Classifications. Select a resource and click Resource Detail.
Data Explorer: From the navigation menu, select Data Inventory > Data Explorer.
Data Zone Dashboard: From the navigation menu, select Compliance Workflow > Data Zone Dashboard.
When a user adds tags manually from the pages listed above, the tag status is set by default to “Accepted : Manually classified” and it will be synced to Apache Ranger.
Add a resource
You can manually add tags to unclassified resources. When you add such resources and add a tag to them, the tag status is set by default to “Accepted : Manually classified” and it will be synced to Apache Ranger.
To add a resource, select Data Inventory > Classifications from the navigation menu and click Add Resource.
Tag status changes
Tag status changes affect TagSync. Only system classified and manually accepted tags are synced to Apache Ranger. The following are a few scenarios for tag status changes:
If the status of a tag is changed from system classified to rejected or allowed, then the tag will be removed from Apache Ranger.
If the status of the tag is changed from manually accepted to allowed or rejected, then the tag will be removed from Apache Ranger.
If the tag status resets to system classified from rejected or allowed, then the tag will be synced to Apache Ranger.
If the tag status is changed to manually classified from rejected or allowed, then the tag will be synced to Apache Ranger.
If the tag status is changed from system classified to manually classified, then the synced tags in Apache Ranger will remain unchanged.
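The scenarios above reduce to one rule: a tag is present in Apache Ranger only while its status is system classified or manually classified. The following Python sketch restates that rule. It is not Privacera code; the function name, the return values, and the lowercase status strings are assumptions made for this example, and "manually accepted" is treated as the same status as "manually classified".

```python
# Statuses that this topic says are synced to Apache Ranger.
# The exact string labels are assumptions for this example.
SYNCED_STATUSES = {"system classified", "manually classified"}


def ranger_action(old_status: str, new_status: str) -> str:
    """Describe what a tag status change means for the tag in Apache Ranger."""
    was_synced = old_status in SYNCED_STATUSES
    is_synced = new_status in SYNCED_STATUSES
    if was_synced and not is_synced:
        return "remove"      # for example, system classified -> rejected
    if not was_synced and is_synced:
        return "sync"        # for example, allowed -> manually classified
    return "unchanged"       # for example, system classified -> manually classified


for change in [
    ("system classified", "rejected"),
    ("rejected", "system classified"),
    ("system classified", "manually classified"),
]:
    print(change, "->", ranger_action(*change))
```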
Remove tags
You can manually remove added tags if you have rejected them. If you remove a tag from a resource using the Add/Edit option, then the tag will be removed from Apache Ranger as soon as you reject it.
Remove resources
If a resource is added manually and has only manually classified tags, then after you reject the last tag the resource will be removed from Apache Ranger.
If a resource has system classified tags and you reject the last tag, the resource will be removed from Apache Ranger because the last synced tag for that resource no longer exists.
Rescan of same file
If you rescan a resource that is already synced with Apache Ranger and no changes were made to rules or datazone policies, then TagSync will remain unchanged.
If post-processing rules are disabled, then rescanning a file will remove post-processing tags.
If a datazone tag is disabled or a resource removed from a datazone, then the datazone tag will be removed from Apache Ranger upon rescan.
If a meta tag rule or a meta tag is disabled, then the meta tag will be removed from Apache Ranger upon rescan.
If a tag status change is applied before a file is rescanned, TagSync reflects that status change after the rescan.
Validate TagSync in Apache Ranger
You can view the tags that are pushed to Apache Ranger by using a curl command or the Ranger tag utility script.
Validate TagSync using curl command
curl -i -L -k -u admin:${PRIVACERA_PASSWORD} -H "Content-type: application/json" -X GET https://${PRIVACERA_HOST}:6182/service/tags/resources/service/privacera_postgres
The above curl command returns the list of resources that are synced to Apache Ranger, but the response is not in a readable format. Therefore, it is recommended to use the Ranger tag utility to check TagSync.
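If you still want to inspect the raw curl response, re-indenting it can help. The following Python sketch assumes the response body is JSON (the curl command requests application/json) and that it was captured without the -i flag, so that no HTTP headers are mixed into the body. The function names tag_resources_url and pretty are hypothetical helpers for this example; only the URL layout is taken from the curl command above.

```python
import json


def tag_resources_url(host: str, service_name: str, port: int = 6182) -> str:
    """Build the same URL that the curl command above requests."""
    return f"https://{host}:{port}/service/tags/resources/service/{service_name}"


def pretty(response_body: str) -> str:
    """Re-indent a JSON response body so that it is easier to read."""
    return json.dumps(json.loads(response_body), indent=2, sort_keys=True)


# The host below is a placeholder, and the JSON string is NOT real Ranger
# output; it only demonstrates the formatting step.
print(tag_resources_url("ranger.example.com", "privacera_postgres"))
print(pretty('{"b": 1, "a": 2}'))
```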
Validate TagSync using the Ranger Tag Utility
The following is a Python script created to communicate with all Ranger API methods. This will return the response in a readable format:
Run the following command to download required files:
wget https://privacera.s3.amazonaws.com/public/pm-demo-data/ranger_tag_utility.py -O ranger_tag_utility.py
Download the file on your local system and execute the following command to view the TagSync response.
SSL instance
python3 ranger_tag_utility.py --operation list_tags --host ${PRIVACERA_HOST} --port 6182 --username ${RANGER_USERNAME} --password ${RANGER_PASSWORD} --servicename privacera_redshift --ssl True --verifyssl False
Non-SSL instance
python3 ranger_tag_utility.py --operation list_tags --host ${PRIVACERA_HOST} --port 6080 --username ${RANGER_USERNAME} --password ${RANGER_PASSWORD} --servicename privacera_maprfs --ssl False --verifyssl False
(Optional) Change the service name as per the application.
Output
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/test.avro'] => tags :: ['SSN', 'PERSON_NAME', 'AU_BAN', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/test.snappy.avro'] => tags :: ['US_ADDRESS', 'SSN', 'US_PHONE_NUMBER', 'AU_BAN', 'PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/test1.avro'] => tags :: ['SSN', 'US_PHONE_NUMBER', 'PERSON_NAME', 'US_ADDRESS', 'AU_BAN', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/twitter.avro'] => tags :: ['PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']
Received Tag Data for path : ['/testdir/sample_files/file_format/avro/twitter.snappy.avro'] => tags :: ['PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']
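To check the synced tags programmatically, the utility output can be parsed into a mapping from resource path to tags. The following Python sketch assumes that the output keeps the "Received Tag Data for path : [...] => tags :: [...]" layout shown above. The function name parse_tag_output is hypothetical and is not part of ranger_tag_utility.py.

```python
import ast
import re

# Matches one record in the layout shown in the sample output above.
RECORD_PATTERN = re.compile(
    r"Received Tag Data for path : (\[.*?\]) => tags :: (\[.*?\])"
)


def parse_tag_output(text: str) -> dict:
    """Map each resource path in the utility output to its list of tags."""
    result = {}
    for paths_literal, tags_literal in RECORD_PATTERN.findall(text):
        # Both fields are printed as Python list literals, so they can be
        # parsed safely with ast.literal_eval.
        paths = ast.literal_eval(paths_literal)
        tags = ast.literal_eval(tags_literal)
        for path in paths:
            result[path] = tags
    return result


sample = (
    "Received Tag Data for path : "
    "['/testdir/sample_files/file_format/avro/twitter.avro'] "
    "=> tags :: ['PERSON_NAME', 'TEST_DATAZONE', 'POST_PROCESS']"
)
tags_by_path = parse_tag_output(sample)
print(tags_by_path)
```

With the result in a dictionary, a check such as whether 'PERSON_NAME' appears for a given path becomes a simple membership test.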