Configure EMR with Privacera Platform
This topic shows how to configure EMR with Privacera using Privacera Manager.
Kerberos required for EMR FGAC or OLAC
Note
To support Privacera FGAC or OLAC, the EMR application must be configured with Kerberos.
SSH to the instance as USER.
Run the following commands.
cd ~/privacera/privacera-manager
cp config/sample-vars/vars.emr.yml config/custom-vars/
vi config/custom-vars/vars.emr.yml
Edit the following properties.
Property
Description
Example
EMR_ENABLE
Enable EMR template creation.
true
EMR_CLUSTER_NAME
Define a unique name for the EMR cluster.
Privacera-EMR
EMR_CREATE_SG
Set this to true if you have no existing security groups and want Privacera Manager to add the security group creation steps to the EMR CloudFormation (CF) template.
false
EMR_MASTER_SG_ID
If EMR_CREATE_SG is false, set this property. Security Group ID for EMR Master Node Group.
sg-xxxxxxx
EMR_SLAVE_SG_ID
If EMR_CREATE_SG is false, set this property. Security Group ID for EMR Slave Node Group.
sg-xxxxxxx
EMR_SERVICE_ACCESS_SG_ID
If EMR_CREATE_SG is false, set this property. Security Group ID for EMR ServiceAccessSecurity. Fill this property only if you are creating EMR in a Private Network.
sg-xxxxxxx
EMR_SG_VPC_ID
If EMR_CREATE_SG is true, set this property. VPC ID in which you want to create the EMR Cluster.
vpc-xxxxxxxxxxx
EMR_MASTER_SG_NAME
If EMR_CREATE_SG is true, set this property. Security Group Name for EMR Master Node Group. The security group name will be added to the emr-template.json.
priv-master-sg
EMR_SLAVE_SG_NAME
If EMR_CREATE_SG is true, set this property. Security Group Name for EMR Slave Node Group. The security group name will be added to the emr-template.json.
priv-slave-sg
EMR_SERVICE_ACCESS_SG_NAME
If EMR_CREATE_SG is true, set this property. Security Group Name for EMR ServiceAccessSecurity. The security group name will be added to the emr-template.json. Fill this property only if you are creating EMR in a Private Network.
priv-private-sg
EMR_SUBNET_ID
Subnet ID
EMR_KEYPAIR
An existing EC2 key pair to SSH into the master node of the cluster.
privacera-test-pair
EMR_EC2_MARKET_TYPE
Set market type as SPOT or ON_DEMAND.
SPOT
EMR_EC2_INSTANCE_TYPE
Set the instance type. Instances can be of different types such as m5.xlarge, r5.xlarge and so on.
m5.large
EMR_MASTER_NODE_COUNT
Node count for Master. The number of nodes can be 1, 2 and so on.
1
EMR_CORE_NODE_COUNT
Node count for Core. The number of nodes can be 1, 2 and so on.
1
EMR_VERSION
Version of EMR.
emr-x.xx.x
EMR_EC2_DOMAIN
Domain used by the nodes. It depends on EMR Region, for example, ".ec2.internal" is for us-east-1.
.ec2.internal
EMR_USE_STS_REGIONAL_ENDPOINTS
Set the property to enable/disable regional endpoints for S3 requests. Default value is false.
true
EMR_TERMINATION_PROTECT
Set to enable/disable termination protection.
true
EMR_LOGS_PATH
S3 location for storing EMR logs.
s3://privacera-logs-bucket/
EMR_KERBEROS_ENABLE
Set to true if you want to enable kerberization on EMR.
false
EMR_KDC_ADMIN_PASSWORD
If EMR_KERBEROS_ENABLE is true, set this property. The password used within the cluster for the kadmin service.
EMR_CROSS_REALM_PASSWORD
If EMR_KERBEROS_ENABLE is true, set this property. The cross-realm trust principal password, which must be identical across realms.
EMR_SECURITY_CONFIG
Name of the Security Configurations created for EMR. This can be a pre-created configuration, or Privacera Manager can generate a template through which you can create this configuration.
EMR_KERB_TICKET_LIFETIME
Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The period for which a Kerberos ticket issued by the cluster’s KDC is valid. Cluster applications and services auto-renew tickets after they expire.
EMR_KERB_TICKET_LIFETIME: 24
EMR_KERB_REALM
Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The Kerberos realm name for the other realm in the trust relationship.
EMR_KERB_DOMAIN
Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The domain name of the other realm in the trust relationship.
EMR_KERB_ADMIN_SERVER
Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The fully qualified domain name (FQDN) and an optional port for the Kerberos admin server in the other realm. If a port is not specified, 749 is used.
EMR_KERB_KDC_SERVER
Set this property if you want Privacera Manager to create CF template for creating security configuration and EMR_KERBEROS_ENABLE is true. The fully qualified domain name (FQDN) and an optional port for the KDC in the other realm. If a port is not specified, 88 is used.
EMR_AWS_ACCT_ID
AWS Account ID where EMR Cluster resides
9999999
EMR_DEFAULT_ROLE
Default role attached to EMR Cluster for performing cluster-related activities. This should be a pre-created role.
EMR_DefaultRole
EMR_ROLE_FOR_CLUSTER_NODES
The IAM Role that will be attached to each node in the EMR Cluster. This should have only minimal permissions for downloading the privacera_cust_conf.zip and basic EMR capabilities. It can be an existing role; if you do not have one, you can use the IAM role CF template to generate it after the Privacera Manager update.
restricted_node_role
EMR_USE_SINGLE_ROLE_FOR_APPS
If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. Create a Single IAM Role that will be used by All EMR Applications.
true
EMR_ROLE_FOR_APPS
If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by all EMR Apps
app_data_access_role
EMR_ROLE_FOR_SPARK
If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. Create multiple IAM Roles to be used by specific applications. Set EMR_USE_SINGLE_ROLE_FOR_APPS to be false. IAM Role name which will be used by Spark Application (Dataserver) for data access.
spark_data_access_role
EMR_ROLE_FOR_HIVE
If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by Hive Application for data access.
hive_data_access_role
EMR_ROLE_FOR_PRESTO
If you want Privacera Manager to generate a CF template for IAM roles configuration, set this property. IAM Role name which will be used by Presto Application for data access.
presto_data_access_role
EMR_HIVE_METASTORE
Metastore type, for example "glue" or "hive" (for an external Hive metastore).
glue
EMR_HIVE_METASTORE_PATH
S3 location for hive metastore
s3://hive-warehouse
EMR_HIVE_METASTORE_CONNECTION_URL
If EMR_HIVE_METASTORE is hive, set this property. JDBC Connection URL for connecting to hive.
jdbc:mysql://<jdbc-host>:3306/<hive-db-name>?createDatabaseIfNotExist=true
EMR_HIVE_METASTORE_CONNECTION_DRIVER
If EMR_HIVE_METASTORE is hive, set this property. JDBC Driver Name
org.mariadb.jdbc.Driver
EMR_HIVE_METASTORE_CONNECTION_USERNAME
If EMR_HIVE_METASTORE is hive, set this property. JDBC UserName
hive
EMR_HIVE_METASTORE_CONNECTION_PASSWORD
If EMR_HIVE_METASTORE is hive, set this property. JDBC Password
StRong@PassW0rd
EMR_HIVE_SERVICE_NAME
Custom hive service name for hive application in EMR
teamA_policy
EMR_TRINO_HIVE_SERVICE_NAME
Custom hive service name for trino application in EMR
teamB_policy
EMR_SPARK_HIVE_SERVICE_NAME
Custom hive access service name for spark applications in EMR
teamC_policy
EMR_APP_SPARK_OLAC_ENABLE
To install Spark application with Privacera plugin, set the property to true. OLAC stands for Object Level Access Control.
Note
Recommended when complete access control on the objects in AWS S3 is required.
When the property is set to true, s3 and s3n protocols will not be supported on EMR clusters while running Spark queries.
true
EMR_APP_SPARK_FGAC_ENABLE
To install Spark application with Privacera plugin, set the property to true. FGAC stands for Fine Grained Access Control for Table and Column.
Note
Recommended for compliance purposes, since the whole cluster will still have direct access to AWS S3 data.
false
EMR_APP_PRESTO_DB_ENABLE
To install PrestoDB application with Privacera plugin, set the property to true.
PrestoDB and Trino are mutually exclusive. Only one should be enabled at a time.
false
EMR_APP_PRESTO_SQL_ENABLE
To install Trino application with Privacera plugin, set the property to true.
PrestoDB and Trino are mutually exclusive. Only one should be enabled at a time.
Note
Trino is supported for EMR versions 6.1.0 and higher.
Note
If the EMR version is 6.4.0, setting this flag installs the Trino plugin.
false
EMR_APP_HIVE_ENABLE
To install Hive application with Privacera plugin, set the property to true.
true
EMR_APP_ZEPPELIN_ENABLE
To install Zeppelin application, set the property to true.
true
EMR_APP_LIVY_ENABLE
To install Livy application, set the property to true.
true
EMR_CUST_CONF_ZIP_PATH
The path where the privacera_cust_conf.zip file will be placed. Privacera Manager generates privacera_cust_conf.zip under the ~/privacera/privacera-manager/output/emr folder. This privacera_cust_conf.zip needs to be placed at an S3 or any HTTPS location from which the EMR cluster can download it.
s3://privacera-artifacts/
EMR_SPARK_ENABLE_VIEW_LEVEL_ACCESS_CONTROL
Set the property to true to enable view-level column masking and row filter for SparkSQL. The property can be used only when you set EMR_APP_SPARK_FGAC_ENABLE to true.
false
EMR_RANGER_IS_FALLBACK_SUPPORTED
Use the property to enable/disable the fallback behavior to the privacera_files and privacera_hive services. It determines whether the user should be allowed or denied access to the resource files.
To enable the fallback, set to true; to disable, set to false.
true
EMR_SPARK_DELTA_LAKE_ENABLE
Set this property to true to enable Delta Lake on EMR Spark.
true
EMR_SPARK_DELTA_LAKE_CORE_JAR_DOWNLOAD_URL
Download URL of the Delta Lake core JAR. The Delta Lake core JAR depends on the Spark version, so you have to find the appropriate version for your EMR. See Delta Lake compatibility with Apache Spark. Update this property with the download URL of the appropriate Delta Lake core JAR. See the Maven release page for Delta Core. For example, for Spark version 3.1.x, the download URL is https://repo1.maven.org/maven2/io/delta/delta-core_2.12/1.0.1/delta-core_2.12-1.0.1.jar.
https://repo1.maven.org/maven2/io/delta/delta-core_2.12/1.0.1/delta-core_2.12-1.0.1.jar
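Several properties in the table depend on one another (security group IDs when EMR_CREATE_SG is false, PrestoDB versus Trino, view-level access control versus FGAC). The following is a minimal sketch of a pre-flight check for those rules. It assumes the vars file uses simple `KEY: value` lines, as the EMR_KERB_TICKET_LIFETIME example suggests; the function names `get_var` and `check_emr_vars` are hypothetical helpers, not part of Privacera Manager, and the sample values are placeholders taken from the table.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check conditional properties in a vars.emr.yml-style file.
# Assumption: one `KEY: value` pair per line. This is not a full YAML parser
# and does not strip inline comments.

# Print the value of KEY from FILE with quotes and trailing spaces removed.
get_var() {
  local file="$1" key="$2"
  sed -n "s/^${key}:[[:space:]]*//p" "$file" | head -n 1 | tr -d "\"'" | sed 's/[[:space:]]*$//'
}

# Report settings that contradict the rules stated in the property table.
# Succeeds only when no problem is found.
check_emr_vars() {
  local file="$1" problems=0 key
  if [ "$(get_var "$file" EMR_CREATE_SG)" = "false" ]; then
    for key in EMR_MASTER_SG_ID EMR_SLAVE_SG_ID; do
      if [ -z "$(get_var "$file" "$key")" ]; then
        echo "missing: $key (required when EMR_CREATE_SG is false)"
        problems=$((problems + 1))
      fi
    done
  fi
  if [ "$(get_var "$file" EMR_APP_PRESTO_DB_ENABLE)" = "true" ] &&
     [ "$(get_var "$file" EMR_APP_PRESTO_SQL_ENABLE)" = "true" ]; then
    echo "conflict: PrestoDB and Trino are both enabled"
    problems=$((problems + 1))
  fi
  if [ "$(get_var "$file" EMR_SPARK_ENABLE_VIEW_LEVEL_ACCESS_CONTROL)" = "true" ] &&
     [ "$(get_var "$file" EMR_APP_SPARK_FGAC_ENABLE)" != "true" ]; then
    echo "conflict: view-level access control requires EMR_APP_SPARK_FGAC_ENABLE"
    problems=$((problems + 1))
  fi
  [ "$problems" -eq 0 ]
}

# Example run against a hypothetical fragment written to a scratch file.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
EMR_ENABLE: "true"
EMR_CREATE_SG: "false"
EMR_MASTER_SG_ID: "sg-xxxxxxx"
EMR_SLAVE_SG_ID: "sg-xxxxxxx"
EMR_APP_SPARK_FGAC_ENABLE: "false"
EMR_APP_PRESTO_DB_ENABLE: "false"
EMR_APP_PRESTO_SQL_ENABLE: "false"
EOF
check_emr_vars "$sample" && echo "no conflicts found"
rm -f "$sample"
```

To check a real file, call `check_emr_vars config/custom-vars/vars.emr.yml` from the privacera-manager directory. The check covers only the three rules shown and is not a substitute for reviewing the full table.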
If your cluster was running while External Hive Metastore was down, and you are unable to connect to it, restart the following three servers.
sudo systemctl restart hive-hcatalog-server
sudo systemctl restart hive-server2
sudo systemctl restart presto-server
Run the following commands.
cd ~/privacera/privacera-manager
./privacera-manager.sh update
After the update is finished, all the CloudFormation JSON template files and privacera_cust_conf.zip will be available at the path ~/privacera/privacera-manager/output/emr.
Configure and run the following in the AWS instance where Privacera is installed.
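Before running the commands that follow, it can help to confirm that the generated files exist and to copy the zip to the location set in EMR_CUST_CONF_ZIP_PATH. The sketch below assumes that emr-template.json and privacera_cust_conf.zip are always generated, that the destination is an S3 prefix such as the s3://privacera-artifacts/ example, and that the AWS CLI is installed with suitable credentials. The function names are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch: verify Privacera Manager output for EMR and stage the zip on S3.

# Check that the files used by the later steps exist in the output directory.
# Defaults to the output path named in this topic.
check_emr_output() {
  local dir="${1:-$HOME/privacera/privacera-manager/output/emr}" missing=0 name
  for name in emr-template.json privacera_cust_conf.zip; do
    if [ -f "$dir/$name" ]; then
      echo "found: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}

# Copy the zip to an S3 prefix (the value of EMR_CUST_CONF_ZIP_PATH),
# then list the object to confirm that it arrived.
upload_cust_conf() {
  local dir="$1" dest="${2%/}"
  aws s3 cp "$dir/privacera_cust_conf.zip" "$dest/privacera_cust_conf.zip" &&
    aws s3 ls "$dest/privacera_cust_conf.zip"
}
```

For example, `check_emr_output && upload_cust_conf ~/privacera/privacera-manager/output/emr s3://privacera-artifacts/` copies the zip only when both files are present. If you host the zip at an HTTPS location instead, replace the upload step with whatever publishes files there.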
(Optional) Create IAM roles using the emr-roles-creation-template.json template. Run the following command.
aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-role-creation --template-body file://emr-roles-creation-template.json --capabilities CAPABILITY_NAMED_IAM
Note
This will create IAM roles with minimal permissions. You can add bucket permissions into respective IAM roles as per your requirements.
(Optional) Create Security Configurations using the emr-security-config-template.json template. Run the following command.
aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-security-config-creation --template-body file://emr-security-config-template.json
Confirm the privacera_cust_conf.zip file has been copied to the location specified in EMR_CUST_CONF_ZIP_PATH.
Create EMR using the emr-template.json template. Run the following command.
aws --region <AWS-REGION> cloudformation create-stack --stack-name privacera-emr-creation --template-body file://emr-template.json
Note
If you are upgrading EMR from version 6.3 or lower to version 6.4 or higher to use the Trino plug-in, you must re-create the EMR security configuration from the new template generated by Privacera Manager, because the security configuration now includes the trino user.
Note
For PrestoDB, secrets encryption of the Solr authentication password is not supported. However, the properties file where the password resides is accessible only to the presto service user, which limits exposure of the password.
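The create-stack commands above return as soon as CloudFormation accepts the request, so the stack may still be in progress or may fail later. The following sketch shows one way to follow up using standard AWS CLI subcommands (describe-stacks, wait stack-create-complete, describe-stack-events). The function names are hypothetical helpers, and the stack name and region in the usage note are the ones used in this topic or placeholders.

```shell
#!/usr/bin/env bash
# Sketch: follow up on a CloudFormation stack created in the steps above.
# Requires the AWS CLI and credentials for the account that owns the stack.

# Print the current status of a stack, such as CREATE_IN_PROGRESS or CREATE_COMPLETE.
stack_status() {
  local stack="$1" region="$2"
  aws --region "$region" cloudformation describe-stacks \
    --stack-name "$stack" \
    --query 'Stacks[0].StackStatus' --output text
}

# Block until stack creation finishes. On failure, print the resources
# whose status ends in FAILED along with the reported reason.
wait_for_stack() {
  local stack="$1" region="$2"
  if aws --region "$region" cloudformation wait stack-create-complete --stack-name "$stack"; then
    echo "$stack: $(stack_status "$stack" "$region")"
  else
    aws --region "$region" cloudformation describe-stack-events \
      --stack-name "$stack" \
      --query "StackEvents[?ends_with(ResourceStatus, 'FAILED')].[LogicalResourceId,ResourceStatusReason]" \
      --output text
    return 1
  fi
}
```

For example, `wait_for_stack privacera-emr-creation <AWS-REGION>` waits for the EMR stack. The same call works for privacera-emr-role-creation and privacera-emr-security-config-creation if you created those stacks.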