- Welcome to Privacera
- Introduction to Privacera
- Governed Data Stewardship
- Concepts in Governed Data Stewardship
- Prerequisites and planning
- Tailor Governed Data Sharing
- Overview of examples by role
- PrivaceraCloud setup
- PrivaceraCloud data access methods
- Create PrivaceraCloud account
- Log in to PrivaceraCloud with or without SSO
- Connect applications to PrivaceraCloud
- Connect applications to PrivaceraCloud with the setup wizard
- Connect Azure Data Lake Storage Gen 2 (ADLS) to PrivaceraCloud
- Connect Amazon Textract to PrivaceraCloud
- Connect Athena to PrivaceraCloud
- Connect AWS Lake Formation on PrivaceraCloud
- Get started with AWS Lake Formation
- Create IAM Role for AWS Lake Formation connector
- Connect AWS Lake Formation application on PrivaceraCloud
- Create AWS Lake Formation connectors for multiple AWS regions
- Configure audit logs for AWS Lake Formation on PrivaceraCloud
- How to validate an AWS Lake Formation connector
- AWS Lake Formation FAQs for Pull mode
- AWS Lake Formation FAQs for Push mode
- Azure Data Factory Integration with Privacera Enabled Databricks Cluster
- Connect Google BigQuery to PrivaceraCloud
- Connect Cassandra to PrivaceraCloud for Discovery
- Connect Databricks to PrivaceraCloud
- Connect Databricks SQL to PrivaceraCloud
- Connect Databricks to PrivaceraCloud
- Configure Databricks SQL PolicySync on PrivaceraCloud
- Databricks SQL fields on PrivaceraCloud
- Databricks SQL Masking Functions
- Connect Databricks SQL to Hive policy repository on PrivaceraCloud
- Enable Privacera Encryption services in Databricks SQL on PrivaceraCloud
- Example: Create basic policies for table access
- Connect Databricks Unity Catalog to PrivaceraCloud
- Enable Privacera Access Management for Databricks Unity Catalog
- Enable Data Discovery for Databricks Unity Catalog
- Databricks Unity Catalog connector fields for PolicySync on PrivaceraCloud
- Configure Audits for Databricks Unity Catalog on PrivaceraCloud
- Databricks Partner Connect - Quickstart for Unity Catalog
- Connect Dataproc to PrivaceraCloud
- Connect Dremio to PrivaceraCloud
- Connect DynamoDB to PrivaceraCloud
- Connect Elastic MapReduce from Amazon application to PrivaceraCloud
- Connect EMR application
- EMR Spark access control types
- PrivaceraCloud configuration
- AWS IAM roles using CloudFormation setup
- Create a security configuration
- Create EMR cluster
- Kerberos required for EMR FGAC or OLAC
- Create EMR cluster using CloudFormation setup (Recommended)
- Create EMR cluster using CloudFormation EMR templates
- EMR template: Spark_OLAC, Hive, Trino (for EMR versions 6.4.0 and above)
- EMR Template for Multiple Master Node: Spark_OLAC, Hive, Trino (for EMR version 6.4.0 and above)
- EMR template: Spark_OLAC, Hive, PrestoSQL (for EMR versions 6.x to 6.3.1)
- EMR template: Spark_FGAC, Hive, Trino (for EMR versions 6.4.0 and above)
- EMR Template for Multiple Master Node: Spark_FGAC, Hive, Trino (for EMR version 6.4.0 and above)
- EMR template: Spark_FGAC, Hive, PrestoSQL (for EMR versions 6.x to 6.3.1)
- Create EMR cluster using CloudFormation AWS CLI
- Create CloudFormation stack
- Create EMR cluster using CloudFormation EMR templates
- Manually create EMR cluster using AWS EMR console
- EMR Native Ranger Integration with PrivaceraCloud
- Connect EMRFS S3 to PrivaceraCloud
- Connect Files to PrivaceraCloud
- Connect Google Cloud Storage to PrivaceraCloud
- Connect Glue to PrivaceraCloud
- Connect Kinesis to PrivaceraCloud
- Connect Lambda to PrivaceraCloud
- Connect MS SQL to PrivaceraCloud
- Connect MySQL to PrivaceraCloud for Discovery
- Connect Open Source Apache Spark to PrivaceraCloud
- Connect Oracle to PrivaceraCloud for Discovery
- Connect PostgreSQL to PrivaceraCloud
- Connect Power BI to PrivaceraCloud
- Connect Presto to PrivaceraCloud
- Connect Redshift to PrivaceraCloud
- Redshift Spectrum PrivaceraCloud overview
- Connect Snowflake to PrivaceraCloud
- Starburst Enterprise with PrivaceraCloud
- Connect Starburst Trino to PrivaceraCloud
- Connect Starburst Enterprise Presto to PrivaceraCloud
- Connect Synapse to PrivaceraCloud
- Connect S3 to PrivaceraCloud
- Connect Trino to PrivaceraCloud
- Starburst Trino and Trino SQL command permissions
- Starburst Trino and Trino SQL command permissions - Iceberg connector
- Connect Vertica to PrivaceraCloud
- Manage applications on PrivaceraCloud
- Connect users to PrivaceraCloud
- Data sources on PrivaceraCloud
- PrivaceraCloud custom configurations
- Access AWS S3 buckets from multiple AWS accounts on PrivaceraCloud
- Configure multiple JWTs for EMR
- Access cross-account SQS queue for PostgreSQL audits on PrivaceraCloud
- AWS Access with IAM role on PrivaceraCloud
- Databricks cluster deployment matrix with Privacera plugin
- Whitelist py4j security manager via S3 or DBFS
- General functions in PrivaceraCloud settings
- Cross account IAM role for Databricks
- Operational status of PrivaceraCloud and RSS feed
- Troubleshooting the Databricks Unity Catalog tutorial
- Privacera Platform installation
- Plan for Privacera Platform
- Privacera Platform overview
- Privacera Platform installation overview
- Privacera Platform deployment size
- Privacera Platform installation prerequisites
- Choose a cloud provider
- Select a deployment type
- Configure proxy for Privacera Platform
- Prerequisites for installing Privacera Platform on Kubernetes
- Default Privacera Platform port numbers
- Required environment variables for installing Privacera Platform
- Privacera Platform system requirements for Azure
- Prerequisites for installing Privacera Manager on AWS
- Privacera Platform system requirements for Docker in GCP
- Privacera Platform system requirements for Docker in AWS
- Privacera Platform system requirements for Docker in Azure
- Privacera Platform system requirements for Google Cloud Platform (GCP)
- System requirements for Privacera Manager Host in GKE
- System requirements for Privacera Manager Host in EKS
- System requirements for Privacera Manager Host in AKS
- Install Privacera Platform
- Download the Privacera Platform installation packages
- Privacera Manager overview
- Install Privacera Manager on Privacera Platform
- Install Privacera Platform using an air-gapped install
- Upgrade Privacera Manager
- Troubleshoot Privacera Platform installation
- Validate Privacera Platform installation
- Common errors and warnings in Privacera Platform YAML config files
- Ansible Kubernetes Module does not load on Privacera Platform
- Unable to view Audit Fluentd audits on Privacera Platform
- Unable to view Audit Server audits on Privacera Platform
- No space for Docker images on Privacera Platform
- Unable to see metrics on Grafana dashboard
- Increase storage for Privacera PolicySync on Kubernetes
- Permission denied errors in PM Docker installation
- Non-portal users can access restricted Privacera Platform resources
- Storage issue in Privacera Platform UserSync and PolicySync
- Privacera Manager not responding
- Unable to Connect to Docker
- Privacera Manager unable to connect to Kubernetes Cluster
- Unable to initialize the Discovery Kubernetes pod
- Unable to upgrade from 4.x to 5.x or 6.x due to Zookeeper snapshot issue
- 6.5 Platform Installation fails with invalid apiVersion
- Database lockup in Docker
- Remove the WhiteLabel Error Page on Privacera Platform
- Unable to start the Privacera Platform portal service
- Connect portal users to Privacera Platform
- Connect Privacera Platform portal users from LDAP
- Set up portal SSO for Privacera Platform with OneLogin using SAML
- Set up portal SSO for Privacera Platform with Okta using SAML
- Set up portal SSO for Privacera Platform with Okta using OAuth
- Set up portal SSO for Privacera Platform with AAD using SAML
- Set up portal SSO for Privacera Platform with PingFederate
- Generate an Okta Identity Provider metadata file and URL
- Connect applications to Privacera Platform for Access Management
- Connect applications to Privacera Platform using the Data Access Server
- Data Access Server overview
- Integrate AWS with Privacera Platform using the Data Access Server
- Integrate GCS and GCP with Privacera Platform using the Data Access Server
- Integrate ADLS with Privacera Platform using the Data Access Server
- Access Kinesis with the Data Access Server on Privacera Platform
- Access Firehose with Data Access Server on Privacera Platform
- Use DynamoDB with Data Access Server on Privacera Platform
- Connect MinIO to Privacera Platform using the Data Access Server
- Use Athena with Data Access Server on Privacera Platform
- Custom Data Access Server properties
- Connect applications to Privacera Platform using the Privacera Plugin
- Overview of Privacera plugins for Databricks
- Connect AWS EMR with Native Apache Ranger to Privacera Platform
- Configure Databricks Spark Fine-Grained Access Control Plugin [FGAC] [Python, SQL]
- Configure Databricks Spark Object-level Access Control Plugin
- Connect Dremio to Privacera Platform via plugin
- Connect Amazon EKS to Privacera Platform using Privacera plugin
- Configure EMR with Privacera Platform
- EMR user guide for Privacera Platform
- Connect GCP Dataproc to Privacera Platform using Privacera plugin
- Connect Kafka datasource via plugin to Privacera Platform
- Connect PrestoSQL standalone to Privacera Platform using Privacera plugin
- Connect Spark standalone to Privacera Platform using the Privacera plugin
- Privacera Spark plugin versus Open-source Spark plugin
- Connect Starburst Enterprise to Privacera Platform via plugin
- Connect Starburst Trino Open Source to Privacera Platform via Plug-In
- Connect Trino Open Source to Privacera Platform via plugin
- Connect applications to Privacera Platform using the Data Access Server
- Configure AuditServer on Privacera Platform
- Configure Solr destination on Privacera Platform
- Enable Solr authentication on Privacera Platform
- Solr properties on Privacera Platform
- Configure Kafka destination on Privacera Platform
- Enable Pkafka for real-time audits in Discovery on Privacera Platform
- AuditServer properties on Privacera Platform
- Configure Fluentd audit logging on Privacera Platform
- Configure High Availability for Privacera Platform
- Configure Privacera Platform system security
- Privacera Platform system security
- Configure SSL for Privacera Platform
- Enable CA-signed certificates on Privacera Platform
- Enable self-signed certificates on Privacera Platform
- Upload custom SSL certificates on Privacera Platform
- Custom Crypto properties on Privacera Platform
- Enable password encryption for Privacera Platform services
- Authenticate Privacera Platform services using JSON Web Tokens
- Configure JSON Web Tokens for Databricks
- Configure JSON Web Tokens for EMR FGAC Spark
- Custom configurations for Privacera Platform
- Privacera Platform system configuration
- Add custom properties using Privacera Manager on Privacera Platform
- Privacera Platform system properties files overview
- Add domain names for Privacera service URLs on Privacera Platform
- Configure Azure PostgreSQL on Privacera Platform
- Spark Standalone properties on Privacera Platform
- AWS Data Access Server properties on Privacera Platform
- Add custom Spark configuration for Databricks on Privacera Platform
- Configure proxy for Privacera Platform
- Configure Azure MySQL on Privacera Platform
- System-level settings for Zookeeper on Privacera Platform
- Configure service name for Databricks Spark plugin on Privacera Platform
- Migrate Privacera Manager from one instance to another
- Restrict access in Kubernetes on Privacera Platform
- System-level settings for Grafana on Privacera Platform
- System-level settings for Ranger KMS on Privacera Platform
- Generate verbose logs on Privacera Platform
- System-level settings for Spark on Privacera Platform
- System-level settings for Azure ADLS on Privacera Platform
- Override Databricks region URL mapping for Privacera Platform on AWS
- Configure Privacera Platform system properties
- EMR custom properties
- Configure AWS Aurora DB (PostgreSQL/MySQL) on Privacera Platform
- Merge Kubernetes configuration files
- Scala Plugin properties on Privacera Platform
- System-level settings for Trino Open Source on Privacera Platform
- System-level settings for Kafka on Privacera Platform
- System-level settings for Graphite on Privacera Platform
- System-level settings for Spark plugin on Privacera Platform
- Create CloudFormation stack
- Configure pod topology for Kubernetes on Privacera Platform
- Configure proxy for Kubernetes on Privacera Platform
- Externalize access to Privacera Platform services with NGINX Ingress
- Custom Privacera Platform portal properties
- Add Data Subject Rights
- Enable or disable the Data Sets menu
- Kubernetes RBAC
- Spark FGAC properties
- Audit Fluentd properties on Privacera Platform
- Switch from Kinesis to Kafka for Privacera Discovery queuing on AWS with Privacera Platform
- Privacera Platform on AWS overview
- Privacera Platform Portal overview
- AWS Identity and Access Management (IAM) on Privacera Platform
- Set up AWS S3 MinIO on Privacera Platform
- Integrate Privacera services in separate VPC
- Install Docker and Docker compose (AWS-Linux-RHEL) on Privacera Platform
- Configure EFS for Kubernetes on AWS for Privacera Platform
- Multiple AWS accounts support in DataServer
- Multiple AWS S3 IAM role support in Data Access Server
- Enable AWS CLI on Privacera Platform
- Configure S3 for real-time scanning on Privacera Platform
- Multiple AWS account support in Dataserver using Databricks on Privacera Platform
- Enable AWS CLI
- AWS S3 Commands - Ranger Permission Mapping
- Plan for Privacera Platform
- How to get support
- Access Management
- Get started with Access Management
- Users, groups, and roles
- UserSync
- Add UserSync connectors
- UserSync connector properties on Privacera Platform
- UserSync connector fields on PrivaceraCloud
- UserSync system properties on Privacera Platform
- About Ranger UserSync
- Customize user details on sync
- UserSync integrations
- SCIM Server User-Provisioning on PrivaceraCloud
- Azure Active Directory UserSync integration on Privacera Platform
- LDAP UserSync integration on Privacera Platform
- Policies
- How policies are evaluated
- General approach to validating policy
- Resource policies
- About service groups on PrivaceraCloud
- Service/Service group global actions
- Create resource policies: general steps
- About secure database views
- PolicySync design on Privacera Platform
- PolicySync design and configuration on Privacera Platform
- Relationships: policy repository, connector, and datasource
- PolicySync topologies
- Connector instance directory/file structure
- Required basic PolicySync topology: always at least one connector instance
- Optional topology: multiple connector instances for Kubernetes pods and Docker containers
- Recommended PolicySync topology: individual policy repositories for individual connectors
- Optional encryption of property values
- Migration to PolicySync v2 on Privacera Platform 7.2
- Databricks SQL connector for PolicySync on Privacera Platform
- Databricks SQL connector properties for PolicySync on Privacera Platform
- Dremio connector for PolicySync on Privacera Platform
- Dremio connector properties for PolicySync on Privacera Platform
- Configure AWS Lake Formation on Privacera Platform
- Get started with AWS Lake Formation
- Create IAM Role for AWS Lake Formation connector for Platform
- Configure AWS Lake Formation connector on Privacera Platform
- Create AWS Lake Formation connectors for multiple AWS regions for Platform
- Set up audit logs for AWS Lake Formation on Platform
- How to validate an AWS Lake Formation connector
- AWS Lake Formation FAQs for Pull mode
- AWS Lake Formation FAQs for Push mode
- AWS Lake Formation Connector Properties
- Google BigQuery connector for PolicySync on Privacera Platform
- BigQuery connector properties for PolicySync on Privacera Platform
- Microsoft SQL Server connector for PolicySync on Privacera Platform
- Microsoft SQL connector properties for PolicySync on Privacera Platform
- PostgreSQL connector for PolicySync on Privacera Platform
- PostgreSQL connector properties for PolicySync on Privacera Platform
- Power BI connector for PolicySync
- Power BI connector properties for PolicySync on Privacera Platform
- Redshift and Redshift Spectrum connector for PolicySync
- Redshift and Redshift Spectrum connector properties for PolicySync on Privacera Platform
- Snowflake connector for PolicySync on Privacera Platform
- Snowflake connector properties for PolicySync on Privacera Platform
- PolicySync design and configuration on Privacera Platform
- Configure resource policies
- Configure ADLS resource policies
- Configure AWS S3 resource policies
- Configure Athena resource policies
- Configure Databricks resource policies
- Configure DynamoDB resource policies
- Configure Files resource policies
- Configure GBQ resource policies
- Configure GCS resource policies
- Configure Glue resource policies
- Configure Hive resource policy
- Configure Lambda resource policies
- Configure Kafka resource policies
- Configure Kinesis resource policies
- Configure MSSQL resource policies
- Configure PowerBI resource policies
- Configure Presto resource policies
- Configure Postgres resource policies
- Configure Redshift resource policies
- Configure Snowflake resource policies
- Configure Policy with Attribute-Based Access Control (ABAC) on PrivaceraCloud
- Attribute-based access control (ABAC) macros
- Configure access policies for AWS services on Privacera Platform
- Configure policy with conditional masking on Privacera Platform
- Create access policies for Databricks on Privacera Platform
- Order of precedence in PolicySync filter
- Example: Manage access to Databricks SQL with Privacera
- Service/service group global actions on the Resource Policies page
- Tag policies
- Policy configuration settings
- Security zones
- Manage Databricks policies on Privacera Platform
- Databricks Unity Catalog row filtering and native masking on PrivaceraCloud
- Use a custom policy repository with Databricks
- Configure policy with Attribute-Based Access Control on Privacera Platform
- Create Databricks policies on Privacera Platform
- Example: Create basic policies for table access
- Examples of access control via programming
- Secure S3 via Boto3 in Databricks notebook
- Other Boto3/Pandas examples to secure S3 in Databricks notebook with PrivaceraCloud
- Secure Azure file via Azure SDK in Databricks notebook
- Control access to S3 buckets with AWS Lambda function on PrivaceraCloud or Privacera Platform
- Service Explorer
- Audits
- Required permissions to view audit logs on Privacera Platform
- About PolicySync access audit records and policy ID on Privacera Platform
- View audit logs
- View PEG API audit logs
- Generate audit logs using GCS lineage
- Configure Audit Access Settings on PrivaceraCloud
- Configure AWS RDS PostgreSQL instance for access audits
- Accessing PostgreSQL Audits in Azure
- Accessing PostgreSQL Audits in GCP
- Configure Microsoft SQL server for database synapse audits
- Examples of audit search
- Reports
- Discovery
- Get started with Discovery
- Planning for Privacera Discovery
- Install and Enable Privacera Discovery
- Set up Discovery on Privacera Platform
- Set up Discovery on AWS for Privacera Platform
- Set up Discovery on Azure for Privacera Platform
- Set up Discovery on Databricks for Privacera Platform
- Set up Discovery on GCP for Privacera Platform
- Enable Pkafka for real-time audits in Discovery on Privacera Platform
- Customize topic and table names on Privacera Platform
- Enable Discovery on PrivaceraCloud
- Scan resources
- Supported file formats for Discovery Scans
- Privacera Discovery scan targets
- Processing order of scan techniques
- Register data sources on Privacera Platform
- Data sources on Privacera Platform
- Add a system data source on Privacera Platform
- Add a resource data source on Privacera Platform
- Add AWS S3 application data source on Privacera Platform
- Add Azure ADLS data source on Privacera Platform
- Add Databricks Spark SQL data source on Privacera Platform
- Add Google BigQuery (GBQ) data source on Privacera Platform
- Add Google Pub-Sub data source on Privacera Platform
- Add Google Cloud Storage data source on Privacera Platform
- Set up cross-project scanning on Privacera Platform
- Google Pub-Sub Topic message scan on Privacera Platform
- Add JDBC-based systems as data sources for Discovery on Privacera Platform
- Add and scan resources in a data source
- Start a scan
- Start offline and realtime scans
- Scan Status overview
- Cancel a scan
- Trailing forward slash (/) in data source URLs/URIs
- Configure Discovery scans
- Tags
- Add Tags
- Import Tags
- Add, edit, or delete Tag attributes
- Edit Tag descriptions
- Delete Tags
- Export Tags
- Search for Tags
- Fetch AWS S3 Tags
- Propagate Privacera Discovery Tags to Ranger
- TagSync using Apache Ranger on Privacera Platform
- Add Tags with Ranger REST API
- Dictionaries
- Types of dictionaries
- Dictionary Keys
- Manage dictionaries
- Default dictionaries
- Add a dictionary
- Import a dictionary
- Upload a dictionary
- Enable or disable a dictionary
- Include a Dictionary
- Exclude a dictionary
- Add keywords to an included dictionary
- Edit a dictionary
- Copy a dictionary
- Export a dictionary
- Search for a dictionary
- Test dictionaries
- Dictionary tour
- Patterns
- Models
- Rules
- Configure scans
- Scan setup
- Adjust default scan depth on Privacera Platform
- Classifications using random sampling on PrivaceraCloud
- Enable Discovery Realtime Scanning Using IAM Role on PrivaceraCloud
- Enable Real-time Scanning on ADLS Gen 2 on PrivaceraCloud
- Enable Real-time Scanning of S3 Buckets on PrivaceraCloud
- Connect ADLS Gen2 Application for Data Discovery on PrivaceraCloud
- Include and exclude resources in GCS
- Configure real-time scan across projects in GCP
- Enable offline scanning on ADLS Gen 2 on PrivaceraCloud
- Include and exclude datasets and tables in GBQ
- Google Sink to Pub/Sub
- Tags
- Data zones on Privacera Platform
- Planning data zones on Privacera Platform
- Data Zone Dashboard
- Enable data zones on Privacera Platform
- Add resources to a data zone on Privacera Platform
- Create a data zone on Privacera Platform
- Edit data zones on Privacera Platform
- Delete data zones on Privacera Platform
- Import data zones on Privacera Platform
- Export data zones on Privacera Platform
- Disable data zones on Privacera Platform
- Create tags for data zones on Privacera Platform
- Data zone movement
- Data zones overview
- Configure data zone policies on Privacera Platform
- Encryption for Right to Privacy (RTP) on Privacera Platform
- Workflow policy use case example
- Define Discovery policies on Privacera Platform
- Disallowed Groups policy
- Disallowed Movement Policy
- Compliance Workflow policies on Privacera Platform
- De-identification policy
- Disallowed Subnets Policy
- Disallowed Subnet Range Policy
- Disallowed Tags policy
- Expunge policy
- Disallowed Users Policy
- Right to Privacy policy
- Workflow Expunge Policy
- Workflow policy
- View scanned resources
- Discovery reports and dashboards
- Alerts Dashboard
- Discovery Dashboard
- Built-in reports
- Offline reports
- Saved Reports
- Reports with the Query Builder
- Discovery Health Check
- Set custom Discovery properties on Privacera Platform
- Get started with Discovery
- Encryption
- Get started with Encryption
- The encryption process
- Encryption architecture and UDF flow
- Install Encryption on Privacera Platform
- Encryption on Privacera Platform deployment specifications
- Configure Ranger KMS with Azure Key Vault on Privacera Platform
- Enable telemetry data collection on Privacera Platform
- AWS S3 bucket encryption on Privacera Platform
- Set up PEG and Cryptography with Ranger KMS on Privacera Platform
- Provide user access to Ranger KMS
- PEG custom properties
- Enable Encryption on PrivaceraCloud
- Encryption keys
- Master Key
- Key Encryption Key (KEK)
- Data Encryption Key (DEK)
- Encrypted Data Encryption Key (EDEK)
- Rollover encryption keys on Privacera Platform
- Connect to Azure Key Vault with a client ID and certificate on Privacera Platform
- Connect to Azure Key Vault with Client ID and Client Secret on Privacera Platform
- Migrate Ranger KMS master key on Privacera Platform
- Ranger KMS with Azure Key Vault on Privacera Platform
- Schemes
- Encryption schemes
- Presentation schemes
- Masking schemes
- Scheme policies
- Formats
- Algorithms
- Scopes
- Deprecated encryption schemes
- About LITERAL
- User-defined functions (UDFs)
- Encryption UDFs for Apache Spark on PrivaceraCloud
- Hive UDFs for encryption on Privacera Platform
- StreamSets Data Collector (SDC) and Privacera Encryption on Privacera Platform
- Trino UDFs for encryption and masking on Privacera Platform
- Privacera Encryption UDFs for Trino
- Prerequisites for installing Privacera crypto plugin for Trino
- Install the Privacera crypto plugin for Trino using Privacera Manager
- privacera.unprotect with optional presentation scheme
- Example queries to verify Privacera-supplied UDFs
- Privacera Encryption UDFs for Starburst Enterprise Trino on PrivaceraCloud
- Syntax of Privacera Encryption UDFs for Trino
- Prerequisites for installing Privacera Crypto plug-in for Trino
- Download and install Privacera Crypto jar
- Set variables in Trino etc/crypto.properties
- Restart Trino to register the Privacera encryption and masking UDFs for Trino
- Example queries to verify Privacera-supplied UDFs
- Privacera Encryption UDF for masking in Trino on PrivaceraCloud
- Databricks UDFs for Encryption
- Create Privacera protect UDF
- Create Privacera unprotect UDF
- Run sample queries in Databricks to verify
- Create a custom path to the crypto properties file in Databricks
- Create and run Databricks UDF for masking
- Privacera Encryption UDF for masking in Databricks on PrivaceraCloud
- Set up Databricks encryption and masking
- Get started with Encryption
- API
- REST API Documentation for Privacera Platform
- Access Control using APIs on Privacera Platform
- UserSync REST endpoints on Privacera Platform
- REST API endpoints for working with tags on Privacera Platform
- PEG REST API on Privacera Platform
- API authentication methods on Privacera Platform
- Anatomy of the /protect API endpoint on Privacera Platform
- Construct the datalist for protect
- Deconstruct the datalist for unprotect
- Example of data transformation with /unprotect and presentation scheme
- Example PEG API endpoints
- /unprotect with masking scheme
- REST API response partial success on bulk operations
- Audit details for PEG REST API accesses
- REST API reference
- Make calls on behalf of another user on Privacera Platform
- Troubleshoot REST API Issues on Privacera Platform
- Encryption API date input formats
- Supported day-first date input formats
- Supported month-first date input formats
- Supported year-first date input formats
- Examples of supported date input formats
- Supported date ranges
- Day-first formats
- Date input formats and ranges
- Legend for date input formats
- Year-first formats
- Supported date range
- Month-first formats
- Examples of allowable date input formats
- PEG REST API on PrivaceraCloud
- REST API prerequisites
- Anatomy of a PEG API endpoint on PrivaceraCloud
- About constructing the datalist for /protect
- About deconstructing the response from /unprotect
- Example of data transformation with /unprotect and presentation scheme
- Example PEG REST API endpoints for PrivaceraCloud
- Audit details for PEG REST API accesses
- Make calls on behalf of another user on PrivaceraCloud
- Apache Ranger API on PrivaceraCloud
- API Key on PrivaceraCloud
- Administration and Releases
- Privacera Platform administration
- Portal user management
- Change password for Privacera Platform services
- Generate tokens on Privacera Platform
- Validations on Privacera Platform
- Health check on Privacera Platform
- Event notifications for system health
- Export or import a configuration file on Privacera Platform
- Logs on Privacera Platform
- Increase Privacera Platform portal timeout for large requests
- Platform Support Policy and End-of-Support Dates
- Enable Grafana metrics on Privacera Platform
- Enable Azure CLI on Privacera Platform
- Migrate from Databricks Spark to Apache Spark
- Migrate from PrestoSQL to Trino
- Ranger Admin properties on Privacera Platform
- Basic steps for blue/green upgrade of Privacera Platform
- Event notifications for system health
- Metrics
- Get ADLS properties
- PrivaceraCloud administration
- About the Account page on PrivaceraCloud
- Statistics on PrivaceraCloud
- PrivaceraCloud dashboard
- Event notifications for system health
- Metrics
- Usage statistics on PrivaceraCloud
- Update PrivaceraCloud account info
- Manage PrivaceraCloud accounts
- Create and manage IP addresses on PrivaceraCloud
- Scripts for AWS CLI or Azure CLI for managing connected applications
- Add UserInfo in S3 Requests sent via Data Access Server on PrivaceraCloud
- Previews
- PrivaceraCloud previews
- Preview: Scan Electronic Health Records with NER Model
- Preview: File Explorer for GCS
- Preview: File Explorer for Azure
- Preview: OneLogin setup for SAML-SSO
- Preview: File Explorer for AWS S3
- Preview: PingFederate UserSync
- Preview: Azure Active Directory SCIM Server UserSync
- Preview: OneLogin UserSync
- Privacera UserSync Configuration
- Privacera Platform previews
- Preview: AlloyDB connector for PolicySync
- Configure AWS Lake Formation on Privacera Platform
- Get started with AWS Lake Formation
- Create IAM Role for AWS Lake Formation connector for Platform
- Configure AWS Lake Formation connector on Privacera Platform
- Create AWS Lake Formation connectors for multiple AWS regions for Platform
- Set up audit logs for AWS Lake Formation on Platform
- How to validate an AWS Lake Formation connector
- AWS Lake Formation FAQs for Pull mode
- AWS Lake Formation FAQs for Push mode
- AWS Lake Formation Connector Properties
- PrivaceraCloud previews
- Release documentation
- Previous versions of Privacera Platform documentation
- PrivaceraCloud Release Notes
- Privacera Platform Release Notes
- Privacera documentation changelog
- For PrivaceraCloud 7.9 release, 2023-05-10
- For Privacera Platform 7.8 release, 2023-05-09
- For PrivaceraCloud 7.8 release, 2023-03-12
- For PrivaceraCloud 7.7 release, 2023-03-14
- For PrivaceraCloud 7.6 release, 2023-02-13
- For PrivaceraCloud 7.5 release, 2023-02-07
- For Privacera Platform 7.5 release 2023-02-07
- Privacera system security initiatives
- Privacera Platform administration
EMR user guide for Privacera Platform
Create bucket
Create a bucket, ${SECURE_BUCKET_NAME}, that you need to protect. Download the sample data from the following link and add it to your bucket at location s3://${SECURE_BUCKET_NAME}/sample_data/customer_data:
wget https://privacera-demo.s3.amazonaws.com/data/uploads/customer_data_clear/customer_data_without_header.csv
Make sure the cluster does not have direct access to the ${SECURE_BUCKET_NAME} bucket. To verify, run the following command:
ssh -i ${KEY_FILE} hadoop@${EMR_PUBLIC_DNS} aws s3 ls s3://${SECURE_BUCKET_NAME}
The listing should fail with an error like:
Fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
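If the listing still succeeds, one way to block direct access is a bucket policy that explicitly denies the cluster's EC2 instance profile role. The sketch below is illustrative only: the account ID (111122223333) and role name (EMR_EC2_DefaultRole) are placeholders for your own values, and you should confirm the deny does not affect the role that Privacera's data access path uses.
# Placeholders: replace the account ID and role name with your EMR EC2 instance profile.
cat > /tmp/deny-direct-emr.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyDirectEmrAccess",
    "Effect": "Deny",
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/EMR_EC2_DefaultRole"},
    "Action": "s3:*",
    "Resource": [
      "arn:aws:s3:::${SECURE_BUCKET_NAME}",
      "arn:aws:s3:::${SECURE_BUCKET_NAME}/*"
    ]
  }]
}
EOF
aws s3api put-bucket-policy --bucket "${SECURE_BUCKET_NAME}" --policy file:///tmp/deny-direct-emr.json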
Hive
Run the following in beeline, using an admin user who has permission on the URL in Ranger and also has permission to create tables and databases:
beeline -u "jdbc:hive2://`hostname -f`:10000/default;principal=hive/`hostname -f`@${REALM}"
Create the table using this user, by running the following command in Hive.
create database if not exists customer;
use customer;
CREATE EXTERNAL TABLE if not exists `customer_data_s3`(
  `id` string,
  `global_id` string,
  `name` string,
  `ssn` string,
  `email_address` string,
  `address` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3a://${SECURE_BUCKET_NAME}/sample_data/customer_data';
Exit from beeline.
Switch to ${TEST_USER}, run kinit, and try the sample policy:
beeline -u "jdbc:hive2://`hostname -f`:10000/default;principal=hive/`hostname -f`@${REALM}"
#Check the Ranger audit for the hive service
Select * from customer.customer_data_s3 LIMIT 10;
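For reference, the switch-and-kinit sequence might look like the following; this assumes ${TEST_USER} exists both as a local account on the node and as a Kerberos principal in ${REALM} (both assumptions about your environment):
sudo su - ${TEST_USER}
kinit ${TEST_USER}@${REALM}   # prompts for the principal's password
beeline -u "jdbc:hive2://`hostname -f`:10000/default;principal=hive/`hostname -f`@${REALM}"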
Data_Admin access
Prerequisites
To create a view using the Hive Plug-In, you need the DATA_ADMIN permission in Ranger. The source table on which you are going to create a view requires a DATA_ADMIN Ranger policy.
Use case
This use case starts with an employee_db database containing two tables with the following data:
#Requires create privilege on the database enabled by default;
create database if not exists employee_db;
Create two tables.
#Requires create privilege on the table level;
create table if not exists employee_db.employee_data(id int, userid string, country string);
create table if not exists employee_db.country_region(country string, region string);
Insert test data:
#Requires update privilege on the table level;
insert into employee_db.country_region values ('US','NA'), ('CA','NA'), ('UK','UK'), ('DE','EU'), ('FR','EU');
insert into employee_db.employee_data values (1,'james','US'), (2,'john','US'), (3,'mark','UK'), (4,'sally-sales','UK'), (5,'sally','DE'), (6,'emily','DE');
Create the following policy:
Policy Type: Access
Name: Create View Access
Database: employee_db
Table: employee_data, country_region
Column: *
Select User: emily
Permissions: select, update, Create
Run queries on the data:
SELECT * FROM employee_db.country_region; #Requires select privilege on the column level;
SELECT * FROM employee_db.employee_data; #Requires select privilege on the column level;
Create the following view:
create view employee_db.employee_region(userid, region) as select e.userid, cr.region from employee_db.employee_data e, employee_db.country_region cr where e.country = cr.country;
Note
Granting Data_admin privileges on the resource implicitly grants Select privilege on the same resource as well.
Run the same queries as in Step 3. You will see an error message like the following:
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [emily] does not have [DATA_ADMIN] privilege on [employee_db/employee_data](state=42000,code=40000)
Now create the following policy:
Policy Type: Access
Name: Create View Access
Database: employee_db
Table: employee_region
Column: *
Select Group: group_privacera_dev
Permissions: select, Create
Execute the queries from Step 3. They should now succeed.
Alter view
create view if not exists employee_db.employee_region(userid,region) as select e.userid, cr.region from employee_db.employee_data e, employee_db.country_region cr where e.country = cr.country;
#Requires Create permission on the view;
ALTER VIEW employee_db.employee_region AS select e.userid, cr.region from employee_db.employee_data e, employee_db.country_region cr where e.country = cr.country;
Rename view
#Requires Alter permission on the view;
ALTER VIEW employee_db.employee_region RENAME to employee_db.employee_region_renamed;
Drop view
#Requires Drop permission on the view;
DROP VIEW employee_db.employee_region_renamed;
Row level filter
SELECT * FROM employee_db.employee_region;
Column masking
SELECT * FROM employee_db.employee_region;
PrestoDB
SSH to EMR on master node.
Start the Presto shell by entering one of the following commands (presto, spark-thrift, and hive all use the same metastore):
presto-cli --catalog hive
/usr/lib/presto/bin/presto-cli-0.210-executable --server localhost:8889 --catalog hive --schema default
Attempt the following use case as a test.
CREATE SCHEMA customer WITH (location='s3a://${SECURE_BUCKETNAME}/presto_data/customer/');
USE customer;
CREATE TABLE cust_data(
  EMP_SSN varchar,
  CC varchar,
  FIRST_NAME varchar,
  LAST_NAME varchar,
  ADDRESS varchar,
  ZIPCODE varchar,
  EMAIL varchar,
  US_PHONE_FORMATTED varchar);
INSERT INTO cust_data values ('12345','6789','Will','Smith','US','400098','ws@gmail.com','010-564-333');
SELECT * FROM cust_data;
Full Table Access.
#Add policy in Ranger to access everything in the table
SELECT * FROM cust_data;
Restricted Column Access.
#Column-level permission on the table. If the user doesn't have permission to the "first_name" column:
#Will be denied in audit
select first_name from cust_data;
#Will be allowed in audit
select last_name, address, zipcode, email from cust_data;
Enable additional operations on the Hive catalog by updating hive.properties. By default, PrestoDB blocks these operations. For more information, see Hive Security Configuration.
Edit hive.properties:
sudo vi /etc/presto/conf/catalog/hive.properties
Add the following properties:
connector.name=hive-hadoop2
hive.allow-drop-table=true
hive.allow-rename-table=true
hive.allow-add-column=true
hive.allow-rename-column=true
hive.allow-drop-column=true
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
hive.s3-file-system-type=EMRFS
hive.hdfs.impersonation.enabled=false
Note
The hive.properties file needs to be updated on all the EMR nodes.
Restart Presto:
sudo systemctl restart presto-server
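There is no single built-in command to push the file to every node; one possible approach is a small loop run from the master node. This sketch assumes passwordless SSH to the workers and a worker_nodes.txt file (hypothetical, created by you) listing one worker private DNS name per line:
# worker_nodes.txt: one worker DNS name per line (you create this file).
while read -r node; do
  scp -i "${KEY_FILE}" /etc/presto/conf/catalog/hive.properties "hadoop@${node}:/tmp/hive.properties"
  ssh -i "${KEY_FILE}" "hadoop@${node}" \
    "sudo mv /tmp/hive.properties /etc/presto/conf/catalog/hive.properties && sudo systemctl restart presto-server"
done < worker_nodes.txt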
Alternatively, you can include the properties when creating the EMR cluster itself in the CloudFormation template. Below is a sample JSON classification:
{"Classification":"presto-connector-hive","ConfigurationProperties":{"hive.metastore":"glue","hive.allow-drop-table":"true","hive.allow-add-column":"true","hive.allow-rename-column":"true","connector.name":"hive-hadoop2","hive.config.resources":"/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml","hive.s3-file-system-type":"EMRFS","hive.hdfs.impersonation.enabled":"false","hive.allow-drop-column":"true","hive.allow-rename-table":"true"}}
PrestoSQL
Start PrestoSQL shell.
presto-cli --catalog hive
Create the schema with an admin/superuser.
CREATE SCHEMA customer WITH (location='s3a://${SECURE_BUCKETNAME}/presto_data/schema/customer');
USE customer;
Create the table using admin/superuser.
USE customer;
CREATE TABLE customer_data(
  id varchar,
  name varchar,
  ssn varchar,
  email_address varchar,
  address varchar)
WITH (
  format='textfile',
  external_location='s3a://${SECURE_BUCKETNAME}/presto_data/table/customer_data'
);
Exit the Presto CLI, switch to ${TEST_USER}, run kinit, and try a sample policy:
presto-cli --catalog hive
use customer;
SELECT * FROM customer_data LIMIT 10;
Data_Admin Access
Prerequisites
To create a view using the PrestoSQL Plug-In, you need the DATA_ADMIN permission in Ranger. The source table on which you are going to create a view requires a DATA_ADMIN Ranger policy.
Use Case
Create a database called employee_db with two tables containing the following data:
#Requires create privilege on the database enabled by default;
create schema if not exists employee_db;
In Privacera Portal, select Access Management, then from the list of resource policy groups select privacera_hive, which is under SQL.
Add a new policy:
Policy Type: Access
Policy Name: Employee Schema Create Permission
Database: employee_db
Table: *
Column: *
Select User: presto
Permissions: Create
Click SAVE.
Create two tables.
#Requires create privilege on the table level;
CREATE TABLE IF NOT EXISTS employee_db.employee_data(id int, userid string, country string);
CREATE TABLE IF NOT EXISTS employee_db.country_region(country string, region string);
In Privacera Portal, create the following policy:
Policy Type: Access
Policy Name: Employee Table Create Permission
Database: employee_db
Table: employee_data, country_region
Column: *
Select User: presto
Permissions: Create
Insert test data.
#Requires update privilege on the table level;
insert into employee_db.country_region values ('US','NA'), ('CA','NA'), ('UK','UK'), ('DE','EU'), ('FR','EU');
insert into employee_db.employee_data values (1,'james','US'), (2,'john','US'), (3,'mark','UK'), (4,'sally-sales','UK'), (5,'sally','DE'), (6,'emily','DE');
In Privacera Portal, create the following policy:
Policy Type: Access
Policy Name: Employees Table Update Permission
Database: employee_db
Table: employee_data, country_region
Column: *
Select User: presto
Permissions: update, Create
Create a view over the two tables created previously; you will get an error:
Query 20210223_051227_00005_nyxtw failed: Access Denied: Cannot create view tbl_view_5
You need the Create View permission. Create the following policy:
Policy Type: Access
Policy Name: Employees Create View Permission
Database: employee_db
Table: tbl_view_1
Select User: presto
Permissions: Create
After granting Create View permission, the query will result in the following error message:
Query 20210223_050930_00004_nyxtw failed: Access Denied: User [emily] does not have [DATA_ADMIN] privilege on [hive/employee_db/employee_data]
You need to grant the Data_Admin permission for both tables. Create the following policy:
Policy Type: Access
Policy Name: Employees Create View Permission Data_admin
Database: employee_db
Table: employee_data, country_region
Column: *
Select User: presto
Permissions: update, Create, Data_admin
Note
Granting Data_admin privileges on the resource implicitly grants Select privilege on the same resource as well.
Run the query again. It should now succeed.
Alter view
Create view
presto:customer> create view tbl_view_1 as SELECT * FROM tbl_1;
CREATE VIEW
presto:customer> SELECT * FROM tbl_view_1;
 c0 |   c1   |    c2     |          c3           |           c4
----+--------+-----------+-----------------------+------------------------
 2  | James  | 892821225 | james@walt.com        | 4578 Extension xxx
 1  | Dennis | 619821225 | thomasashley@walt.com | 9478 Anthony Extension
 3  | Sally  | 092341225 | sally@walt.com        | 5678 Extension xyxx
(3 rows)

Query 20210303_142252_00006_g76nu, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
1.86 [3 rows, 169B] [1 rows/s, 91B/s]
Alter view
presto:customer> CREATE OR REPLACE VIEW tbl_view_1 as SELECT * FROM tbl_3;
CREATE VIEW
presto:customer> SELECT * FROM tbl_view_1;
 slno | name  | mobile |  email  | address
------+-------+--------+---------+---------
 1    | emily | 1234   | s@s.com | in
(1 row)

Query 20210303_142341_00009_g76nu, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.91 [1 rows, 0B] [1 rows/s, 0B/s]
Rename view
presto:customer> alter view tbl_view_1 rename to tbl_view_2;
RENAME VIEW
Drop view
presto:customer> drop view tbl_view_1;
DROP VIEW
Row level filter
presto:employee_db> SELECT * FROM tbl_1;
 id |   userid    | country
----+-------------+---------
 1  | james       | US
 2  | john        | US
 3  | mark        | UK
 4  | sally-sales | UK
 5  | sally       | DE
 6  | emily       | DE
(6 rows)

Query 20210309_060602_00022_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
4.11 [6 rows, 0B] [1 rows/s, 0B/s]
In Privacera Portal, set the Policy Detail:
Policy Type: Row Level Filter
Policy Name: Employee Row Level Filter
Hive Database: employee_db
Hive Table: employee_data, country_region, tbl_1
Column: *
Under Row Level Conditions:
Select User: presto
Permissions: select
Row Level Filter: country='US'
presto:employee_db> SELECT * FROM tbl_1;
 id | userid | country
----+--------+---------
 1  | james  | US
 2  | john   | US
(2 rows)

Query 20210309_061202_00024_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.45 [6 rows, 0B] [13 rows/s, 0B/s]
Column masking
presto:employee_db> SELECT * FROM tbl_1;
 id |   userid    | country
----+-------------+---------
 1  | james       | US
 2  | john        | US
 3  | mark        | UK
 4  | sally-sales | UK
 5  | sally       | DE
 6  | emily       | DE
(6 rows)

Query 20210309_062000_00027_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.30 [6 rows, 0B] [20 rows/s, 0B/s]
In Privacera Portal, set the Policy Detail:
Policy Type: Masking
Policy Name: Employee Column Level Masking
Hive Database: employee_db
Hive Table: tbl_1
Hive Column: country
Under Masking Conditions:
Select User: presto
Permissions: select
Select Masking Option: Nullify
presto:employee_db> SELECT * FROM tbl_1;
 id |   userid    | country
----+-------------+---------
 1  | james       | NULL
 2  | john        | NULL
 3  | mark        | NULL
 4  | sally-sales | NULL
 5  | sally       | NULL
 6  | emily       | NULL
(6 rows)

Query 20210309_061856_00026_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.32 [6 rows, 0B] [18 rows/s, 0B/s]
Access views in AWS Athena
Use the following steps to provide access to views created in AWS Athena so that you can query them.
Copy the Hive catalog properties (or create a symlink) as awsdatacatalog.properties in the /etc/presto/conf/catalog folder:
ln -s /etc/presto/conf/catalog/hive.properties /etc/presto/conf/catalog/awsdatacatalog.properties
Restart the Presto server.
sudo systemctl restart presto-server
In Access Management > Resource Policies, update the privacera_hive default policy:
Edit the all - database, table policy.
In Select User, add 'Presto' from the dropdown as the default view owner, and save.
(Optional) To change the default view owner from 'Presto' to any other owner such as 'Hadoop':
In the access-control.properties file, set the owner in the ranger.policy.authorization.viewowner.default variable:
vi /usr/lib/presto/etc/access-control.properties
ranger.policy.authorization.viewowner.default=<view-owner>
Restart the Presto server.
sudo systemctl restart presto-server
Update the owner in the all - database, table policy of the privacera_hive service.
Configure Hive policy authorization
When the Privacera Plugin is deployed in your PrestoSQL server, the HIVE_POLICY_AUTHZ_ENABLED property is set to true by default, allowing you to configure Hive policy authorization.
You can enable or disable this authorization in your PrestoSQL server. To configure it, do the following:
Go to the Ranger PrestoSQL config folder.
cd /opt/privacera/plugin/ranger-x-x-x-x-presto-plugin
Run the following command:
vi install.properties
Add or edit the following property. By default, the value is set to true.
HIVE_POLICY_AUTHZ_ENABLED=true
Run the following command:
./enable-presto-plugin.sh
Restart the PrestoSQL server.
sudo systemctl restart presto-server
Trino
Start Trino shell.
trino-cli --catalog hive
Create the schema using admin/superuser.
CREATE SCHEMA customer WITH (location = 's3a://${SECURE_BUCKETNAME}/trino_data/schema/customer');
use customer;
Create the table using admin/superuser.
use customer;
CREATE TABLE customer_data(
  id varchar,
  name varchar,
  ssn varchar,
  email_address varchar,
  address varchar)
WITH (
  format = 'textfile',
  external_location = 's3a://${SECURE_BUCKETNAME}/trino_data/table/customer_data'
);
Exit the Trino CLI, switch to ${TEST_USER}, run kinit, and try a sample policy:
trino-cli --catalog hive
use customer;
SELECT * FROM customer_data LIMIT 10;
Data_Admin access
Prerequisites
You need the DATA_ADMIN permission in Ranger. The source table requires a DATA_ADMIN Ranger policy.
Use case
You have the employee_db database with two tables containing the following data:
#Requires create privilege on the database enabled by default;
create schema if not exists employee_db;
In Privacera Portal, select Access Management, then from the list of resource policy groups select privacera_hive, which is under SQL. Then click +ADD NEW POLICY.
For the Policy Detail:
Policy Type: Access
Policy Name: Employees Schema Create Permission
Database: employee_db
Table: *
Column: *
Under Allow Conditions:
Select User: trino
Permissions: Create
Click SAVE.
Create two tables.
#Requires create privilege on the table level;
CREATE TABLE IF NOT EXISTS employee_db.employee_data(id int, userid string, country string);
CREATE TABLE IF NOT EXISTS employee_db.country_region(country string, region string);
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Table Create Permission
Database: employee_db
Table: employee_data, country_region
Column: *
Under Allow Conditions:
Select User: trino
Permissions: Create
Insert test data.
#Requires update privilege on the table level;
insert into employee_db.country_region values ('US','NA'), ('CA','NA'), ('UK','UK'), ('DE','EU'), ('FR','EU');
insert into employee_db.employee_data values (1,'james','US'), (2,'john','US'), (3,'mark','UK'), (4,'sally-sales','UK'), (5,'sally','DE'), (6,'emily','DE');
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Table Insert Permission
Database: employee_db
Table: employee_data, country_region
Column: *
Under Allow Conditions:
Select User: trino
Permissions: update, Create
Create a view over the two tables created above. You will get an error like the following:
Query 20210223_051227_00005_nyxtw failed: Access Denied: Cannot create view tbl_view_5
You need the Create View permission. In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Create View Permission
Database: employee_db
Table: tbl_view_1
Column: *
Under Allow Conditions:
Select User: trino
Permissions: Create
After granting Create View permission, the query will return the following error message:
Query 20210223_050930_00004_nyxtw failed: Access Denied: User [emily] does not have [DATA_ADMIN] privilege on [hive/employee_db/employee_data]
You need to grant the Data_Admin permission for both tables as shown below, and then execute the create view query again. In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Create View Permission - Data_admin
Database: employee_db
Table: employee_data, country_region
Column: *
Under Allow Conditions:
Select User: trino
Permissions: update, Create, Data_admin
Note
Granting Data_admin privileges on the resource implicitly grants Select privilege on the same resource as well.
Alter view
Create view
trino:customer> create view tbl_view_1 as SELECT * FROM tbl_1;
CREATE VIEW
trino:customer> SELECT * FROM tbl_view_1;
 c0 |   c1   |    c2     |          c3           |           c4
----+--------+-----------+-----------------------+------------------------
 2  | James  | 892821225 | james@walt.com        | 4578 Extension xxx
 1  | Dennis | 619821225 | thomasashley@walt.com | 9478 Anthony Extension
 3  | Sally  | 092341225 | sally@walt.com        | 5678 Extension xyxx
(3 rows)

Query 20210303_142252_00006_g76nu, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
1.86 [3 rows, 169B] [1 rows/s, 91B/s]
Alter view
trino:customer> CREATE OR REPLACE VIEW tbl_view_1 as SELECT * FROM tbl_3;
CREATE VIEW
trino:customer> SELECT * FROM tbl_view_1;
 slno | name  | mobile |  email  | address
------+-------+--------+---------+---------
 1    | emily | 1234   | s@s.com | in
(1 row)

Query 20210303_142341_00009_g76nu, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.91 [1 rows, 0B] [1 rows/s, 0B/s]
Rename view
trino:customer> alter view tbl_view_1 rename to tbl_view_2;
RENAME VIEW
Drop view
trino:customer> drop view tbl_view_1;
DROP VIEW
Row level filter
trino:employee_db> SELECT * FROM tbl_1;
 id |   userid    | country
----+-------------+---------
 1  | james       | US
 2  | john        | US
 3  | mark        | UK
 4  | sally-sales | UK
 5  | sally       | DE
 6  | emily       | DE
(6 rows)

Query 20210309_060602_00022_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
4.11 [6 rows, 0B] [1 rows/s, 0B/s]
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Row Level Filter
Policy Name: Employee Row Level Filter by Country
Hive Database: employee_db
Hive Table: tbl_view_1
Under Row Level Conditions:
Select User: trino
Permissions: select
Row Level Filter: country='US'
trino:employee_db> SELECT * FROM tbl_1;
 id | userid | country
----+--------+---------
 1  | james  | US
 2  | john   | US
(2 rows)

Query 20210309_061202_00024_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.45 [6 rows, 0B] [13 rows/s, 0B/s]
Column masking
trino:employee_db> SELECT * FROM tbl_1;
 id |   userid    | country
----+-------------+---------
 1  | james       | US
 2  | john        | US
 3  | mark        | UK
 4  | sally-sales | UK
 5  | sally       | DE
 6  | emily       | DE
(6 rows)

Query 20210309_062000_00027_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.30 [6 rows, 0B] [20 rows/s, 0B/s]
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Masking
Policy Name: Employees Columns Masking Country
Hive Database: employee_db
Hive Table: tbl_view_1
Hive Column: country
Under Masking Conditions:
Select User: trino
Permissions: select
Select Masking Option: Nullify
trino:employee_db> SELECT * FROM tbl_1;
 id |   userid    | country
----+-------------+---------
 1  | james       | NULL
 2  | john        | NULL
 3  | mark        | NULL
 4  | sally-sales | NULL
 5  | sally       | NULL
 6  | emily       | NULL
(6 rows)

Query 20210309_061856_00026_5amn7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.32 [6 rows, 0B] [18 rows/s, 0B/s]
Access views in AWS Athena
Use the following steps to provide access to views created in AWS Athena so that you can query them.
Copy the Hive catalog properties (or create a symlink) as awsdatacatalog.properties in the /etc/trino/conf/catalog folder:
ln -s /etc/trino/conf/catalog/hive.properties /etc/trino/conf/catalog/awsdatacatalog.properties
Restart the Trino server.
sudo systemctl restart trino-server
In Access Management > Resource Policies, update the privacera_hive default policy:
Edit the all - database, table policy.
In Select User, add 'Trino' from the dropdown as the default view owner, and save.
(Optional) To change the default view owner from 'Trino' to any other owner such as 'Hadoop', do the following:
In the access-control.properties file, set the owner in the ranger.policy.authorization.viewowner.default variable:
vi /usr/lib/trino/etc/access-control.properties
ranger.policy.authorization.viewowner.default=<view-owner>
Restart the Trino server.
sudo systemctl restart trino-server
Accordingly, update the owner in the all - database, table policy of the privacera_hive service.
Hue
SSH to the master node.
Edit the hue.ini file:
sudo vi /etc/hue/conf/hue.ini
For PrestoDB
In the interpreters > presto section, set the user to empty ("") so that Hue uses the credentials of the logged-in user for authorization:
[[interpreters]]
[[[presto]]]
interface = jdbc
name = Presto
options = '{"url": "jdbc:presto://${master_node_dns}:8889/hive/default", "driver": "com.facebook.presto.jdbc.PrestoDriver", "user":"","password":""}'
For PrestoSQL
In the interpreters > presto section, set the user to empty ("") so that Hue uses the credentials of the logged-in user for authorization:
[[interpreters]]
[[[presto]]]
interface = jdbc
name = Presto
options = '{"url": "jdbc:presto://${master_node_dns}:8889/hive/default", "driver": "io.prestosql.jdbc.PrestoDriver", "user":"","password":""}'
For Trino
In the interpreters > trino section, set the user to empty ("") so that Hue uses the credentials of the logged-in user for authorization:
[[interpreters]]
[[[trino]]]
interface = jdbc
name = Trino
options = '{"url": "jdbc:trino://${master_node_dns}:8889/hive/default", "driver": "io.trino.jdbc.TrinoDriver", "user":"","password":""}'
For SparkSQL
In the spark section, replace sql_server_host with the DNS name of the EMR master node:
[spark]
sql_server_host=${master_node_dns}
Restart the Hue service.
sudo systemctl restart hue.service
Log in to the Hue console at <master-node>:8888.
Set the Admin username and password.
Add more Hue users through the Admin console.
Log out and log in to the Hue console using the newly created user.
Access the tables through Hive/Presto.
Check in Privacera Ranger to ensure the username is the same as the user logged in to Hue.
Livy
Set up Livy and Zeppelin.
SSH with port forwarding, or open port 8890, to access Zeppelin from the web browser:
ssh -i ${KEY_FILE} -L 8890:localhost:8890 hadoop@${EMR_PUBLIC_DNS}
Go to the Zeppelin web UI (http://localhost:8890).
Enable user-based login (see https://zeppelin.apache.org/docs/0.6.2/security/shiroauthentication.html):
sudo su
cp /etc/zeppelin/conf/zeppelin-site.xml.template /etc/zeppelin/conf/zeppelin-site.xml
chown zeppelin:zeppelin /etc/zeppelin/conf/zeppelin-site.xml
vi /etc/zeppelin/conf/zeppelin-site.xml
#Change the property, if it exists
#This property was removed in Zeppelin 0.9.0 (https://issues.apache.org/jira/browse/ZEPPELIN-4489)
zeppelin.anonymous.allowed=false
cp /etc/zeppelin/conf/shiro.ini.template /etc/zeppelin/conf/shiro.ini
vi /etc/zeppelin/conf/shiro.ini
#Add required users in [users] as below --
[users]
hadoop = hadoop123, admin
chown zeppelin:zeppelin /etc/zeppelin/conf/shiro.ini
Check the Livy port in the following file:
vi /etc/livy/conf/livy.conf
livy.server.port=8998
Stop and restart Zeppelin:
sudo stop zeppelin
sudo start zeppelin
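On newer EMR releases that manage services with systemd, the equivalent restart may instead be the following (verify which init system your image uses):
sudo systemctl restart zeppelin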
Go to <master-node>:8890 and log in with the username/password you created in step 3.
Go to Settings > Interpreter > Livy > Edit and perform the following steps:
Keep only the Scoped mode, set per user.
Set the following properties:
livy.spark.driver.cores=1
livy.spark.driver.memory=1g
livy.spark.executor.cores=1
livy.spark.executor.instances=2
livy.spark.executor.memory=1g
livy.spark.driver.extraClassPath=/opt/privacera/plugin/privacera-spark-plugin/spark-plugin/*:{copy spark.driver.extraClassPath from /etc/spark/conf/spark-defaults.conf}
Save and restart.
Run the sample Livy Spark code.
Go to the Zeppelin web UI (http://localhost:8890).
Create a new notebook and run the following:
%livy.spark
val df = spark.read.csv("s3://${SECURE_BUCKET_NAME}/sample_data/customer_data/customer_data_without_header.csv")
df.show()
Check the audit for the command executed above in Privacera Access Manager:
On the Privacera Portal home page, expand Access Management.
On the left menu, click Audit.
The Audit page will be displayed with Ranger Audit details.
Spark Object-Level Access Control (OLAC)
Submit Spark applications
With Spark OLAC enabled, you can submit an application consisting of a compiled and packaged Java or Scala Spark JAR, and deploy the JAR in either client or cluster mode.
Client Mode
SSH to the master node.
Run the following command:
spark-submit \
  --master yarn \
  --driver-memory 512m \
  --executor-memory 512m \
  --class <class-to-run> <your-jar> <arg1> <arg2>
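For illustration, a filled-in client-mode submission might look like the following; the class name, JAR path, and argument are hypothetical stand-ins for your own application:
# Hypothetical application class and JAR; replace with your own.
spark-submit \
  --master yarn \
  --driver-memory 512m \
  --executor-memory 512m \
  --class com.example.CustomerReport \
  /home/hadoop/customer-report.jar \
  s3://${SECURE_BUCKET_NAME}/sample_data/customer_data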
Cluster Mode
SSH to the master node.
Run the following command:
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 512m \
  --executor-memory 512m \
  --driver-class-path "/opt/privacera/plugin/privacera-spark-plugin/spark-plugin/*:<copy spark.driver.extraClassPath from /etc/spark/conf/spark-defaults.conf>" \
  --class <class-to-run> <your-jar> <arg1> <arg2>
Spark Fine-grained Access Control (FGAC)
View-level access
To enable view-level access control:
SSH to the master node of the EMR cluster.
Edit the spark-defaults.conf file:
sudo vim /etc/spark/conf/spark-defaults.conf
Add the following property:
spark.hadoop.privacera.spark.view.levelmaskingrowfilter.extension.enable true
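To confirm the property took effect, one quick check (illustrative, reusing the sample table created earlier in this guide) is to query it through spark-sql as ${TEST_USER} and review the result on the Ranger Audit page:
spark-sql -e "SELECT * FROM customer.customer_data_s3 LIMIT 10;"
#Access should be allowed or denied according to your Ranger policies,
#and the decision should appear under Access Management > Audit.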