Release 9.0.1.1¶

This is the Rolling Release Notes for the Release 9.0.1.1. This release notes is applicable only to Privacera's Self Managed version.

Breaking Changes¶

PostgreSQL as an External Database

PostgreSQL as an External Database¶

You can ignore this warning if you're already using PostgreSQL 9.5 or above or if you're using a different database.

If you're using PostgreSQL 9.4 or below with Privacera, it's essential to upgrade your PostgreSQL instance to version 9.5 or above before proceeding with any Privacera upgrade. This is because Privacera supports only PostgreSQL 9.5 and higher, and continuing to use an unsupported version could result in compatibility issues or instability. Ensure that your PostgreSQL database is upgraded and fully operational at a higher version before upgrading Privacera for a smooth transition.

PolicySync Connector Updates¶

Introducing Support for BigQuery Column-Level Security with Native Column Masking Options

Introducing Support for BigQuery Column-Level Security with Native Column Masking Options¶

Update: BigQuery Native Masking Options and Configuration Updates
Details: With this release, the BigQuery connector supports a defined set of native masking options, as well as some behavior changes compared to other connectors. The following masking options are now available for column-level security:
- Nullify
- Hash
- Partial Mask (Show first 4 characters)
- Partial Mask (Show last 4 characters)
Additionally, the end-user access behavior has been updated. Users needing access to columns with tag-based masking policies in BigQuery must either be included in the tag-based masking policy or have permission through a tag-based access policy. This change ensures stricter security controls for sensitive data.
For more information on how column-level security is managed in BigQuery, please refer to the official documentation here.
Benefits:
- Improved Data Security: The new native masking options provide more flexibility in how sensitive data is exposed while maintaining strict security policies.
- Stronger Access Controls: The behavior change requiring users to be part of tag-based masking or access policies ensures that only authorized users can view or manipulate sensitive columns.
- Customizable Taxonomies: By supporting manual taxonomy creation, users have more control over how their data is categorized and secured.
Limitations:
- Limited Multi-Location Support: Currently, the connector only manages resources in a US location, which may require additional planning for users managing resources across multiple locations.
- Limited Multi-Project Support: Currently, the connector only manages resources in a single project, which may require additional planning for users managing resources across multiple projects.
- Wildcards are not supported in resource names: When creating a tag-resource mapping in the Privacera portal, ensure that the resource name does not contain any wildcards.
Example:
- Valid: Table name: customer_data
- Invalid: Table name: customer_*
Tag with the same name in multiple taxonomies is not supported: When creating tags in the Privacera portal, make sure that a tag is not duplicated across different taxonomies.
BigQuery’s limitation: A single column cannot have multiple tags assigned to it: When creating a tag-resource mapping in the Privacera portal, ensure that each column is associated with only one tag.

For further information on managing resources and taxonomies, please visit the BigQuery column-level security documentation.

Enhanced Support for Google BigQuery Row-Level Filtering with “TRUE Filter”

Enhanced Support for Google BigQuery Row-Level Filtering with “TRUE Filter”¶

Update: Stability Fixes for Native Row-Level Filtering (RLF) in Google BigQuery
Details: In response to recent changes in Google BigQuery (GBQ) regarding row-level security policies, the PolicySync connector has been updated to support the new “TRUE filter” feature. This update ensures that users previously granted full access to data can retain access even when row-level security policies are applied.
As per the GBQ documentation, any new row-level security policy requires users who previously had unrestricted access to be included in a “TRUE filter” to maintain their full data access. Without this adjustment, certain users may face unintended data access restrictions.

Support for Creating BigQuery Datasets with Custom Managed Encryption Key (CMEK)

Support for Creating BigQuery Datasets with Custom Managed Encryption Key (CMEK)¶

Update: CMEK Integration for Secure Datasets in BigQuery
Details: In this release, the BigQuery connector has been enhanced to support the creation of BigQuery datasets with Custom Managed Encryption Key (CMEK). Previously, the BigQuery connector did not have the capability to create xxx_secure datasets with CMEK, resulting in the generation of security alerts each time a new secure dataset was created. These alerts required manual intervention, causing delays and increasing the operational burden on security teams.
To resolve this, Privacera will now dynamically build the CMEK path before creating any dataset. The DDL for creating BigQuery datasets has been updated to include the CMEK configuration, ensuring that all secure datasets are created with the necessary encryption key, eliminating manual security alerts.

Integration of Workload Identity for GKE with Google Cloud Services

Integration of Workload Identity for GKE with Google Cloud Services¶

Update: Support for Workload Identity in GKE Clusters for Enhanced Security
Details: This release introduces support for Workload Identity in Google Kubernetes Engine (GKE) clusters. As part of the security enhancements, Ranger admins can now configure Workload Identity, allowing Kubernetes pods to automatically obtain and use IAM credentials to access Google Cloud services without the need to manually manage service account keys. This feature simplifies credential management and increases security by eliminating the risks associated with static service account keys.

Reintroduction of External Table Support for Audit in AWS GovCloud For Unity Catalog Connector

Reintroduction of External Table Support for Audit in AWS GovCloud For Unity Catalog Connector¶

Update: External Table Support for Auditing in Databricks for AWS GovCloud for Unity Catalog Connector
Details: Starting from version 9.0 and higher, support for external tables has been reintroduced to address the limitations in the AWS GovCloud region. While system tables were previously added for auditing purposes, they are not supported in the AWS GovCloud environment. As a result, external table support has been reinstated for audit functionality, ensuring full coverage of audit requirements across all regions.
This functionality was removed when system table support was added, but due to the lack of system table availability in the AWS GovCloud, external tables are now necessary to maintain auditing capabilities.
New Configuration:
- Flag Name: CONNECTOR_DATABRICKS_UNITY_CATALOG_AUDIT_MODE
  - Default Value: simple (By default, gets Unity Catalog events and data access.)
  - Other Accepted Values
    - verbose (Gets Unity Catalog events, data access queries run on warehouses and notebooks.)
    - external-simple (Gets Unity Catalog events and data access.)
    - external-verbose (Gets Unity Catalog events, data access queries run on warehouses and notebooks.)

Support for TEMPORARY Permission on Database Objects in Redshift

Support for TEMPORARY Permission on Database Objects in Redshift¶

Update: Introduction of Temporary Table Permissions in Amazon Redshift
Details: This release adds support for the GRANT TEMPORARY ON DATABASE permission in Amazon Redshift. This permission allows users to create temporary tables with a lifetime tied to the current logged-in session. Temporary tables are essential for session-specific tasks and are also required when querying Redshift Spectrum tables.
With this update, administrators can grant users the ability to create temporary tables without affecting long-term database structure, ensuring that session-specific operations are more flexible and efficient.

Support for AWS Multi-Account Role Differentiation in Lake Formation (LF) Connector

Support for AWS Multi-Account Role Differentiation in Lake Formation (LF) Connector¶

Update: Support for Distinct Ranger Policies for Same Role Name Across Different AWS Accounts
Details: AWS Lake Formation (LF) connector has been enhanced to support distinct Ranger policies for the same IAM role name across multiple AWS accounts. Previously, the same Ranger policy was applied to roles with the same name in different AWS accounts, but this new feature allows more granular control.
Under the current Lake Formation multi-account setup, the same role name (e.g., Analyst1) across different AWS accounts (e.g., 123456789XXX, 345678912XXX) would share the same Ranger policy. This update ensures that the AWS account ID and IAM role name are treated as a unique identity, enabling administrators to create distinct Ranger policies for roles like Analyst1 in different accounts.

Support for Hive Policy Creation with Multiple Roles in Different AWS Accounts

Support for Hive Policy Creation with Multiple Roles in Different AWS Accounts¶

Update: Hive Policy Creation for Multiple Roles Across AWS Accounts
Details: This release includes enhancements to the PolicySync connector to support the creation of policies in Privacera Hive for roles that exist across multiple AWS accounts. Previously, when policies were created in Lake Formation (LF) for multiple roles in different AWS accounts, the corresponding policy was not being generated in Privacera Hive. With this update, PolicySync now ensures that policies are consistently created in Privacera Hive when multiple roles are involved, even if they reside in different AWS accounts.

New Flag to Prevent AWS Account-Specific Role Creation in Privacera Hive

New Flag to Prevent AWS Account-Specific Role Creation in Privacera Hive¶

Update: Optional Flag to Disable AWS Account-Specific Role Creation in Hive
Details: In this release, a new flag has been added to the PolicySync connector to provide the option to prevent the creation of AWS account-specific roles (e.g., aws-account=123456789XXX) in Privacera Hive. This flag is introduced to give administrators more control over how roles are created in Hive, particularly when AWS-specific role identifiers are not required or may cause unnecessary complexity.
When this flag is enabled, PolicySync will bypass the creation of roles in Hive that are formatted as aws-account=, ensuring that only essential roles are created and managed.
New Configuration:
- Flag Name: CONNECTOR_LAKEFORMATION_ENABLE_PUSH_POLICIES_EXTERNAL_ACCOUNT_ROLE
  - Default Value: false (By default, PolicySync will skip the creation of policies in Hive that include the aws-account= prefix.)
  - When Set to true: AWS account-specific roles policies are created in Hive
- Flag Name: CONNECTOR_LAKEFORMATION_ENABLE_PUSH_POLICIES_TO_RANGER
  - Default Value: false (By default, PolicySync will skip pushing lake formation policies to other policy repositories.)
  - When Set to true: Enable pushing Lake Formation policies in other policy repositories.

Support for OAuth 2.0 in Databricks Unity Catalog Connector

Support for OAuth 2.0 in Databricks Unity Catalog Connector¶

Update: OAuth 2.0 Support for REST API Integration in Databricks Unity Catalog Connector
Details: This release introduces OAuth 2.0 support for the Databricks Unity Catalog Connector to enhance security and streamline authentication when integrating with REST APIs. The connector now allows access tokens to be retrieved and automatically refreshed, ensuring continuous and secure data interactions without disruptions caused by token expiration.
Limitations:
Current Support:
- The Unity Catalog (UC) connector supports two modes: JDBC and API. However, only the JDBC mode supports the OAuth mechanism. The API mode does not currently have OAuth support.
Upgrade Scenario:

When upgrading to OAuth, if the UC connector was previously configured using a user token, and “secure” views were created with the token user’s credentials, these “secure” views must be deleted before enabling view-based masking and Row-Level Filtering (RLF) functionality. Failing to do so may result in issues with access control and security policies.

Support for OAuth 2.0 in Databricks Connector

Support for OAuth 2.0 in Databricks Connector¶

Update: OAuth 2.0 Support for REST API Integration in Databricks Connector
Details: This release introduces OAuth 2.0 support for the Databricks Connector to enhance security and streamline authentication when integrating with REST APIs. The connector now allows access tokens to be retrieved and automatically refreshed, ensuring continuous and secure data interactions without disruptions caused by token expiration.
Limitations:
Migration Scenario:
- When the connector is configured using JDBC with a username and password, it fetches all resources and adds them to the RocksDB cache, assigning ownership of these resources to the JDBC username. If you migrate from token-based authentication to OAuth and re-run the connector, the ownership of resources already stored in RocksDB will not be updated. This can lead to issues, especially with secure views, as they rely on the ownership of underlying resources such as tables.
- If this issue is not addressed, end users may encounter the following error when attempting to access a secure view: User cannot SELECT on table due to permissions on underlying securables.
Workaround:
- To resolve this, clean the RocksDB cache. This forces the connector to treat all resources as new, updating the ownership to match the current authentication method.

Support for Pod-Level Service Account IAM Role in Lake Formation Connector

Support for Pod-Level Service Account IAM Role in Lake Formation Connector¶

Update: Transition from Node-Level to Pod-Level IAM Roles for Enhanced Security in Lake Formation Connector
Details: The Lake Formation connector now supports using a Kubernetes pod-level service account IAM role instead of relying on the node-level IAM role. Previously, the connector used the IAM role associated with the Kubernetes node, which posed issues when the node-group was modified, leading to role changes that disrupted connector functionality. This update resolves those issues by assigning a specific service account to the Lake Formation connector pods, ensuring stability and independence from node-level role changes.
Key Changes:
- Pod-Level IAM Role: The connector now uses a dedicated Kubernetes service account to assume the target AWS account’s IAM role. This allows each pod to have its own IAM role, enhancing security and avoiding dependency on node-level roles.
- Stability Improvements: Previously, changes to the node-group, such as upgrades or scaling, would cause the IAM role to change, disrupting the Lake Formation connector. By using pod-level service accounts, the connector’s IAM role remains stable and unaffected by node modifications.

Update to Audit Sync Interval and Time Handling in Lake Formation Connector

Update to Audit Sync Interval and Time Handling in Lake Formation Connector¶

Update: Adjustments to Audit Sync Interval and Event Time Handling for Improved Audit Consistency
Details: This release includes updates to the default audit sync interval and the handling of event times in the Lake Formation connector. The changes aim to address issues where certain audit actions (such as Create and Drop) were recorded in the audit table but not displayed in the Privacera portal.
Key Changes:
- Updated Default Audit Sync Interval: The default value for the audit sync interval has been adjusted to ensure more frequent and consistent syncing of audit entries, reducing the likelihood of delays in audit visibility.
- Audit Time Adjustment: To prevent missing audit entries in the portal, the connector now subtracts the audit delay time from the toEventTime. This adjustment ensures that the audit sync process captures the correct range of events, minimizing the chances of missing important actions like Create and Drop.

Support for Additional Service Filters in Lake Formation Audit Queries

Support for Additional Service Filters in Lake Formation Audit Queries¶

Update: Introduction of Configurable Service Filters for Lake Formation Audit Queries
Details: This release introduces support for additional service filters in the Lake Formation audit query, allowing more granular control over the services included in the audit trail. Administrators can now specify which services (e.g., Athena, Glue, Lake Formation) should be included in the audit queries, while maintaining the default filter behavior for backward compatibility.
New Configuration Property:
- Property Name: CONNECTOR_LAKEFORMATION_AUDIT_INCLUDED_SERVICES
- Default Value: athena.amazonaws.com,glue.amazonaws.com,lakeformation.amazonaws.com
- Customizable Value: Administrators can specify additional services in this property, such as "athena.amazonaws.com, glue.amazonaws.com, lakeformation.amazonaws.com". These services correspond to the eventSource field in AWS CloudTrail events and can be used to refine the scope of the Lake Formation audit query.

Fix for Incorrect Display of Denied Audits as Allowed in Privacera Portal

Fix for Incorrect Display of Denied Audits as Allowed in Privacera Portal¶

Update: Resolution for Incorrect Display of Access Denied Audits in Privacera Portal For lake Formation Connector
Details: This release includes a fix for an issue where audit entries marked as “Access Denied” in the audit table were incorrectly displayed as “Allowed” in the Privacera portal. Although the audit table correctly logged access denials (e.g., access denied from the AWS Console), these entries were incorrectly marked as “Allowed” in the portal, causing confusion during audit reviews.
The issue has been identified and resolved, ensuring that audits reflecting denied access are now accurately displayed as “Denied” in the Privacera portal, aligning with the audit table entries.

Improved Error Handling for Invalid Values in CONNECTOR_LAKEFORMATION_AUDIT_INCLUDED_SERVICES

Improved Error Handling for Invalid Values in CONNECTOR_LAKEFORMATION_AUDIT_INCLUDED_SERVICES¶

Update: Error Logging for Invalid Service Values in Lake Formation Connector
Details: This release enhances error handling for the CONNECTOR_LAKEFORMATION_AUDIT_INCLUDED_SERVICES property. Previously, when invalid or incorrect service values were set (e.g., glue.amazonaws.c or test123), no error messages were displayed in the PolicySync logs, and audit entries were not captured, resulting in AuditLogsCount - 0 and no new audits appearing in the portal.
With this update, PolicySync will now generate appropriate error messages when invalid values are set for the CONNECTOR_LAKEFORMATION_AUDIT_INCLUDED_SERVICES property. This ensures that misconfigurations are quickly identified and resolved, preventing silent failures in the audit process.

Enhanced Logging and Timeout Configuration for AWS Glue Client In AWS Lake Formation

Enhanced Logging and Timeout Configuration for AWS Glue Client In AWS Lake Formation¶

Update: Timeout Handling and Logging for AWSGlueClient.getTables API in Lake Formation Connector
Details: This release introduces enhanced logging and configurability for the AWS Glue Client used in the Lake Formation connector, specifically addressing timeouts encountered during the com.amazonaws.services.glue.AWSGlueClient.getTables API call. The updates aim to improve transparency in debugging timeout issues and provide better control over the timeout and pagination settings.
New Configuration:
- This configuration needs to be updated in custom properties https://privacera.com/docs/en/properties-for-aws-lake-formation.html
  - Property Name: ranger.policysync.connector.0.aws.connection.timeout_ms
    - Default Value: 10000
  - Property Name: ranger.policysync.connector.0.aws.socket.timeout_ms
    - Default Value: 50000

Support for Resource Name Extraction from Audits In AWS Lake Formation

Support for Resource Name Extraction from Audits In AWS Lake Formation¶

Update: Resource Name Extraction from audits for better readability
Details: Resource names are now extracted from the audit format and converted to Privacera format for better readability.

Support for AWS Redshift Audit Normalization and Query Parsing

Update: Redshift Query Stitching and Table Name Parsing for Audit Normalization
Details: This release introduces enhancements to the Redshift audit normalization process, including the ability to stitch together queries that are stored across multiple rows in the query history table. In addition, the system can now parse these queries to identify which tables were accessed by users, and check whether the fully qualified name of the tables is present in the query history.

Fix for Unintended Warehouse Activation in Databricks Connections

Fix for Unintended Warehouse Activation in Databricks Connections¶

Update: Prevent Unintended Warehouse Activation When Fetching Databricks Connections
Details: This release includes a critical fix for an issue where a warehouse in Databricks was unintentionally being spawned in a “running” state when users were created or fetched through Databricks connections. The issue was caused by the getConnection method, which was responsible for changing the state of the warehouse from “stopped” to “running” even when the API call was made without requiring the connection.

Support for Azure Database for PostgreSQL - Flexible Server in Postgres Connector

Support for Azure Database for PostgreSQL - Flexible Server in Postgres Connector¶

Update: Postgres Connector Now Supports Azure Database for PostgreSQL - Flexible Server
Details: This release adds support for Azure Database for PostgreSQL - Flexible Server to the PolicySync Postgres Connector. With this enhancement, organizations using Azure’s managed PostgreSQL - Flexible Server can now integrate with PolicySync to manage and audit their database access and policies seamlessly.

Enhanced Privilege Control for API Token Users in Unity Catalog

Enhanced Privilege Control for API Token Users in Unity Catalog¶

Update: Support for Granular Privileges in Unity Catalog Connector for API Token Users
Details: This release introduces updates to the Unity Catalog connector to allow the assignment of specific, least-privileged permissions to API token users, rather than granting ALL PRIVILEGES by default.
New PM Configuration:
- Flag Name: CONNECTOR_DATABRICKS_UNITY_CATALOG_AVOID_TOKEN_USER_ALL_PRIVILEGES
- Default Value: true (By default, PolicySync connector will grant privileges from a predefined unmodifiable list TOKEN_USER_PRIVILEGE_LIST. i.e. ["USE_CATALOG", "CREATE_SCHEMA", "CREATE_TABLE", "USE_SCHEMA", "CREATE_FUNCTION", "MODIFY", "SELECT"] )
- When Set to false: It will apply ALL_PRIVILEGES privilege

Support for Functions and Procedures in Redshift Connector

Update: Redshift Connector Now Supports Functions and Procedures
Details: This release introduces support for Functions and Procedures in the Redshift Connector. With this update, administrators can now manage, audit, and apply policies to functions and stored procedures within Amazon Redshift, providing greater control over database objects.

Fix for Unintended Policy Reapplication for Array Type Columns In Unity Catalog Connector

Fix for Unintended Policy Reapplication for Array Type Columns In Unity Catalog Connector¶

Update: Addressing Unintended Policy Reapplication for Array Type Columns In Unity Catalog Connector
Details: This release includes a critical fix in Unity Catalog Connector for an issue where a set of policies were being automatically reapplied after a PM update, even without any policy changes or updates. The problem specifically affected masking policies on columns with array types, leading to unintended revokes and grants being executed in succession, which caused the policies to be reapplied unnecessarily.
Recommendation: It's recommended to delete connector’s existing rocksdb before the upgrade.

Plugin Updates¶

Introducing Support for EMR Serverless with OLAC Plugin

Introducing Support for EMR Serverless with OLAC Plugin¶

Update: Privacera now offers seamless integration with Amazon EMR Serverless using the OLAC plugin, further enhancing access management and data governance capabilities.

Details:

Prerequisites: Please make sure DataServer is enabled with JWT Authentication.

Next, add the following properties in vars.emr-serverless.yml:

Bash
## This file enables EMR serverless and contains variables to generate EMR serverless configurations

EMR_SERVERLESS_ENABLE: "true"

# EMR serverless version Eg: emr-7.1.0:latest
EMR_SERVERLESS_VERSION: "<PLEASE_CHANGE>"

# EMR serverless app name e.g. privacera_dev
EMR_SERVERLESS_APP_NAME: "<PLEASE_CHANGE>"

The required configuration files and Dockerfile will be generated in the PM output directory once privacera-manager post-install is done.
You will then need to build the docker image using these files and push it to the remote repository.
You can then create the EMR serverless Application using the same image.

Limitations: There is no kerberos support in EMR Serverless, so for user authorization we need to enable JWT authorization in the Spark plugin.

Wild-Card Support Added for privacera.olac.ignore.paths

Wild-Card Support Added for privacera.olac.ignore.paths¶

Update: With this release, you can now use wild-card patterns in the paths added to the ignore path list. This enhancement works alongside the existing bucket-level and folder-level paths, offering more flexibility in specifying which paths to exclude from access control.

For example:

Bash
privacera.olac.ignore.paths=s3://*/bucket/folder1/folder2
privacera.olac.ignore.paths=s3://bucket/folder1/*/folder3
privacera.olac.ignore.paths=s3://bucket*/folder1
privacera.olac.ignore.paths=s3://bucket/folder1/f*/folder3

Limitations:
- If we add a * as a wild card pattern in between the path added in ignore path list, it will consider multiple folders to match with the resource path.
- For example, if we set privacera.olac.ignore.paths=s3://bucket/folderA/*/folderD then, the access on path s3://bucket/folderA/folderB/folderC/folderD will also be bypassed.

Automatic Upload of Databricks Cluster Init Scripts and Configuration Files to S3

Update: With this release, Privacera now supports the automatic upload of required Init scripts and configuration files to an S3 location during a self-managed cluster upgrade. This simplifies the process, as Databricks clusters can read Init scripts directly from the S3 location, ensuring a smoother setup.

Details: To upload ranger_enable.sh, ranger_enable_scala.sh and privacera_custom_conf.zip to s3, we have to add below variables in vars.databricks.plugin.yml.

Bash
# To upload files to Databricks Workspace. Default: `true`.
DATABRICKS_INIT_SCRIPT_WORKSPACE_FLAG_ENABLE: "false"

# To upload the `init scripts` and `privacera_cust_conf.zip` to the S3 location.
DATABRICKS_CUST_CONF_PATH: "s3://<bucket-name>/<path>"

# To upload the plugin installation logs to the S3 location. By default, the logs will be uploaded to the path
PRIVACERA_CLUSTER_LOGS_DIR: "s3://<bucket-name>/<path>"

Limitations:
- Currently, it doesn’t work for static JWT as support to upload public key to S3 is not available through PM.
- It needs to be done manually.
- To run static JWT use cases with FGAC, run the following command in notebook, before executing any use-case:
  
  We have to add public key in PEM format only.
  Python
  1 2 3 4 5
  pub_file_path="/tmp/jwttoken.pub" public_key="""<public-key>""" file1 = open(pub_file_path,"w") file1.write(public_key) file1.close()

Support for Displaying Query Details on Audits Page for EMR Trino

Support for Displaying Query Details on Audits Page for EMR Trino¶

Update: You can now view the exact query executed via the Trino CLI directly on the Audits Page.
Details: To see the full query, navigate to the Audits Page, locate the Resource Name column, and click the Pencil icon. This will display the complete query that was executed.

Introducing Access Control Support for Apache Flink

Update: We are excited to announce that Privacera now supports Apache Flink. With this integration, Privacera ensures that any actions involving S3 objects with Apache Flink are securely monitored and controlled through Privacera's access management system.

We only support Apache Flink deployed on Kubernetes.

Details: To enable Privacera Access Control in Flink, configure the vars.flink.yml with the following properties:

Variable	Definition
FLINK_ENABLE	Set this to `true` to enable the Apache Flink. Default: `false`.
FLINK_HOME_DIR	Set the path where Apache Flink is installed. Default: `/opt/flink`
FLINK_S3_FS_PLUGIN_DIR	Set the path to S3 plugin for Apache Flink. Default: `/opt/flink/plugins/s3-fs-hadoop`
FLINK_CLUSTER_NAME	Set the name of the cluster. Default: `privacera_flink`
FLINK_STS_TOKEN_SIGNING_ENABLE	Configure to use STS token for signing the requests. Default: `true`

Limitations:
- Supports only Hadoop S3 FileSystem (s3-fs-hadoop).
- The job continues to retry when access to the file is denied from the Privacera.

Support for Iceberg Catalog in Spark with EMR and EMR Serverless

Support for Iceberg Catalog in Spark with EMR and EMR Serverless¶

Update: We have introduced a new flag and some variables that, when added to vars.emr.yml, automatically configures the necessary properties in the EMR template for setting up Iceberg catalog in Spark.

Details: Please add the following properties in the vars.emr.yml:

Variable	Definition
EMR_SPARK_ICEBERG_ENABLE	Set this to `true` to enable the Iceberg catalog for EMR Spark. Default: `false`.
EMR_SPARK_ICEBERG_CATALOG_TYPE	Set the Iceberg catalog type. Default: `hadoop`. Supported types: `hadoop`, `glue`.
EMR_SPARK_ICEBERG_CATALOG_NAME	Set the Iceberg catalog name. Default: `hadoop_catalog`. Supported types: `haddop_catalog`, `glue_catalog`.
EMR_SPARK_ICEBERG_CATALOG_WAREHOUSE_LOCATION	Set the location for the Iceberg catalog warehouse.

Redaction of Sensitive Information in Log Messages

Redaction of Sensitive Information in Log Messages¶

Update: With this release, we can now redact sensitive data displayed in the log messages for Spark Plugin and DataServer. This automatic protection enhances security and privacy, preventing the exposure of critical information in log files.

Securing Client-Server Communication by Encrypting Sensitive Information

Securing Client-Server Communication by Encrypting Sensitive Information¶

Update:
- This release introduces enhanced security measures by encrypting sensitive information in requests and responses between the SparkPlugin and DataServer. By encrypting the data, we ensure that sensitive data is protected from unauthorized access throughout the communication process.
- Currently, this feature is available exclusively in the OLAC plugin.

Details: To enable encryption of sensitive data in the requests please make the below changes. By default, these are set to false.

Bash
#set below property in vars.emr.yml for EMR
EMR_SPARK_ENCRYPT_SENSITIVE_PAYLOAD_DATA_ENABLED: "true"

#set below property in vars.databricks.plugin.yml for Databricks
DATABRICKS_SPARK_ENCRYPT_SENSITIVE_PAYLOAD_DATA_ENABLED: "true"

#set below property in vars.spark-standalone.yml for Open Source Spark
STANDALONE_SPARK_ENCRYPT_SENSITIVE_PAYLOAD_DATA_ENABLED: "true"

Support for Reading of the Public Key Properties instead of Expecting PEM Key in Case of Dynamic JWT Authorization

Support for Reading of the Public Key Properties instead of Expecting PEM Key in Case of Dynamic JWT Authorization¶

Update:
- In this release, we have enhanced our dynamic JWT authorization by allowing the use of public key properties instead of requiring a PEM key in the response from the remote server.
- You can now include the n and e components for the RSA public key, as well as the x and y coordinate values for the EC public key. These properties are read internally and utilized for verifying the JWT signature, streamlining the authorization process.
Details:
- By default, you don't need any configuration changes to be able to use these properties in the response.
- If you need to change the property key name to some other key name we can do so by including below in the vars.jwt-auth.yml file. Make sure the same property name is reflected in the server response as well.
  Bash
  1 2
  jwtTokenProviderRsaModulusKey: "PLEASE_CHANGE" jwtTokenProviderRsaExponentKey: "PLEASE_CHANGE"
  Bash
  1 2
  jwtTokenProviderEcCoordinateXKey: "PLEASE_CHANGE" jwtTokenProviderEcCoordinateYKey: "PLEASE_CHANGE"

Support for Reading Public Keys from JSON Web Key Set (JWKS)

Support for Reading Public Keys from JSON Web Key Set (JWKS)¶

Update: You can now include a JSON Web Key Set (JWKS) in the server response. This JWKS will contain properties for multiple public keys, allowing each key to be utilized for JWT authorization.

Supporting GCS as an Output Source for Audit Data Backup through Fluentd

Supporting GCS as an Output Source for Audit Data Backup through Fluentd¶

Update:
- The Privacera currently supports S3 and ADLS as output storage location for audits data written in solr by AuditServer through fluentd.
- Following this release we will also be able to use GCS as an output storage location for the audit's data.

Support to Use JWT Token Configured in HMS to Authenticate with DataServer on EMR OLAC

Support to Use JWT Token Configured in HMS to Authenticate with DataServer on EMR OLAC¶

Update: With this release we will be authenticating the JWT configured in the HMS server using the DataServer. The JWT token in HMS will now be authenticated with DataServer on startup of HMS.

Default Upload of Databricks Cluster Init Scripts to Databricks Workspace

Default Upload of Databricks Cluster Init Scripts to Databricks Workspace¶

Update: We have introduced a new flag, DATABRICKS_INIT_SCRIPT_WORKSPACE_FLAG_ENABLE, which is enabled by default in the vars.databricks.yml file. This allows for the automatic upload of Databricks cluster Init scripts to the Databricks Workspace.

Limitations in Plugin modules¶

Event Logging in EMR

Event Logging in EMR¶

Text Only
- The eventLog bucket to be ignored needs to be configured via the bootstrap action only and not in the 
  `spark-defaults.conf` or `--conf`.
- The cluster IAM role should have access to that event log bucket.

Event Logging in Databricks and OSS

Event Logging in Databricks and OSS¶

Text Only
- The complete eventLog bucket should be added in the ignore path list through the `privacera_spark.properties` file 
  only.

Discovery Updates¶

Support for CMEK in Discovery Configuration Bucket and Pub/Sub Topics

Support for CMEK in Discovery Configuration Bucket and Pub/Sub Topics¶

The Privacera now supports Customer Managed Encryption Keys (CMEK) for Google Cloud Storage bucket and Pub/Sub topics created by Discovery. This enhancement enables secure storage of configuration data and internal communications, offering users greater control over encryption keys and compliance with security requirements in Google Cloud.

GA Release: New versions of Data Sources supported in this release¶

Plugin Name	Version
DBX FGAC	15.4 LTS
DBX OLAC	15.4 LTS
EMR OLAC	7.2.0
EMR Serverless	7.2.0
OSS OLAC	3.5.1
Open Source Trino	462
Starburst Trino	453e
Apache Flink	1.81.1

Prev topic: Releases