Access Management for EMR cluster

Introduction

Privacera offers a robust access control solution for Amazon EMR clusters, empowering users to define and enforce Fine-Grained Access Control (FGAC) policies across Spark, Hive, and Trino, as well as Object-Level Access Control (OLAC) specifically for Spark.

Connector Details

Topics	Details
Integration methodology	Apache Ranger Plugin
Access Tools	Spark OLAC: pyspark, spark-shell, spark-submit; Hive: beeline; Trino: trino-cli; Others: Hue, Livy
Supported User Identities for Policies	LDAP/AD/SCIM Users, LDAP/AD/SCIM Groups, Privacera Roles
Data Source User Identities	Kerberos User, JWT (only for Spark)

Supported Access Management Features

Feature	Spark OLAC	Spark OLAC_FGAC	Hive FGAC	Trino FGAC
Object Level Access Control	Yes	Yes	No	No
Database Level Access Control	No	Yes	Yes	Yes
Table Access Control	No	Yes	Yes	Yes
View Access Control	No	Yes	Yes	Yes
Column Access Control	No	Yes	Yes	Yes
Row Access Control	No	Yes	Yes	Yes
Dynamic Column Data Masking	No	Yes	Yes	Yes
Dynamic Column Data Encryption	No	No	Yes	Yes
Centralized Access Audit	No	Yes	Yes	Yes
Granular Access Audit Record	No	Yes	Yes	Yes

Supported Runtime Versions

Privacera supports the following EMR runtime versions:

PCloud

AWS EMR Version	Privacera Release Version	End-of-support date
AWS EMR 6.15.0	9.1.0.1	January 25, 2026

Self Managed/Data Plane

AWS EMR Version	Privacera Release Version	End-of-support date
AWS EMR 7.8.0	9.0.19.1	March 7, 2027
AWS EMR 7.7.0	9.0.17.1	February 6, 2027
AWS EMR 7.6.0	9.0.17.1	January 10, 2027
AWS EMR 7.5.0	9.0.8.1	November 21, 2026
AWS EMR 7.4.0	9.0.20.1	November 13, 2026
AWS EMR 7.3.0	9.0.20.1	October 16, 2026
AWS EMR 7.2.0	9.0.1.1	July 25, 2026
AWS EMR 6.15.0	8.7.1.1	January 25, 2026
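
As a rough illustration of how the longer of the two tables above might be checked programmatically, the following sketch encodes its rows as a dictionary. The names `SUPPORT_MATRIX` and `is_supported` are hypothetical and are not part of any Privacera tooling, and treating the end-of-support date as the last supported day is an assumption.

```python
from datetime import date

# Rows copied from the longer version table above:
# EMR version -> (Privacera release version, end-of-support date).
# SUPPORT_MATRIX is a hypothetical name used only for this sketch.
SUPPORT_MATRIX = {
    "7.8.0": ("9.0.19.1", date(2027, 3, 7)),
    "7.7.0": ("9.0.17.1", date(2027, 2, 6)),
    "7.6.0": ("9.0.17.1", date(2027, 1, 10)),
    "7.5.0": ("9.0.8.1", date(2026, 11, 21)),
    "7.4.0": ("9.0.20.1", date(2026, 11, 13)),
    "7.3.0": ("9.0.20.1", date(2026, 10, 16)),
    "7.2.0": ("9.0.1.1", date(2026, 7, 25)),
    "6.15.0": ("8.7.1.1", date(2026, 1, 25)),
}


def is_supported(emr_version, on_date):
    """Return True if the EMR version is listed and not past end of support.

    Assumption: the end-of-support date itself still counts as supported.
    """
    entry = SUPPORT_MATRIX.get(emr_version)
    if entry is None:
        return False  # version is not listed in the table
    _release, end_of_support = entry
    return on_date <= end_of_support
```

For example, `is_supported("7.8.0", date(2026, 1, 1))` consults the `AWS EMR 7.8.0` row, while a version that does not appear in the table is always reported as unsupported.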

Limitations for Access Management Features

Kerberos is required to enforce access control policies in Privacera.
JWT is supported only for the Spark plugin.

How it Works

Privacera integrates with EMR clusters through the Apache Ranger plugin. The plugin is deployed through bootstrap actions during EMR cluster creation and runs as part of the Spark, Hive, or Trino processes. It retrieves policies from the Privacera Policy Server and enforces them by intercepting and evaluating user queries in real time. Additionally, any attribute-based access control (ABAC) and tag-based policies configured in Privacera are enforced by the Apache Ranger plugin at runtime.
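
The intercept-and-evaluate step described above can be pictured with a deliberately simplified sketch. This is not the Apache Ranger policy engine or its API: the `Policy` and `AccessRequest` types and the `is_allowed` function are hypothetical, and the model assumes allow-only policies with a default-deny fallback, ignoring deny rules, masking, row filters, ABAC, and tags.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    """Hypothetical allow policy on a database/table resource."""
    database: str              # wildcard pattern, e.g. "sales" or "*"
    table: str                 # wildcard pattern, e.g. "orders" or "*"
    access_types: frozenset    # e.g. frozenset({"select"})
    users: frozenset = frozenset()
    groups: frozenset = frozenset()


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical description of an intercepted query."""
    user: str
    groups: frozenset
    database: str
    table: str
    access_type: str


def is_allowed(request, policies):
    """Return True if any policy grants the request; deny by default."""
    for policy in policies:
        resource_match = (
            fnmatchcase(request.database, policy.database)
            and fnmatchcase(request.table, policy.table)
        )
        identity_match = (
            request.user in policy.users
            or bool(request.groups & policy.groups)
        )
        if (
            resource_match
            and identity_match
            and request.access_type in policy.access_types
        ):
            return True
    return False  # no policy matched: deny
```

In this toy model, a policy granting `select` on every table in `sales` to the group `analysts` would allow a member of that group to read `sales.orders`, but not to update it.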

User Identity Mapping

Policies in Privacera are configured for users and groups based on Kerberos or JWT, as well as for roles created within Privacera. These identities are mapped to EMR user identities as follows:

Privacera Identity	EMR Identity
LDAP/AD/SCIM User	Kerberos User / JWT
LDAP/AD/SCIM Group	N/A
Privacera Role	N/A

The Apache Ranger plugin, which runs as part of the EMR Spark process, maps the user's email address to the corresponding AD/SCIM user. The groups and roles associated with the user are dynamically fetched from Privacera and are used to enforce group-based and role-based policies within the EMR clusters.
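
The lookup described above, from an authenticated identity to the groups and roles used in policy evaluation, might be sketched as follows. The `DirectoryEntry` type, the in-memory `directory` mapping, and `resolve_principals` are hypothetical stand-ins for illustration only; how the plugin actually fetches this information from Privacera is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    """Hypothetical record of a user with synced groups and roles."""
    user: str
    groups: frozenset
    roles: frozenset


def resolve_principals(identity, directory):
    """Map an authenticated identity to its user, groups, and roles.

    `directory` is a plain dict standing in for whatever Privacera
    returns at runtime; returns None when the identity is unknown.
    """
    entry = directory.get(identity)
    if entry is None:
        return None
    return {
        "user": entry.user,
        "groups": set(entry.groups),
        "roles": set(entry.roles),
    }
```

A caller would pass the resolved groups and roles along with the user name when evaluating group-based and role-based policies.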
