Access Management for EMR cluster¶
Introduction¶
Privacera offers a robust access control solution for Amazon EMR clusters, empowering users to define and enforce Fine-Grained Access Control (FGAC) policies across Spark, Hive, and Trino, as well as Object-Level Access Control (OLAC) specifically for Spark.
Connector Details¶
Topics | Details |
---|---|
Integration methodology | Apache Ranger Plugin |
Access Tools |
|
Supported User Identities for Policies |
|
Data Source User Identities |
|
Supported Access Management Features¶
Feature | Spark OLAC | Hive FGAC | Trino FGAC |
---|---|---|---|
Object Level Access Control | Yes | No | No |
Database Level Access Control | No | Yes | Yes |
Table Access Control | No | Yes | Yes |
View Access Control | No | Yes | Yes |
Column Access Control | No | Yes | Yes |
Row Access Control | No | Yes | Yes |
Dynamic Column Data Masking | No | Yes | Yes |
Dynamic Column Data Encryption | No | Yes | Yes |
Centralized Access Audit | No | Yes | Yes |
Granular Access Audit Record | No | Yes | Yes |
Limitations for Access Management Features¶
- To enforce access control policies in Privacera, Kerberos is required.
- JWT is supported only for the Spark plugin.
How it Works¶
Privacera integrates with EMR clusters through the Apache Ranger plugin. The plugin is deployed via bootstrap actions when an EMR cluster is created, and runs as part of the Spark, Hive, or Trino process. The Apache Ranger plugin retrieves policies from the Privacera Policy Server and enforces them by intercepting and evaluating user queries in real time. Any attribute-based access control (ABAC) and tag-based policies configured in Privacera are also enforced by the Apache Ranger plugin at runtime.
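The retrieve-then-enforce flow described above can be pictured with a small conceptual model. This is only a sketch: `Policy`, `PluginSketch`, and `fetch_policies` are hypothetical names invented for illustration and are not part of the Apache Ranger or Privacera APIs. It also assumes that a request is denied when no policy matches, which is a simplification of how a real plugin reaches its decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical, simplified policy: who may do what on one resource."""
    resource: str          # e.g. "sales.orders"
    users: frozenset       # user names the policy applies to
    groups: frozenset      # group names the policy applies to
    accesses: frozenset    # permitted access types, e.g. {"select"}


class PluginSketch:
    """Toy stand-in for a plugin that caches policies and checks requests."""

    def __init__(self, fetch_policies):
        # fetch_policies is an assumed callable returning an iterable of
        # Policy objects; in a real deployment the plugin would download
        # policies from the policy server instead.
        self._fetch = fetch_policies
        self._cache = []

    def refresh(self):
        """Replace the local policy cache with the latest policies."""
        self._cache = list(self._fetch())

    def is_allowed(self, user, groups, resource, access):
        """Return True if any cached policy grants the requested access."""
        for policy in self._cache:
            if policy.resource != resource or access not in policy.accesses:
                continue
            if user in policy.users or policy.groups & set(groups):
                return True
        return False  # assumption: deny when no policy matches


plugin = PluginSketch(lambda: [
    Policy("sales.orders", frozenset({"alice"}),
           frozenset({"analysts"}), frozenset({"select"})),
])
plugin.refresh()
```

With this cache loaded, `plugin.is_allowed("bob", ["analysts"], "sales.orders", "select")` is granted through the group, while the same request without the group membership is refused.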
User Identity Mapping¶
Policies in Privacera are configured for users and groups based on Kerberos or JWT, as well as for roles created within Privacera. These identities are mapped to the EMR user identities as follows:
Privacera Identity | EMR Identity |
---|---|
LDAP/AD/SCIM User | Kerberos User / JWT |
LDAP/AD/SCIM Group | N/A |
Privacera Role | N/A |
The Apache Ranger plugin, which runs as part of the EMR Spark, Hive, or Trino process, maps the user's Kerberos or JWT identity to the corresponding LDAP/AD/SCIM user. The groups and roles associated with that user are fetched dynamically from Privacera and used to enforce group-based and role-based policies on the EMR cluster.
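As a rough illustration of the mapping in the table above, the sketch below reduces a Kerberos principal to a short user name and then looks up that user's groups and roles. All names here (`short_name`, `resolve_identity`, and the three lookup dictionaries) are hypothetical, and the rule of stripping the instance and realm from the principal is an assumption; an actual cluster may apply different name-mapping rules, and the group and role data would come from Privacera rather than from in-memory dictionaries.

```python
def short_name(principal: str) -> str:
    """Reduce 'alice/host@EXAMPLE.COM' or 'alice@EXAMPLE.COM' to 'alice'.

    Assumption: the user name is the primary component of the principal.
    """
    primary = principal.split("@", 1)[0]   # drop the realm, if any
    return primary.split("/", 1)[0]        # drop the instance, if any


def resolve_identity(principal, user_groups, user_roles, group_roles):
    """Return the user name plus the groups and roles used for policy checks.

    user_groups, user_roles, and group_roles are hypothetical lookup tables
    standing in for the data that would be fetched from Privacera.
    """
    user = short_name(principal)
    groups = set(user_groups.get(user, ()))
    roles = set(user_roles.get(user, ()))
    for group in groups:
        # Roles granted to a group apply to every member of that group.
        roles |= set(group_roles.get(group, ()))
    return {"user": user, "groups": groups, "roles": roles}


identity = resolve_identity(
    "alice/emr-node-1@EXAMPLE.COM",
    user_groups={"alice": ["analysts"]},
    user_roles={"alice": ["pii_reader"]},
    group_roles={"analysts": ["sales_reader"]},
)
```

Here `identity` carries the user `alice`, the group `analysts`, and the roles `pii_reader` and `sales_reader`, which is the information a group-based or role-based policy check would need.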