Access Management for EMR Serverless

Introduction

Privacera offers a robust access control solution for Amazon EMR Serverless, empowering users to define and enforce Object-Level Access Control (OLAC) policies for Spark.

This section explains how to extend the EMR Serverless Docker image to include Privacera’s plugin and configurations. For instructions on running Apache Spark jobs or Jupyter Notebook with Privacera's access control, refer to Privacera's User Guide for AWS EMR Serverless.

Connector Details

Topics                                   Details
Integration methodology                  Privacera DataServer
Access Tools                             Spark OLAC
                                           • Spark Jobs
                                           • Jupyter Notebook
Supported User Identities for Policies     • LDAP/AD/SCIM Users
                                           • LDAP/AD/SCIM Groups
                                           • Privacera Roles
Data Source User Identities

Supported Access Management Features

Feature                             Spark OLAC
🟢 Object Level Access Control       Yes
🔴 Database Level Access Control     No
🔴 Table Access Control              No
🔴 View Access Control               No
🔴 Column Access Control             No
🔴 Row Access Control                No
🔴 Dynamic Column Data Masking       No
🔴 Dynamic Column Data Encryption    No
🟢 Centralized Access Audit          Yes
🟢 Granular Access Audit Record      Yes

⚠ Limitations for Access Management Features

  1. Only JWT is supported for user identity mapping.
  2. Only Amazon S3 is supported as a data source.
  3. AWS EMR Serverless is currently supported only on Privacera's Self-Managed deployments.

Supported Versions

Version                   Supported   End-of-support date
🟢 EMR Serverless 7.2.0    Yes

How it Works

  • Privacera integrates with EMR Serverless by extending the Spark Docker image from EMR Serverless to include Privacera’s plugin and configurations.
  • The Dockerfile installs the required packages along with Privacera-specific files, including the plugin and setup script.
  • The final Docker image is a customized build that incorporates Privacera’s setup, plugins, and configurations.
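
Once the customized image has been built and pushed to a registry, an EMR Serverless application has to be pointed at it. The following is a minimal sketch of that step using the AWS SDK for Python, not Privacera's documented procedure. The application name and image URI are hypothetical placeholders, and the release label assumes the supported EMR Serverless version 7.2.0 listed above.

```python
def build_application_request(name: str, image_uri: str,
                              release_label: str = "emr-7.2.0") -> dict:
    """Build the keyword arguments for the EMR Serverless CreateApplication call.

    `image_uri` is assumed to point at the customized image that already
    contains the Privacera plugin and configurations.
    """
    return {
        "name": name,
        "releaseLabel": release_label,
        "type": "SPARK",
        "imageConfiguration": {"imageUri": image_uri},
    }


def create_application(request: dict) -> str:
    """Send the request to AWS and return the new application ID.

    Requires boto3 and valid AWS credentials. boto3 is imported lazily so
    the request builder above can be used and inspected without the SDK.
    """
    import boto3

    client = boto3.client("emr-serverless")
    response = client.create_application(**request)
    return response["applicationId"]


# Hypothetical values, for illustration only.
request = build_application_request(
    name="privacera-spark-olac",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/privacera-emr-serverless:latest",
)
```

Keeping the request construction separate from the AWS call makes it possible to review the image URI and release label before anything is created in the account.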

User Identity Mapping

Policies in Privacera are configured for users and groups based on JWT, as well as for roles created within Privacera. These identities are mapped to the EMR Serverless user identities as follows:

Privacera Identity    EMR Identity
LDAP/AD/SCIM User     JWT
LDAP/AD/SCIM Group    N/A
Privacera Role        N/A
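
To illustrate what mapping a user from a JWT involves, the sketch below reads a user name from a token's payload using only the Python standard library. It is an illustration under stated assumptions, not Privacera's implementation: the claim name `sub` is an assumption (the claim Privacera actually reads is configured elsewhere), the helper names are hypothetical, and the signature is deliberately not verified here, which a real deployment must do.

```python
import base64
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_unsigned_demo_token(claims: dict) -> str:
    """Build an unsigned token for demonstration purposes only."""
    header = {"alg": "none", "typ": "JWT"}
    return ".".join([
        _b64url(json.dumps(header).encode("utf-8")),
        _b64url(json.dumps(claims).encode("utf-8")),
        "",  # empty signature segment
    ])


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT. Does NOT verify the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


def user_from_jwt(token: str, claim: str = "sub") -> str:
    """Return the user name stored under `claim` (assumed to be 'sub')."""
    claims = jwt_claims(token)
    if claim not in claims:
        raise KeyError(f"claim {claim!r} not present in token")
    return claims[claim]


demo_token = make_unsigned_demo_token({"sub": "alice"})
print(user_from_jwt(demo_token))
```

The user name extracted this way is what a policy defined for an LDAP/AD/SCIM user would be matched against. Per the table above, groups and Privacera roles have no corresponding EMR identity.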
