Apache Flink on Kubernetes - Access Control

Introduction

Apache Flink on Kubernetes clusters are designed for distributed stream processing use cases, where multiple tasks can be deployed to run concurrent real-time data processing jobs. These clusters support fine-grained resource management and job isolation, enabling scalable data pipelines. Privacera’s Object-Level Access Control (OLAC) in Flink on Kubernetes includes the following features:

  • Granular Data Access for objects in AWS S3
  • Access Audit Records
  • Token-Based Authentication

Policies can be defined as object-level policies, tag-based policies, or attribute-based access control (ABAC) policies.

Connector Details

Topics                                   Details
Integration methodology                  Privacera DataServer
Access Tools                             • Job Submission via JAR
                                         • Job Submission via Docker Container
Supported User Identities for Policies   • LDAP/AD/SCIM Users
                                         • LDAP/AD/SCIM Groups
                                         • Privacera Roles
Data Source User Identities

Supported Access Management Features

Feature                                    Supported
🟢 S3 Files Access control (s3a/s3n/s3)    Yes
🟢 Centralized Access Audit                Yes
🟢 Granular Access Audit Record            Yes
🔴 View Access Control                     N/A
🔴 Column Access Control                   N/A
🔴 Row Access Control                      N/A
🔴 Dynamic Column Data Masking             N/A
🔴 Dynamic Column Data Encryption          N/A

Privacera supports the following Apache Flink versions:

Apache Flink Version Supported
🟢 1.18.1 Yes

⚠ Limitations for Access Management Features

  1. Only JWT is supported for user identity mapping.
  2. Only Apache Flink deployed on Kubernetes is supported.
  3. Only Self-Managed Privacera deployments are supported.

How it Works

Privacera enforces S3 policies with Apache Flink Kubernetes clusters through the Privacera DataServer integration. During the creation of the Docker image for Apache Flink, the Privacera plugin and dependent libraries are added to the Docker image.

The plugin is configured to intercept S3 requests at the AWS SDK level. For each request, it makes an external call to the Privacera DataServer, which validates the request against the configured policies and returns a temporary AWS STS token or a signed S3 URL for the objects. The AWS S3 SDK then uses that token or URL to access the S3 data. For additional details, refer to the Privacera DataServer section.
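The request flow described above can be modeled with a small in-memory sketch. This is an illustration only and not the Privacera plugin or DataServer API: the names `Policy`, `FakeDataServer`, `authorize`, and `intercepted_s3_read` are hypothetical, the policy check is reduced to a simple path-prefix match, and the returned credential is a placeholder string standing in for an STS token or signed URL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical object-level policy: a user may perform actions under a path prefix."""
    user: str
    path_prefix: str
    actions: frozenset


@dataclass(frozen=True)
class Decision:
    allowed: bool
    credential: Optional[str]  # placeholder for an STS token or signed S3 URL


class FakeDataServer:
    """In-memory stand-in for the policy decision point (an assumption, not the real service)."""

    def __init__(self, policies):
        self._policies = list(policies)
        self.audit_log = []  # every decision is recorded, allowed or denied

    def authorize(self, user: str, action: str, path: str) -> Decision:
        allowed = any(
            p.user == user and path.startswith(p.path_prefix) and action in p.actions
            for p in self._policies
        )
        self.audit_log.append((user, action, path, allowed))
        credential = f"temporary-credential-for:{path}" if allowed else None
        return Decision(allowed, credential)


def intercepted_s3_read(server: FakeDataServer, user: str, path: str) -> str:
    """Model of the interception step: ask for a decision before any data is read."""
    decision = server.authorize(user, "read", path)
    if not decision.allowed:
        raise PermissionError(f"{user} may not read {path}")
    # A real client would now use the credential to fetch the object.
    return decision.credential


if __name__ == "__main__":
    server = FakeDataServer([Policy("alice", "s3://sales/raw/", frozenset({"read"}))])
    print(intercepted_s3_read(server, "alice", "s3://sales/raw/2024.csv"))
    print(server.audit_log)
```

The point of the sketch is the ordering: the authorization call happens before the read, the job only ever receives a short-lived credential for the objects it was allowed to access, and both allowed and denied requests leave an audit entry.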

User Identity Mapping

Policies in Privacera are configured for users and groups, as well as for roles created within Privacera. These identities are mapped to the Apache Flink user identities using JWT tokens.

Privacera Identity     Apache Flink Identity
LDAP/AD/SCIM User      JWT
LDAP/AD/SCIM Group     N/A
Privacera Role         N/A
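To make the user mapping concrete, the following sketch shows how a user name can be read from the payload of a JWT using only the Python standard library. It relies on the standard JWT layout (three base64url segments separated by dots). Which claim Privacera reads is not stated here, so the `sub` default is an assumption, and `jwt_claims` and `user_from_jwt` are hypothetical helper names. The sketch does not verify the signature, which any real deployment must do before trusting the claims.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # base64url segments in a JWT omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def user_from_jwt(token: str, claim: str = "sub") -> str:
    """Return the user identity from the given claim (the claim name is an assumption)."""
    return jwt_claims(token)[claim]


if __name__ == "__main__":
    def _segment(obj: dict) -> str:
        # Build an unsigned demo segment; real tokens come from an identity provider.
        return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()

    demo_token = ".".join([_segment({"alg": "none"}), _segment({"sub": "alice"}), "signature"])
    print(user_from_jwt(demo_token))
```

The user name extracted this way is what a policy written for an LDAP/AD/SCIM user would be matched against; group and role identities have no corresponding mapping in the table above.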