Access Management for Databricks all-purpose compute clusters with Object-Level Access Control (OLAC)

Introduction

Privacera integrates with Databricks all-purpose compute clusters that support Object-Level Access Control (OLAC), so that you can enforce data access policies, monitor data usage, and meet regulatory compliance requirements. This document outlines the key features, benefits, and configuration steps for integrating Databricks all-purpose compute clusters with Privacera.

Connector Details

| Topics | Details |
| --- | --- |
| Integration methodology | Dataserver Signature generation |
| Access Tools | Databricks Console, JDBC |
| Supported User Identities for Policies | LDAP/AD/SCIM Users; LDAP/AD/SCIM Groups; Privacera Roles |
| Data Source User Identities | SAML/SSO; Databricks Login using Email Address; Databricks Token; Databricks Service Principal; JWT token file in cluster |

Supported Access Management Features

| Feature | Supported | Native | Using SecureView |
| --- | --- | --- | --- |
| 🟢 S3 Files Access control (s3a/s3n/s3) | Yes | No | N/A |
| 🟢 Azure Data Lake Access control (abfs/abfss) | Yes | No | N/A |
| 🟢 Centralized Access Audit | Yes | N/A | N/A |
| 🟢 Granular Access Audit Record | Yes | N/A | N/A |
| 🔴 Database Access Control | No | No | N/A |
| 🔴 Table Access Control | No | No | N/A |
| 🔴 View Access Control | No | No | N/A |
| 🔴 Column Access Control | No | No | N/A |
| 🔴 Row Access Control | No | No | N/A |
| 🔴 Dynamic Column Data Masking | No | No | N/A |
| 🔴 Dynamic Column Data Encryption | No | No | N/A |
| 🔴 DBFS Files Access control | No | No | N/A |

Supported Databricks matrix

Here is the supported Databricks matrix for Privacera integration with Databricks all-purpose compute clusters:

Supported runtime versions

Databricks offers multiple runtime versions. Privacera supports the following:

| Runtime version | Supported | End-of-support date |
| --- | --- | --- |
| 🔴 7.3 LTS | No (Limited Support) | |
| 🟢 9.1 LTS | Yes | |
| 🟢 10.4 LTS | Yes | |
| 🟢 11.3 LTS | Yes | |
| 🟢 12.2 LTS | Yes | |
| 🟢 13.3 LTS | Yes | |
| 🟢 14.3 LTS | Yes | |
| 🟢 15.4 LTS | Yes | |

Notebook languages

Databricks supports multiple languages in notebooks. Privacera supports the following:

| Language | Supported |
| --- | --- |
| 🟢 Python (%python) | Yes |
| 🟢 SQL (%sql) | Yes |
| 🟢 Scala (%scala) | Yes |
| 🟢 R (%r) | Yes |
| 🟢 hadoop fs (%fs) | Yes |

Note

The R language is not supported on clusters with Shared Access Mode.

Supported Databricks cluster deployment matrix

Here are the supported cluster types for Privacera integration with Databricks all-purpose compute clusters:

Interactive cluster

For interactive clusters, Privacera supports the following cluster types:

| Cluster type | Supported |
| --- | --- |
| 🟢 Standard (Scala/Python/R/SQL) | Yes |
| 🟢 High Concurrency (Python/R/SQL) | Yes |
| 🟢 Single Node (Scala/Python/R/SQL) | Yes |

Job on new cluster

For jobs on new clusters, Privacera supports the following job types:

| Job type | Supported |
| --- | --- |
| 🟢 Notebook | Yes |
| 🟢 JAR (scala/java) | Yes |
| 🟢 spark-submit | Yes |
| 🔴 Python | No |
| 🔴 Python wheel | No |
| 🔴 Delta Live Tables pipeline | No |

Job on existing cluster

For jobs on existing clusters, Privacera supports the following job types:

| Job type | Supported |
| --- | --- |
| 🔴 Notebook | No |
| 🔴 JAR (scala/java) | No |
| 🔴 spark-submit | No |
| 🔴 Delta Live Tables pipeline | No |
| 🔴 Python wheel | No |
| 🔴 Python | No |
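The two job tables above can be combined into one lookup keyed by cluster mode and job type, which may be handy when validating job definitions in a deployment pipeline. The sketch below is illustrative only: `JOB_SUPPORT` and `is_job_supported` are hypothetical names, and the keys are simply lowercased labels from the tables.

```python
# Hypothetical lookup: encodes the "new cluster" and "existing cluster" job tables.
JOB_SUPPORT = {
    "new": {
        "notebook": True,
        "jar": True,
        "spark-submit": True,
        "python": False,
        "python wheel": False,
        "delta live tables pipeline": False,
    },
    "existing": {
        "notebook": False,
        "jar": False,
        "spark-submit": False,
        "python": False,
        "python wheel": False,
        "delta live tables pipeline": False,
    },
}


def is_job_supported(cluster_mode: str, job_type: str) -> bool:
    """Return True if the tables mark this job type as supported.

    cluster_mode is "new" or "existing"; unknown combinations return False.
    """
    jobs = JOB_SUPPORT.get(cluster_mode.strip().lower(), {})
    return jobs.get(job_type.strip().lower(), False)
```

For example, `is_job_supported("new", "Notebook")` is true, while the same job type on an existing cluster is not supported.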

How it Works

Privacera integrates with Databricks all-purpose compute clusters using the Privacera Spark plugin, which is deployed via init scripts during cluster creation. The plugin calls the Privacera Dataserver to obtain a signature for the request, and the Apache Ranger plugin in the Dataserver authorizes that request against the configured policies.

User Identity Mapping

Policies in Privacera are configured for users and groups from AD/LDAP or SCIM and for roles created in Privacera. These identities are mapped to the Databricks user identities as follows:

| Privacera Identity | Databricks Identity | Notes |
| --- | --- | --- |
| LDAP/AD/SCIM User | Email Address / Databricks Service Principals | |
| LDAP/AD/SCIM Group | N/A | |
| Privacera Role | N/A | |

The Apache Ranger plugin, which operates as part of the Dataserver, maps the user's email address to the AD/SCIM user. The groups and roles corresponding to the user are dynamically fetched from Privacera and utilized to enforce group and role-based policies in the Databricks clusters.

JWT token user identities are also supported:

| Privacera Identity | JWT token user-identity | Notes |
| --- | --- | --- |
| LDAP/AD/SCIM User | JWT payload user | |
| LDAP/AD/SCIM Group | JWT payload group/scope | The user-to-group mapping is extracted from the JWT token payload, eliminating the need for explicit mapping for access control. |
| Privacera Role | N/A | |
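To make the JWT mapping above concrete, the following sketch shows how a user and a group list could be read from a JWT payload. It is not Privacera's implementation: the function names are hypothetical, the claim names (`sub`, `groups`, `scope`) are assumptions that depend on how your identity provider issues tokens, and the payload is decoded without signature verification, which is suitable for inspection only.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def identity_from_payload(payload: dict, user_claim: str = "sub",
                          group_claim: str = "groups"):
    """Return (user, groups) from a decoded payload.

    The claim names are assumptions; pass the ones your tokens actually use.
    """
    user = payload.get(user_claim)
    groups = payload.get(group_claim, [])
    if isinstance(groups, str):
        # A "scope"-style claim is commonly a space-delimited string.
        groups = groups.split()
    return user, list(groups)
```

If your tokens carry group membership in a `scope` claim, call `identity_from_payload(payload, group_claim="scope")`; the space-delimited string is then split into a list.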

Any attribute-based access control (ABAC) and tag-based policies configured in Privacera are enforced by the Apache Ranger plugin at runtime.
