Access Management for Databricks all-purpose compute clusters with Object-Level Access Control (OLAC)

Introduction

Privacera integrates with Databricks all-purpose compute clusters that support Object-Level Access Control (OLAC), so that you can enforce data access policies, monitor data usage, and meet regulatory compliance requirements. This document outlines the key features, benefits, and configuration steps for integrating Databricks all-purpose compute clusters with Privacera.

Connector Details

Integration methodology: Dataserver Signature generation
Access Tools: Databricks Console, JDBC
Supported User Identities for Policies:
  • LDAP/AD/SCIM Users
  • LDAP/AD/SCIM Groups
  • Privacera Roles
Data Source User Identities:
  • SAML/SSO
  • Databricks Login using Email Address
  • Databricks Token
  • Databricks Service Principal
  • JWT token file in cluster

Supported Access Management Features

Feature | Supported | Native | Using SecureView
🟢 S3 Files Access Control (s3a/s3n/s3) | Yes | No | N/A
🟢 Azure Data Lake Access Control (abfs/abfss) | Yes | No | N/A
🟢 Centralized Access Audit | Yes | N/A | N/A
🟢 Granular Access Audit Record | Yes | N/A | N/A
🔴 Database Access Control | No | No | N/A
🔴 Table Access Control | No | No | N/A
🔴 View Access Control | No | No | N/A
🔴 Column Access Control | No | No | N/A
🔴 Row Access Control | No | No | N/A
🔴 Dynamic Column Data Masking | No | No | N/A
🔴 Dynamic Column Data Encryption | No | No | N/A
🔴 DBFS Files Access Control | No | No | N/A

Supported Databricks matrix

Here is the supported Databricks matrix for Privacera integration with Databricks all-purpose compute clusters:

Supported runtime versions

Databricks offers multiple runtime versions. Privacera supports the following:

Databricks Runtime Version | Privacera Release Version | Scala Version | End-of-support date
🟢 17.3 LTS | 9.2.12.1 | 2.13 | Oct 22, 2028
🟢 16.4 LTS | 9.0.29.1 | 2.12 | May 9, 2028
🟢 15.4 LTS | 9.0.1.1 | 2.12 | Aug 19, 2027
🟢 14.3 LTS | 9.0.1.1 | 2.12 | Feb 1, 2027
🟢 13.3 LTS | 9.0.1.1 | 2.12 | Aug 22, 2026

Databricks Runtime Version | Privacera Release Version | Scala Version | End-of-support date
🟢 17.3 LTS | 9.2.13.1 | 2.13 | Oct 22, 2028
🟢 16.4 LTS | 9.2.9.1 | 2.12 | May 9, 2028
🟢 15.4 LTS | 9.0.1.1 | 2.12 | Aug 19, 2027
🟢 14.3 LTS | 9.0.1.1 | 2.12 | Feb 1, 2027
🟢 13.3 LTS | 9.0.1.1 | 2.12 | Aug 22, 2026
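
The runtime matrix can be encoded as a small pre-flight check, for example to flag clusters whose runtime is past its end-of-support date. The following is a minimal sketch, not part of any Privacera or Databricks API: the names `SUPPORTED_RUNTIMES` and `runtime_status` are illustrative, and treating the end-of-support date as inclusive is an assumption. It uses only the Scala version and end-of-support columns, which are the same in both tables.

```python
from datetime import date

# Runtime -> (Scala version, end-of-support date), copied from the tables above.
SUPPORTED_RUNTIMES = {
    "17.3 LTS": ("2.13", date(2028, 10, 22)),
    "16.4 LTS": ("2.12", date(2028, 5, 9)),
    "15.4 LTS": ("2.12", date(2027, 8, 19)),
    "14.3 LTS": ("2.12", date(2027, 2, 1)),
    "13.3 LTS": ("2.12", date(2026, 8, 22)),
}


def runtime_status(runtime, on):
    """Return 'supported', 'expired', or 'unlisted' for a runtime on a given date.

    Assumption: the end-of-support date itself still counts as supported.
    """
    entry = SUPPORTED_RUNTIMES.get(runtime)
    if entry is None:
        return "unlisted"
    _scala_version, end_of_support = entry
    return "supported" if on <= end_of_support else "expired"


def scala_version(runtime):
    """Return the Scala version listed for a runtime, or None if unlisted."""
    entry = SUPPORTED_RUNTIMES.get(runtime)
    return entry[0] if entry else None
```

A call such as `runtime_status("13.3 LTS", date.today())` could run before cluster creation, and `scala_version` helps pick a matching JAR build.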

Notebook languages

Databricks supports multiple languages in notebooks. Privacera supports the following:

Language | Supported
🟢 Python (%python) | Yes
🟢 SQL (%sql) | Yes
🟢 Scala (%scala) | Yes
🟢 R (%r) | Yes
🟢 Hadoop fs (%fs) | Yes

Note

The R language is not supported on clusters with Shared Access Mode.

Supported Databricks cluster deployment matrix

Here are the supported cluster types for Privacera integration with Databricks all-purpose compute clusters:

Interactive cluster

For interactive clusters, Privacera supports the following cluster types:

Cluster type | Supported
🟢 Standard (Scala/Python/R/SQL) | Yes
🟢 High Concurrency (Python/R/SQL) | Yes
🟢 Single Node (Scala/Python/R/SQL) | Yes

Job on new cluster

For jobs on new clusters, Privacera supports the following job types:

Job type | Supported
🟢 Notebook | Yes
🟢 JAR (Scala/Java) | Yes
🟢 spark-submit | Yes
🔴 Python | No
🔴 Python wheel | No
🔴 Delta Live Tables pipeline | No

Job on existing cluster

For jobs on existing clusters, Privacera supports the following job types:

Job type | Supported
🔴 Notebook | No
🔴 JAR (Scala/Java) | No
🔴 spark-submit | No
🔴 Delta Live Tables pipeline | No
🔴 Python wheel | No
🔴 Python | No
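
The two job tables lend themselves to a simple lookup that a deployment script could consult before submitting a job. This is only an illustrative sketch: the keys `new_cluster` and `existing_cluster`, the lowercase job-type labels, and the function `is_job_supported` are made-up names, and unknown job types are treated as unsupported by assumption.

```python
# Job support copied from the "Job on new cluster" and
# "Job on existing cluster" tables above. Key names are hypothetical.
JOB_SUPPORT = {
    "new_cluster": {
        "notebook": True,
        "jar": True,
        "spark-submit": True,
        "python": False,
        "python wheel": False,
        "delta live tables pipeline": False,
    },
    "existing_cluster": {
        "notebook": False,
        "jar": False,
        "spark-submit": False,
        "python": False,
        "python wheel": False,
        "delta live tables pipeline": False,
    },
}


def is_job_supported(cluster_mode, job_type):
    """Return True only if the job type is listed as supported for the mode.

    Unknown cluster modes or job types are reported as unsupported.
    """
    return JOB_SUPPORT.get(cluster_mode, {}).get(job_type.lower(), False)
```

For example, `is_job_supported("new_cluster", "Notebook")` is true, while the same job type on `existing_cluster` is not.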

How it Works

Privacera integrates with Databricks all-purpose compute clusters using the Privacera Spark plugin, which is deployed via init scripts during cluster creation. The plugin calls the Privacera Dataserver to obtain a signature, and the request is authorized by the Apache Ranger plugin that runs as part of the Dataserver.
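
To make the init-script step concrete, the sketch below builds a cluster specification in the shape accepted by the Databricks Clusters API, with an init script attached through the `init_scripts` field. It only constructs and prints the JSON payload and does not call any API. The cluster name, node type, and script path `/Shared/privacera/olac_init.sh` are placeholders chosen for illustration; the actual init script and its location come from your Privacera deployment.

```python
import json


def build_cluster_spec(cluster_name, init_script_path):
    """Build a cluster-creation payload that attaches one workspace init script.

    The runtime key corresponds to Databricks Runtime 15.4 LTS (Scala 2.12);
    node type and worker count are arbitrary example values.
    """
    return {
        "cluster_name": cluster_name,
        "spark_version": "15.4.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 1,
        # Init scripts run on each node at cluster start, which is where
        # the Privacera Spark plugin would be installed.
        "init_scripts": [
            {"workspace": {"destination": init_script_path}},
        ],
    }


# Hypothetical name and path, for illustration only.
spec = build_cluster_spec("privacera-olac-demo", "/Shared/privacera/olac_init.sh")
print(json.dumps(spec, indent=2))
```

Because the script is attached at creation time, the plugin is in place before any notebook or job runs on the cluster.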

User Identity Mapping

Policies in Privacera are configured for users and groups from AD/LDAP or SCIM, and for roles created in Privacera. These identities are mapped to Databricks user identities as follows:

Privacera Identity | Databricks Identity | Notes
LDAP/AD/SCIM User | Email Address / Databricks Service Principals |
LDAP/AD/SCIM Group | N/A |
Privacera Role | N/A |

The Apache Ranger plugin, which operates as part of the Dataserver, maps the user's email address to the AD/SCIM user. The groups and roles corresponding to the user are dynamically fetched from Privacera and utilized to enforce group and role-based policies in the Databricks clusters.

JWT token user identities are also supported and are mapped as follows:

Privacera Identity | JWT token user identity | Notes
LDAP/AD/SCIM User | JWT payload user |
LDAP/AD/SCIM Group | JWT payload group/scope | The user-group mapping is extracted from the JWT token payload, eliminating the need for explicit mapping for access control.
Privacera Role | N/A |
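
The JWT mapping above can be illustrated with a short sketch that reads a user and a group list from a token payload. This is not Privacera's implementation: the claim names (`sub`, `groups`, `scope`) are assumptions, since the claims actually used depend on how the token issuer and Privacera are configured. The sketch also skips signature verification, which any real deployment must perform before trusting the payload.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload_b64 = parts[1]
    # JWT segments omit base64 padding, so restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def extract_identity(payload, user_claim="sub", group_claim="groups"):
    """Return (user, groups) from a payload; the claim names are assumptions."""
    user = payload.get(user_claim)
    groups = payload.get(group_claim, [])
    if isinstance(groups, str):
        # Scope-style claims are space-separated strings.
        groups = groups.split()
    return user, list(groups)


def _b64(obj):
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Build an unsigned sample token purely for demonstration.
sample_token = ".".join([
    _b64({"alg": "none"}),
    _b64({"sub": "alice@example.com", "scope": "analysts finance"}),
    "signature-placeholder",
])
user, groups = extract_identity(decode_jwt_payload(sample_token), group_claim="scope")
print(user, groups)
```

Because the groups travel inside the token, no separate user-to-group lookup is needed for this identity type, which matches the note in the table.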

Any attribute-based access control (ABAC) and tag-based policies configured in Privacera are enforced by the Apache Ranger plugin at runtime.