Access Management for Databricks all-purpose compute clusters with Object-Level Access Control (OLAC)
Introduction
Privacera integrates with Databricks all-purpose compute clusters that support Object-Level Access Control (OLAC), so that you can enforce data access policies, monitor data usage, and meet regulatory compliance requirements. This document outlines the key features, benefits, and configuration steps for integrating Databricks all-purpose compute clusters with Privacera.
Connector Details
Topics | Details |
---|---|
Integration methodology | Dataserver Signature generation |
Access Tools | Databricks Console, JDBC |
Supported User Identities for Policies |
|
Data Source User Identities |
|
Supported Access Management Features
Feature | Supported | Native | Using SecureView |
---|---|---|---|
S3 Files Access control (s3a/s3n/s3) | Yes | No | N/A |
Azure Data Lake Access control (abfs/abfss) | Yes | No | N/A |
Centralized Access Audit | Yes | N/A | N/A |
Granular Access Audit Record | Yes | N/A | N/A |
Database Access Control | No | No | N/A |
Table Access Control | No | No | N/A |
View Access Control | No | No | N/A |
Column Access Control | No | No | N/A |
Row Access Control | No | No | N/A |
Dynamic Column Data Masking | No | No | N/A |
Dynamic Column Data Encryption | No | No | N/A |
DBFS Files Access control | No | No | N/A |
Supported Databricks matrix
Here is the supported Databricks matrix for Privacera integration with Databricks all-purpose compute clusters:
Supported runtime versions
Databricks offers multiple runtime versions. Privacera supports the following runtime versions:
Runtime version | Supported | End-of-support date |
---|---|---|
7.3 LTS | No (Limited Support) | |
9.1 LTS | Yes | |
10.4 LTS | Yes | |
11.3 LTS | Yes | |
12.2 LTS | Yes | |
13.3 LTS | Yes | |
14.3 LTS | Yes | |
15.4 LTS | Yes | |
Notebook languages
Databricks supports multiple languages in notebooks. Privacera supports the following languages:
Language | Supported |
---|---|
Python (%python) | Yes |
SQL (%sql) | Yes |
Scala (%scala) | Yes |
R (%r) | Yes |
hadoop fs (%fs) | Yes |
Note
The R language is not supported on clusters with Shared Access Mode.
Supported Databricks cluster deployment matrix
Here are the supported cluster types for Privacera integration with Databricks all-purpose compute clusters:
Interactive cluster
For interactive clusters, Privacera supports the following cluster types:
Cluster type | Supported |
---|---|
Standard (Scala/Python/R/SQL) | Yes |
High Concurrency (Python/R/SQL) | Yes |
Single Node (Scala/Python/R/SQL) | Yes |
Job on new cluster
For jobs on new clusters, Privacera supports the following job types:
Job type | Supported |
---|---|
Notebook | Yes |
JAR (scala/java) | Yes |
spark-submit | Yes |
Python | No |
Python wheel | No |
Delta Live Tables pipeline | No |
Job on existing cluster
For jobs on existing clusters, Privacera supports the following job types:
Job type | Supported |
---|---|
Notebook | No |
JAR (scala/java) | No |
spark-submit | No |
Delta Live Tables pipeline | No |
Python wheel | No |
Python | No |
How it Works
Privacera integrates with Databricks all-purpose compute clusters using the Privacera Spark plugin, which is deployed via init scripts during cluster creation. The plugin calls the Privacera Dataserver to obtain a signature, and the request is authorized by the Apache Ranger plugin.
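The authorize-and-sign idea can be illustrated with a small conceptual sketch. It is not Privacera's actual protocol or API: the policy table, the signing key, the HMAC scheme, the choice to check the policy before signing, and the names `is_allowed` and `request_signature` are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical policy table: (user, path prefix) -> allowed actions.
POLICIES = {
    ("alice@example.com", "s3a://sales-bucket/reports/"): {"read"},
}

# Placeholder key for the demo; in a real deployment the key never leaves the server.
SIGNING_KEY = b"demo-only-secret"


def is_allowed(user: str, path: str, action: str) -> bool:
    """Stand-in for the policy check performed on the server side."""
    for (policy_user, prefix), actions in POLICIES.items():
        if policy_user == user and path.startswith(prefix) and action in actions:
            return True
    return False


def request_signature(user: str, path: str, action: str) -> str:
    """Return a hex signature for the request only if the policy check passes."""
    if not is_allowed(user, path, action):
        raise PermissionError(f"{user} may not {action} {path}")
    message = f"{action}\n{path}".encode("utf-8")
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
```

In this sketch a denied request never receives a signature, so the caller cannot reach the object store on the user's behalf.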
User Identity Mapping
Policies in Privacera are configured for users and groups from AD/LDAP or SCIM and for roles created in Privacera. These identities are mapped to Databricks user identities as follows:
Privacera Identity | Databricks Identity | Notes |
---|---|---|
LDAP/AD/SCIM User | Email address / Databricks service principal | |
LDAP/AD/SCIM Group | N/A | |
Privacera Role | N/A | |
The Apache Ranger plugin, which operates as part of the Dataserver, maps the user's email address to the AD/SCIM user. The groups and roles corresponding to the user are dynamically fetched from Privacera and utilized to enforce group and role-based policies in the Databricks clusters.
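The lookup described above can be sketched as follows. This is a minimal illustration with in-memory dictionaries, not the Ranger plugin's implementation: the directory contents, the `Identity` class, `resolve_identity`, and the assumption that roles are assigned through groups are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    user: str
    groups: set = field(default_factory=set)
    roles: set = field(default_factory=set)


# Hypothetical directory data standing in for AD/SCIM and Privacera roles.
EMAIL_TO_USER = {"alice@example.com": "alice"}
USER_GROUPS = {"alice": {"analysts"}}
GROUP_ROLES = {"analysts": {"sales_reader"}}


def resolve_identity(email: str) -> Identity:
    """Map an email address to a user, then collect that user's groups and roles."""
    user = EMAIL_TO_USER.get(email.lower())
    if user is None:
        raise KeyError(f"no directory user for {email}")
    groups = set(USER_GROUPS.get(user, set()))
    roles = set()
    for group in groups:
        # Assumption: roles are attached to groups in this toy model.
        roles |= GROUP_ROLES.get(group, set())
    return Identity(user=user, groups=groups, roles=roles)
```

A policy engine would then evaluate user, group, and role policies against the resolved `Identity` rather than the raw email address.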
JWT token user identities are also supported:
Privacera Identity | JWT token user-identity | Notes |
---|---|---|
LDAP/AD/SCIM User | JWT payload user | |
LDAP/AD/SCIM Group | JWT payload group/scope | The user group mapping will be extracted from the JWT token payload, eliminating the need for explicit mapping for access control. |
Privacera Role | N/A | |
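To illustrate how a user and groups could be read from a JWT payload, here is a minimal sketch using only the Python standard library. The claim names (`sub`, `groups`, `scope`) and the helper names are assumptions; the actual claims depend on how the token issuer is configured. The sketch does not verify the token signature, which any real deployment must do before trusting the payload.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload_b64 = parts[1]
    # JWT segments use unpadded base64url; restore padding before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def identity_from_payload(payload: dict, user_claim: str = "sub", group_claim: str = "groups"):
    """Return (user, groups); claim names are hypothetical defaults."""
    user = payload[user_claim]
    raw_groups = payload.get(group_claim, [])
    if isinstance(raw_groups, str):
        # Scope-style claims are commonly a space-separated string.
        raw_groups = raw_groups.split()
    return user, list(raw_groups)
```

For a token whose groups are carried in a space-separated scope claim, call `identity_from_payload(payload, group_claim="scope")`.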
Any attribute-based access control (ABAC) and tag-based policies configured in Privacera are enforced by the Apache Ranger plugin at runtime.