About Object Level Access Control (OLAC)

Object Level Access Control (OLAC) is a security design pattern that manages access to entire data files or objects stored in cloud storage systems such as Amazon S3, Azure Data Lake Storage (ADLS), Google Cloud Storage (GCS), or MinIO. Unlike Fine-Grained Access Control (FGAC), which restricts access to specific rows or columns, OLAC authorizes users or services at the level of entire files or folders. It is used to secure object-store data for Apache Spark and other analytics and compute engines.

How OLAC Works

OLAC uses a centralized policy engine, Privacera DataServer, to enforce access to object stores. When a Spark job (or any other compute workload) requests access to a file, Privacera evaluates the user's permissions and then issues short-lived credentials through Privacera DataServer. These credentials are scoped to the specific objects the user is allowed to access.

For more details, see the Privacera DataServer documentation.
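The request flow described above (evaluate the policy, then issue a short-lived, scoped credential) can be sketched in a few lines of Python. This is an illustration only, not Privacera's API: `POLICIES`, `ScopedCredential`, `AccessDenied`, `vend_credential`, the user name, and the 15-minute lifetime are all hypothetical, and the token is an opaque stand-in for a real cloud credential.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
import secrets

# Hypothetical policy table: user -> object-store prefixes the user may read.
POLICIES = {
    "alice": {"s3://your_bucket/logs/region-west/"},
}


@dataclass(frozen=True)
class ScopedCredential:
    token: str            # opaque stand-in for a real cloud credential
    scope: str            # the only prefix this credential is valid for
    expires_at: datetime  # a short lifetime limits the impact of a leak


class AccessDenied(Exception):
    """Raised when no policy grants the user access to the requested prefix."""


def vend_credential(user: str, prefix: str, ttl_minutes: int = 15) -> ScopedCredential:
    """Evaluate the policy first, then issue a short-lived credential for one prefix."""
    if prefix not in POLICIES.get(user, set()):
        raise AccessDenied(f"{user} is not authorized for {prefix}")
    return ScopedCredential(
        token=secrets.token_urlsafe(32),
        scope=prefix,
        expires_at=datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes),
    )


if __name__ == "__main__":
    cred = vend_credential("alice", "s3://your_bucket/logs/region-west/")
    print(cred.scope, cred.expires_at.isoformat())
```

The point of the sketch is the ordering: no credential exists until the policy check has passed, and the credential that is returned carries both its scope and its expiry.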

Difference from FGAC

OLAC is primarily used in environments where the client application (such as Spark) has no reliable way to secure untrusted code. FGAC, by contrast, is used where the client application can be trusted to enforce security policies. For example, trusted environments such as Snowflake or Databricks Unity Catalog do not allow untrusted code to run, so users cannot exploit the system to access data they are not authorized to access. In such cases, FGAC is a better fit. OLAC is applicable in the following scenarios:

  1. Accessing AWS S3 or object store when FGAC is enabled in Databricks Cluster
  2. Creating external tables in Databricks Cluster with FGAC
  3. Creating external tables in Trino or Apache Hive with FGAC
  4. Creating external tables in Apache Hive

Key Components

  • Privacera Policy Manager: Defines policies mapping users to folders/files.
  • Privacera DataServer: Evaluates access in real-time and vends secure, scoped credentials.
  • No Cluster IAM Role Needed: Spark clusters don’t need privileged access to cloud storage.
  • Audit Logging: All access requests and decisions are logged for compliance.

Benefits of OLAC

  1. Security by Least Privilege: Users only get access to objects they are authorized for.
  2. Credential Isolation: Temporary credentials avoid risk from long-lived IAM roles on clusters.
  3. Centralized Management: Policies are maintained in Privacera and enforced consistently.
  4. High Performance: Privacera DataServer is only involved in providing credentials, not in data transfer. Compute engines can directly access the data once they have the credentials.

OLAC in Apache Spark

OLAC is commonly used in Apache Spark environments where data is read from and written to object stores.

Design Highlights

  • Spark workers don’t use IAM roles directly.
  • Jobs request access via the Privacera plugin.
  • DataServer provides pre-signed URLs or credentials to fetch only authorized objects.
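On AWS, one generic way to scope temporary credentials to specific objects is an IAM session policy, which can be passed in the `Policy` parameter of STS `AssumeRole`. The sketch below only builds such a policy document. It is an assumption for illustration: the source does not state how DataServer scopes the credentials it vends, and `build_session_policy` is a hypothetical helper.

```python
import json


def build_session_policy(bucket: str, prefixes: list[str]) -> str:
    """Return an IAM policy JSON document that allows reads under the given prefixes only."""
    folders = [p.strip("/") for p in prefixes]
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow downloading objects, but only under the authorized folders.
                "Sid": "ReadAuthorizedObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{f}/*" for f in folders],
            },
            {
                # Allow listing the bucket, restricted to the same folders.
                "Sid": "ListAuthorizedPrefixes",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{f}/*" for f in folders]}},
            },
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(build_session_policy("your_bucket", ["logs/region-west/"]))
```

Credentials issued with a session policy like this cannot read outside the listed folders, even if the underlying role has broader permissions.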

Supported Deployments

  • Apache Spark on Kubernetes
  • Apache Spark on Amazon EMR / EMR Serverless
  • Apache Spark in Databricks (non-Unity Catalog)
  • Apache Spark in Databricks SQL Warehouse

Use Cases

1. Data Engineering Pipeline

  • A Spark job consolidates logs from S3.
  • OLAC ensures that only data under specific prefixes (e.g., s3://your_bucket/logs/region-west/) is accessible to the data engineer.
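A prefix rule of this kind comes down to checking whether an object's path falls under an authorized folder. The following is a minimal sketch of that check, assuming simple folder-style prefixes; `is_under_prefix` is a hypothetical helper, not part of any Privacera component. It compares on a trailing slash so that a sibling folder with a similar name is not matched by accident.

```python
from urllib.parse import urlparse


def is_under_prefix(object_url: str, allowed_prefix: str) -> bool:
    """Return True if object_url is in the same bucket and inside the allowed folder."""
    obj, allowed = urlparse(object_url), urlparse(allowed_prefix)
    if (obj.scheme, obj.netloc) != (allowed.scheme, allowed.netloc):
        return False
    # Normalize to exactly one trailing slash so "region-west" does not
    # match "region-western".
    folder = allowed.path.rstrip("/") + "/"
    return obj.path.startswith(folder)


ALLOWED = "s3://your_bucket/logs/region-west/"
print(is_under_prefix("s3://your_bucket/logs/region-west/2024/app.log", ALLOWED))  # True
print(is_under_prefix("s3://your_bucket/logs/region-western/app.log", ALLOWED))    # False
print(is_under_prefix("s3://other_bucket/logs/region-west/app.log", ALLOWED))      # False
```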

2. Machine Learning Training

  • An ML model requires access to anonymized data only.
  • OLAC policies restrict access to pre-approved datasets, ensuring secure training pipelines.

3. Secure BI Dashboarding

  • BI tools that use Spark to generate dashboards must read only the files that analysts are authorized to access.
  • OLAC ensures that object-level access control is applied even when queries are executed in Spark SQL.

Limitations of OLAC

  1. No Row/Column Control: Cannot filter or mask data within files.
  2. Not Ideal for Shared Files: OLAC does not work well if sensitive and non-sensitive data coexist in a single file.
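Both limitations follow from the decision being made per object. The small sketch below illustrates this with a made-up CSV file that mixes a sensitive column with non-sensitive ones; `FILE_CONTENT` and `read_object` are hypothetical and stand in for an object store read.

```python
import csv
import io

# Hypothetical file mixing a sensitive column (ssn) with non-sensitive ones.
FILE_CONTENT = "user_id,region,ssn\n1,west,111-22-3333\n2,east,444-55-6666\n"


def read_object(authorized: bool) -> list[dict]:
    """Object-level decision: the caller gets every row and column, or nothing."""
    if not authorized:
        raise PermissionError("access to the object is denied")
    return list(csv.DictReader(io.StringIO(FILE_CONTENT)))


rows = read_object(authorized=True)
# An authorized reader receives the sensitive column along with everything else.
print(sorted(rows[0].keys()))
```

Hiding the `ssn` column or the rows for one region would require either splitting the data into separate files or applying FGAC on top.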

Why Use OLAC with Privacera

  • Plug-and-Play: Integrates with existing object stores without changing file formats.
  • Real-Time Decisions: Access control decisions are made in real time.
  • Complementary to FGAC: Use OLAC for access to files and FGAC for filtering within files.
  • Audit Ready: Tracks who accessed what data, when, and how.
