Configure API-Driven On-Demand Sync for Databricks Unity Catalog Connector

Overview

API-driven On-Demand Sync enables real-time, targeted policy synchronization in Databricks Unity Catalog by allowing you to invoke resource synchronization through REST API calls. Unlike scheduled sync operations that process all resources periodically, API-driven On-Demand Sync provides programmatic control to trigger immediate synchronization for specific resources when changes occur.

This path uses the same On-Demand Sync V2 task pipeline as event-driven (Azure Event Hub) On-Demand Sync; you can enable the API server alone, Event Hub alone, or both, depending on how you want to trigger sync tasks.

D2P Mode Only

API-driven On-Demand Sync for Databricks Unity Catalog is currently supported only in Data Plane (D2P) deployment mode.

Current Implementation Limitations

  • The connector cannot sync functions, volumes, or models as individual resources; they are loaded only as part of a catalog or schema sync.
  • The connector cannot load service credentials.

Key Benefits

| Benefit | Description |
| --- | --- |
| Real-Time Updates | Policy changes are applied soon after the API is invoked |
| Targeted Sync | Synchronizes only the affected resources, not the entire catalog |
| Reduced Load | Minimizes connector processing overhead by syncing only what is needed |
| Programmatic Control | Direct API integration for applications and automation |
| Immediate Response | Synchronous feedback on task creation and status |

How It Works

API-driven On-Demand Sync follows this workflow:

  1. API Invocation: Your application sends a REST request to create a sync task.
  2. Request Processing: The connector API server validates the request (including JWT authentication).
  3. Task Creation: A sync task is created and processed by the On-Demand Sync V2 pipeline.
  4. Resource Sync: The connector loads the resources specified in the request from Unity Catalog and applies policy updates.
  5. Response: The API returns task identifiers and status; you can poll task status with the GET endpoint.

Configuration Properties

Enable both the API server and On-Demand Sync V2 in Privacera Manager. Both flags are required for the API server settings to be written to the connector configuration.

Required Properties

| Property | Description | Example |
| --- | --- | --- |
| CONNECTOR_DATABRICKS_UNITY_CATALOG_API_SERVER_ENABLED | Enable the connector API server for on-demand sync | true |
| CONNECTOR_DATABRICKS_UNITY_CATALOG_ON_DEMAND_V2_ENABLED | Enable On-Demand Sync V2 (required together with the API server flag) | true |
| CONNECTOR_DATABRICKS_UNITY_CATALOG_EXTERNAL_ACCESS_ENABLED | Enable external access to the API server (for example, Kubernetes ingress) | true |

Optional Properties

| Property | Default | Description |
| --- | --- | --- |
| CONNECTOR_DATABRICKS_UNITY_CATALOG_K8S_NGINX_INGRESS_ENABLE | false | Enable Kubernetes NGINX Ingress for API server access |

Event Hub triggers

If you also use Azure Event Hub for on-demand events, configure the CONNECTOR_ON_DEMAND_V2_* properties described in Configure Event-Driven On-Demand Sync. If you rely only on the REST API, set CONNECTOR_ON_DEMAND_V2_AZURE_EVENT_HUB_ENABLED to false and omit Event Hub connection settings.

Setup

Step 1: Edit Connector Configuration

SSH to the instance where Privacera is installed and edit your connector configuration file:

Bash
cd ~/privacera/privacera-manager/config
vi custom-vars/connectors/databricks-unity-catalog/instance1/vars.connector.databricks.unity.catalog.yml

Step 2: Add API-Driven On-Demand Sync Configuration

Add the following configuration to your connector YAML file:

YAML
# Enable API Server for On-Demand Sync
CONNECTOR_DATABRICKS_UNITY_CATALOG_API_SERVER_ENABLED: "true"

# Enable On-Demand Sync V2 (required with API server)
CONNECTOR_DATABRICKS_UNITY_CATALOG_ON_DEMAND_V2_ENABLED: "true"

# Enable External Access
CONNECTOR_DATABRICKS_UNITY_CATALOG_EXTERNAL_ACCESS_ENABLED: "true"

# Optional: Kubernetes NGINX Ingress (default: false)
CONNECTOR_DATABRICKS_UNITY_CATALOG_K8S_NGINX_INGRESS_ENABLE: "false"
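
Before deploying, it can help to confirm that the three required flags are actually set to true. The following is a minimal sketch, not a Privacera Manager tool: check_required_flags is a hypothetical helper, and it assumes the flat KEY: "value" layout shown in the YAML above.

```shell
# Sketch: verify that the required flags are enabled in a connector vars file.
# check_required_flags is a hypothetical helper, not part of Privacera Manager.
check_required_flags() {
  local vars_file="$1"
  local missing=0
  local prop
  for prop in \
    CONNECTOR_DATABRICKS_UNITY_CATALOG_API_SERVER_ENABLED \
    CONNECTOR_DATABRICKS_UNITY_CATALOG_ON_DEMAND_V2_ENABLED \
    CONNECTOR_DATABRICKS_UNITY_CATALOG_EXTERNAL_ACCESS_ENABLED
  do
    # Match an uncommented "PROP: true" line, with or without quotes,
    # optionally followed by a trailing comment.
    if ! grep -Eq "^[[:space:]]*${prop}:[[:space:]]*[\"']?true[\"']?[[:space:]]*(#.*)?$" "$vars_file"; then
      echo "not enabled: ${prop}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run it against the file edited in Step 1, for example check_required_flags custom-vars/connectors/databricks-unity-catalog/instance1/vars.connector.databricks.unity.catalog.yml. It returns a non-zero status and names each flag that is missing, commented out, or not set to true.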

Step 3: Deploy Configuration

Once the properties are configured, run the following commands to update your Privacera Manager platform instance:

  1. Run setup, which generates the Helm charts. This step usually takes a few minutes.

    Bash
    cd ~/privacera/privacera-manager
    ./privacera-manager.sh setup

  2. Apply the Privacera Manager Helm charts.

    Bash
    cd ~/privacera/privacera-manager
    ./pm_with_helm.sh upgrade

  3. (Optional) Run post-install, which generates the plugin tarball, updates Route 53 DNS, and so on. This step is not required if you are updating only connector properties.

    Bash
    cd ~/privacera/privacera-manager
    ./privacera-manager.sh post-install

API Usage

Retrieve Host

The host is generated during post-install. After running ./privacera-manager.sh post-install, a service-urls.txt file is generated in the output directory. To retrieve the host:

  1. SSH to the instance where Privacera is installed and navigate to the Privacera Manager directory:

    Bash
    cd ~/privacera/privacera-manager
    
  2. Read the host URL from the service-urls.txt file:

    Bash
    cat output/service-urls.txt
    
  3. For the required connector instance, copy the external URL from the output to use in the API endpoint URL.

Endpoint

The REST API endpoint for triggering on-demand sync:

Text Only
POST https://<host>/policysync/api/v1/task

Replace <host> with your connector API server hostname (from the Retrieve Host section).

Authentication

The API requires JWT authentication. After deployment, Privacera Manager generates a token under the connector API server keys path. Replace <instance-name> with your connector instance name (for example, instance1):

Bash
cat config/api-server-keys/connector/databricks-unity-catalog/<instance-name>/jwt/connector-databricks-unity-catalog-<instance-name>-jwt-token.txt

Use the token value in the Authorization header when calling the API.

Request Headers

| Header | Required | Description |
| --- | --- | --- |
| Content-Type | Yes | application/json |
| Authorization | Yes | JWT bearer token from the generated token file |

Request Payload Structure

Send a JSON body with task type RESOURCE_SYNC and a requestInfo.resources array. Resource identifiers use catalog, schema, and table (Unity Catalog naming), not database/schema as in Snowflake. Views also pass their name in the table key.

Sample Request Payload

JSON
{
    "type": "RESOURCE_SYNC",
    "requestInfo": {
        "resources": [
            {
                "type": "table",
                "values": {
                    "catalog": "main",
                    "schema": "sales",
                    "table": "customers"
                }
            }
        ]
    }
}
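
The endpoint, headers, and payload described so far can be combined into a single curl call. This is a sketch under stated assumptions: submit_sync_task is a hypothetical helper, and the "Bearer" prefix in the Authorization header is inferred from the "JWT bearer token" wording in the Request Headers table rather than confirmed by it.

```shell
# Sketch: submit a RESOURCE_SYNC task with curl.
# submit_sync_task is a hypothetical helper; the "Bearer" scheme is an assumption.
submit_sync_task() {
  local host="$1"          # connector API server hostname from service-urls.txt
  local token_file="$2"    # path to the generated JWT token file
  local payload_file="$3"  # file containing the JSON request body
  local token

  # Drop whitespace and newlines so the token is safe inside a header value.
  token=$(tr -d '[:space:]' < "$token_file") || return 1
  [ -n "$token" ] || { echo "empty token file: $token_file" >&2; return 1; }

  # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
  curl --silent --show-error --fail \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer ${token}" \
    --data-binary @"$payload_file" \
    "https://${host}/policysync/api/v1/task"
}
```

Call it with the host copied from service-urls.txt, the token file path shown under Authentication, and a file holding the sample payload. On success, the response body is written to standard output.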

Payload Field Descriptions

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | String | Yes | Task type. Use RESOURCE_SYNC for resource synchronization |
| requestInfo.resources | Array | Yes | List of resources to sync |
| requestInfo.resources[].type | String | Yes | Resource type: catalog, schema, table, or view |
| requestInfo.resources[].values | Object | Yes | Resource identifiers (catalog, schema, table as applicable) |

Resource Types and Values

Catalog:

JSON
{
  "type": "catalog",
  "values": {
    "catalog": "main"
  }
}

Schema:

JSON
{
  "type": "schema",
  "values": {
    "catalog": "main",
    "schema": "sales"
  }
}

Table:

JSON
{
  "type": "table",
  "values": {
    "catalog": "main",
    "schema": "sales",
    "table": "customers"
  }
}

View:

JSON
{
  "type": "view",
  "values": {
    "catalog": "main",
    "schema": "sales",
    "table": "revenue_by_region"
  }
}

Multiple Resources Example

JSON
{
    "type": "RESOURCE_SYNC",
    "requestInfo": {
        "resources": [
            {
                "type": "catalog",
                "values": {
                    "catalog": "hr_catalog"
                }
            },
            {
                "type": "schema",
                "values": {
                    "catalog": "hr_catalog",
                    "schema": "employee_data"
                }
            },
            {
                "type": "table",
                "values": {
                    "catalog": "hr_catalog",
                    "schema": "employee_data",
                    "table": "employees"
                }
            }
        ]
    }
}
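
Nested payloads like the one above are easy to get wrong by hand. The sketch below builds them from arguments. resource_json and build_payload are hypothetical helpers, and the code assumes resource names contain only letters, digits, and underscores so that no JSON escaping is needed; any other name is rejected rather than escaped.

```shell
# Sketch: build a RESOURCE_SYNC payload from resource names.
# valid_name, resource_json, and build_payload are hypothetical helpers.
valid_name() {
  # Assumption: accept only letters, digits, and underscores.
  case "$1" in
    ''|*[!A-Za-z0-9_]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage: resource_json <type> <catalog> [schema] [table-or-view-name]
resource_json() {
  local type="$1"
  local catalog="${2:-}"
  local schema="${3:-}"
  local name="${4:-}"
  case "$type" in
    catalog)
      valid_name "$catalog" || return 1
      printf '{"type":"catalog","values":{"catalog":"%s"}}' "$catalog" ;;
    schema)
      valid_name "$catalog" && valid_name "$schema" || return 1
      printf '{"type":"schema","values":{"catalog":"%s","schema":"%s"}}' \
        "$catalog" "$schema" ;;
    table|view)
      valid_name "$catalog" && valid_name "$schema" && valid_name "$name" || return 1
      # Views pass their name in the "table" key, as in the View example.
      printf '{"type":"%s","values":{"catalog":"%s","schema":"%s","table":"%s"}}' \
        "$type" "$catalog" "$schema" "$name" ;;
    *)
      echo "unsupported resource type: $type" >&2
      return 1 ;;
  esac
}

# Usage: build_payload <resource-json>...
build_payload() {
  local joined=""
  local r
  for r in "$@"; do
    # An empty argument means an earlier resource_json call failed.
    [ -n "$r" ] || return 1
    joined="${joined:+${joined},}${r}"
  done
  printf '{"type":"RESOURCE_SYNC","requestInfo":{"resources":[%s]}}\n' "$joined"
}
```

For example, build_payload "$(resource_json catalog hr_catalog)" "$(resource_json table hr_catalog employee_data employees)" prints a compact, single-line payload with the same structure as the multiple-resources example.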

Optional Event-Style Fields

For parity with Event Hub payloads, you may include optional fields such as id, appType, appSubType, source, and createTime. See Configure Event-Driven On-Demand Sync for examples.

Retrieve Task Status

To check the status of a sync task, use the GET endpoint with the task ID returned from the POST call:

Text Only
GET https://<host>/policysync/api/v1/tasks/<taskId>

Example response:

JSON
{
    "taskId": "574057046565109672",
    "status": "SUCCESS",
    "message": "Sync completed successfully"
}

Task Status

At any point in its lifecycle, a sync task has one of the following statuses:

| Status | Description |
| --- | --- |
| NEW | Task has been created but not yet started |
| WAITING | Task is waiting to be processed |
| INPROGRESS | Sync task is currently being processed |
| SUCCESS | Sync completed successfully |
| FAILED | Sync failed due to an error |
| SKIPPED | Task skipped (for example, invalid request or unsupported resource type) |
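
Because a task moves through several statuses, a caller typically polls the GET endpoint until the task finishes. The following sketch shows one way to do that. wait_for_task and extract_status are hypothetical helpers. Treating SUCCESS, FAILED, and SKIPPED as the final statuses is an assumption drawn from the descriptions above, as are the "Bearer" scheme and the idea that the GET call accepts the same token as the POST call.

```shell
# Sketch: poll a sync task until it reaches a final status.
# extract_status and wait_for_task are hypothetical helpers.
extract_status() {
  # Pull the "status" string out of a small JSON response without requiring jq.
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage: wait_for_task <host> <token> <taskId> [max_attempts] [delay_seconds]
wait_for_task() {
  local host="$1"
  local token="$2"
  local task_id="$3"
  local max_attempts="${4:-30}"
  local delay="${5:-2}"
  local attempt=0
  local status

  while [ "$attempt" -lt "$max_attempts" ]; do
    # A failed request yields an empty status, so the loop simply retries.
    status=$(curl --silent --show-error --fail \
      --header "Authorization: Bearer ${token}" \
      "https://${host}/policysync/api/v1/tasks/${task_id}" | extract_status)
    case "$status" in
      SUCCESS) echo "$status"; return 0 ;;
      FAILED|SKIPPED) echo "$status"; return 1 ;;
    esac
    # NEW, WAITING, and INPROGRESS fall through to another attempt.
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "timed out waiting for task ${task_id}" >&2
  return 2
}
```

The function prints the final status and returns 0 for SUCCESS, 1 for FAILED or SKIPPED, and 2 if the task is still pending after the last attempt. The defaults of 30 attempts and a 2-second delay are arbitrary and should be tuned to how long syncs take in your environment.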