Databricks Unity Catalog Resource Loader Configuration

This section outlines the configuration settings for resource loading behavior and threading when using the Databricks Unity Catalog Connector. These settings help improve performance by enabling parallel loading of Unity Catalog entities such as schemas, tables, columns, functions, and volumes.

Connector Type

These settings apply only when the Databricks Unity Catalog connector is configured to use JDBC.

Configuration Parameters

You can configure the following:

  • Optional toggle to enable column loading
  • Thread counts for each supported resource type

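If a parameter is not specified, the connector falls back to a default value, which may vary with the system and connector version. For example, CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_COLUMNS defaults to "true".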

Setup

Warning

  • All configuration values must be entered as strings.
  • Thread-related parameters should be tuned based on the volume of metadata and available system resources.

  1. SSH into the instance where Privacera Manager is installed.

  2. Open the Unity Catalog connector configuration file:

    Note

    Replace instance1 with your actual connector instance name.

    Bash
    vi ~/privacera/privacera-manager/config/custom-vars/connectors/databricks_unity/instance1/vars.connector.databricks_unity.yml
    
  3. Configure the resource loader parameters as needed:

    YAML
    # Optional Toggle – enables loading of column metadata
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_COLUMNS: "true"
    
    # Thread Configuration – Schema
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_SCHEMA_THREADS: "2"
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_SCHEMA_MIN_THREADS: "2"
    
    # Thread Configuration – Table
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_TABLE_THREADS: "2"
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_TABLE_MIN_THREADS: "2"
    
    # Thread Configuration – Column
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_COLUMN_THREADS: "3"
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_COLUMN_MIN_THREADS: "3"
    
    # Thread Configuration – Function
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_FUNCTION_THREADS: "2"
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_FUNCTION_MIN_THREADS: "2"
    
    # Thread Configuration – Volume
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_VOLUME_THREADS: "2"
    CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_RESOURCES_LOAD_VOLUME_MIN_THREADS: "2"
    

    Usage Guide

    • CONNECTOR_DATABRICKS_UNITY_CATALOG_LOAD_COLUMNS: Enables loading of column metadata. Defaults to "true" if not set.
    • Thread properties: Control the number of threads used to load each resource type. Higher values can speed up loading at the cost of higher CPU and memory usage.
    • These properties only take effect if the connector is set up using JDBC.
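
Because every value must be entered as a string, it can help to scan the file for unquoted values before applying the changes. The following is a minimal sketch and not part of Privacera Manager: `check_quoted_values` is a hypothetical helper name, and the pattern assumes the simple one-line `KEY: value` layout shown in the YAML above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Privacera Manager).
# Prints every "KEY: value" line whose value does not start with a quote.
# Returns 0 if all values are quoted, 1 if any are unquoted,
# and 2 if the file cannot be read.
check_quoted_values() {
  local file="$1"
  if [ ! -r "$file" ]; then
    echo "cannot read: $file" >&2
    return 2
  fi
  # Match an upper-case key, a colon, optional spaces, then a first value
  # character that is not a quote, a comment marker, or whitespace.
  if grep -nE "^[[:space:]]*[A-Z0-9_]+:[[:space:]]*[^\"'#[:space:]]" "$file"; then
    return 1
  fi
  return 0
}
```

Calling `check_quoted_values ~/privacera/privacera-manager/config/custom-vars/connectors/databricks_unity/instance1/vars.connector.databricks_unity.yml` prints the line number and content of each unquoted entry, or prints nothing when every value is quoted.
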
  4. After updating the configuration, apply the changes by running:

    Step 1 - Run setup, which generates the Helm charts. This step usually takes a few minutes.

    Bash
    cd ~/privacera/privacera-manager
    ./privacera-manager.sh setup
    
    Step 2 - Apply the Privacera Manager Helm charts.

    Bash
    cd ~/privacera/privacera-manager
    ./pm_with_helm.sh upgrade
    
    Step 3 - Run the post-installation step, which generates the plugin tarball, updates Route 53 DNS, and so on.

    Bash
    cd ~/privacera/privacera-manager
    ./privacera-manager.sh post-install
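
The three steps above can also be chained so that a failure in one step prevents the next from running. This is a sketch only: `apply_connector_changes` is a hypothetical wrapper name, and the optional directory argument is an assumption that defaults to the path used in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with Privacera Manager).
# Runs setup, helm upgrade, and post-install in order and stops at the
# first step that fails. The body runs in a subshell, so the caller's
# working directory is left unchanged.
apply_connector_changes() (
  pm_dir="${1:-$HOME/privacera/privacera-manager}"
  cd "$pm_dir" || exit 1
  ./privacera-manager.sh setup &&
    ./pm_with_helm.sh upgrade &&
    ./privacera-manager.sh post-install
)
```

Running `apply_connector_changes` with no argument uses `~/privacera/privacera-manager`; a non-zero exit status means one of the steps failed and the later steps were skipped.
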
    
