AWS Resource Loader Configuration – Lake Formation

This section describes the configuration settings for resource loading behavior and threading when operating in Lake Formation Push/Pull Mode. These settings are designed to optimize performance when working with large datasets in AWS Lake Formation.

Configuration Parameters

You can configure the following:

  • Resource loading mode
  • Optional toggles for tag and data location loading
  • Thread counts for each resource type when using multi-threaded loading

If these parameters are not specified, default values will be applied automatically, as described below.

Defaults

  • CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_KEY defaults to load_multi_thread.
  • CONNECTOR_LAKEFORMATION_LOAD_DATA_LOCATION_ENABLED defaults to "true".
  • CONNECTOR_LAKEFORMATION_LOAD_TAG_ENABLED defaults to "true".
  • CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_THREAD_POOL_WAIT_TIMEOUT_MINUTES defaults to "1200" (20 hours).
  • CONNECTOR_LAKEFORMATION_USE_THREAD_POOL_EXECUTOR_V2 defaults to "true".
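
The fallback behavior listed above can be illustrated with a small shell sketch. It is not how the connector itself resolves settings; `effective_value` is a hypothetical helper, and the temporary file stands in for a real vars file. It only shows which value would be in effect for a key, given the documented defaults.

```bash
#!/usr/bin/env bash
# Sketch: print the value set in a vars file, or the documented default
# when the key is absent. effective_value is a hypothetical helper and is
# not part of Privacera Manager. Trailing "# comments" are not handled.

effective_value() {
  local key="$1" default="$2" file="$3" line value
  # Use the last uncommented "KEY: value" line, if there is one.
  line=$(grep -E "^[[:space:]]*${key}:" "$file" | tail -n 1)
  if [ -z "$line" ]; then
    printf '%s\n' "$default"
    return
  fi
  value=${line#*:}   # drop everything up to and including the first colon
  # Trim surrounding whitespace, then strip one pair of double quotes.
  value=$(printf '%s' "$value" |
    sed -E 's/^[[:space:]]*//; s/[[:space:]]*$//; s/^"(.*)"$/\1/')
  printf '%s\n' "$value"
}

# Stand-in vars file that overrides a single setting.
vars_file=$(mktemp)
cat > "$vars_file" <<'EOF'
CONNECTOR_LAKEFORMATION_LOAD_TAG_ENABLED: "false"
EOF

effective_value CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_KEY load_multi_thread "$vars_file"   # prints: load_multi_thread
effective_value CONNECTOR_LAKEFORMATION_LOAD_TAG_ENABLED true "$vars_file"                  # prints: false
effective_value CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_THREAD_POOL_WAIT_TIMEOUT_MINUTES 1200 "$vars_file"   # prints: 1200

rm -f "$vars_file"
```

To inspect a real instance, pass the path of its push or pull vars file as the third argument instead of the temporary file.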

Setup

Warning

  • All configuration values must be entered as strings.
  • Thread-related parameters should be carefully tuned based on the volume of permissions data and the available system resources.

  1. SSH into the instance where Privacera Manager is installed.

  2. Open the Lake Formation connector configuration file:

    For Push Mode

    Note

    Replace instance1 with your actual connector instance name.

    Bash
    vi ~/privacera/privacera-manager/config/custom-vars/connectors/lakeformation/instance1/vars.connector.lakeformation.push.yml
    

    For Pull Mode

    Note

    Replace instance1 with your actual connector instance name.

    Bash
    vi ~/privacera/privacera-manager/config/custom-vars/connectors/lakeformation/instance1/vars.connector.lakeformation.pull.yml
    
  3. Configure the resource loader parameters as needed:

    YAML
    # Resource Loading Mode (optional – defaults to load_multi_thread)
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_KEY: "load_multi_thread"
    
    # Optional Toggles – default is "true" if not set
    CONNECTOR_LAKEFORMATION_LOAD_DATA_LOCATION_ENABLED: "true"
    CONNECTOR_LAKEFORMATION_LOAD_TAG_ENABLED: "true"
    
    # Thread Configuration – Database
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_DATABASE_THREADS: "2"
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_DATABASE_MIN_THREADS: "1"
    
    # Thread Configuration – Table
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_TABLE_THREADS: "3"
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_TABLE_MIN_THREADS: "1"
    
    # Thread Configuration – Tag
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_TAG_THREADS: "2"
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_TAG_MIN_THREADS: "1"
    
    # Thread Configuration – Data Location
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_DATA_LOCATION_THREADS: "2"
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_DATA_LOCATION_MIN_THREADS: "1"
    
    # Thread Pool Executor Configuration
    CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_THREAD_POOL_WAIT_TIMEOUT_MINUTES: "1200"
    CONNECTOR_LAKEFORMATION_USE_THREAD_POOL_EXECUTOR_V2: "true"
    

    Usage Guide

    • CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_KEY: Sets the resource loading mode. load_multi_thread is the default and is recommended for large datasets.
    • CONNECTOR_LAKEFORMATION_LOAD_DATA_LOCATION_ENABLED: Enables loading of data location resources. Defaults to "true" if not set. Supported only in multi-threaded mode.
    • CONNECTOR_LAKEFORMATION_LOAD_TAG_ENABLED: Enables loading of tag resources. Defaults to "true" if not set. Supported only in multi-threaded mode.
    • Thread properties: Specify the number of threads for loading each resource type. Tune according to system capabilities.
    • CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_THREAD_POOL_WAIT_TIMEOUT_MINUTES: Maximum wait time (in minutes) for resource loader thread pool tasks to complete. Defaults to "1200" (20 hours). Increase this value for large-scale resource loading operations.
    • CONNECTOR_LAKEFORMATION_USE_THREAD_POOL_EXECUTOR_V2: Enables Thread Pool Executor v2 for improved resource loading performance and better thread management. Set to "true" to use the enhanced thread pool executor. Defaults to "true".
  4. After updating the configuration, apply the changes by running:

    Step 1 - Run setup, which generates the Helm charts. This step usually takes a few minutes.

    Bash
    cd ~/privacera/privacera-manager
    ./privacera-manager.sh setup
    
    Step 2 - Apply the Privacera Manager Helm charts.

    Bash
    cd ~/privacera/privacera-manager
    ./pm_with_helm.sh upgrade
    
    Step 3 - (Optional) Run the post-installation step, which generates the plugin tarball, updates Route 53 DNS, and so on. This step is not required if you are updating only connector properties.

    Bash
    cd ~/privacera/privacera-manager
    ./privacera-manager.sh post-install
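
Because all configuration values must be entered as strings, a quick check of the vars file before running setup can catch unquoted values. The following is a minimal sketch under stated assumptions: `check_vars_file` is a hypothetical helper, not a Privacera Manager command, and the rule that thread counts must be whole numbers is an assumption made for illustration.

```bash
#!/usr/bin/env bash
# Sketch: flag CONNECTOR_LAKEFORMATION_* entries whose values are not
# double-quoted, and *_THREADS entries that are not whole numbers.
# check_vars_file is a hypothetical helper; the whole-number rule is an
# assumption. A trailing "# comment" after a value is reported as unquoted.

check_vars_file() {
  local file="$1" status=0 line key value inner
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      CONNECTOR_LAKEFORMATION_*:*) ;;   # only inspect connector settings
      *) continue ;;                    # skip comments and blank lines
    esac
    key=${line%%:*}
    # Everything after the first colon, with surrounding whitespace trimmed.
    value=$(printf '%s' "${line#*:}" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    case "$value" in
      \"*\") ;;
      *) echo "not quoted: $key"; status=1; continue ;;
    esac
    inner=${value#\"}
    inner=${inner%\"}
    case "$key" in
      *_THREADS)
        case "$inner" in
          ''|*[!0-9]*) echo "not a whole number: $key"; status=1 ;;
        esac ;;
    esac
  done < "$file"
  return "$status"
}

# Stand-in vars file with one valid entry and two problems.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CONNECTOR_LAKEFORMATION_LOAD_TAG_ENABLED: "true"
CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_TABLE_THREADS: 3
CONNECTOR_LAKEFORMATION_LOAD_RESOURCES_LOAD_TAG_THREADS: "two"
EOF

check_vars_file "$sample" || echo "fix the entries above before running setup"
rm -f "$sample"
```

The function returns a non-zero status when it finds a problem, so it could be placed in front of `./privacera-manager.sh setup` with `&&` to stop early.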