Tag Sync for Collibra

This document provides an overview of the tag sync feature supported by Privacera for Collibra.

Tag Sync

| Topic | Detail |
| --- | --- |
| Integration methodology | Privacera PolicySync |
| Connector type | Tag-only sync connector (no access, user, permission, discovery, or encryption sync) |
| Source of tags | Collibra REST API 2.0 (/rest/2.0/...), HTTP Basic authentication |
| Sync targets | Apache Ranger tag services (and OMNI Metadata Service when OMNI is enabled) |
| PolicySync service type | collibra |
| Supported engines | Hive, Trino, Snowflake (one connector instance can target several at once) |

Supported Tag Sync Features

| Feature | Supported |
| --- | --- |
| 🟢 Tag sync from Collibra to Ranger | Yes |
| 🟢 Tag sync to OMNI Metadata Service | Yes (when OMNI is enabled) |
| 🟢 Multi-engine sync (Hive/Trino/Snowflake) from one connector | Yes |
| 🟢 Incremental sync (only changes pushed) | Yes |
| 🟢 Tag removal (resource mappings) | Yes |
| 🟢 Tag definition removal in Ranger | Yes |
| 🟢 Scan-failure safeguards (no spurious deletes) | Yes |
| 🔴 User / group / role sync | No |
| 🔴 Permission / access policy sync | No |

What it does

Tag sync runs on a timer. On each run the connector:

  1. Reads tags from Collibra for the assets under the connections you configure (connection → catalog/database → schema → table → column).
  2. Compares what it found against the snapshot it stored at the end of the last successful run.
  3. Pushes only the changes — new and updated tag mappings, and removals — to the correct Ranger tag service for each engine, and to OMNI Metadata Service when OMNI is enabled.

```mermaid
flowchart LR
    A[(Collibra<br/>assets + tags)] -->|read via REST 2.0| B[Collibra connector]
    B -->|diff vs last snapshot| B
    B -->|push only changes| C[(Ranger: privacera_hive)]
    B --> D[(Ranger: privacera_trino)]
    B --> E[(Ranger: privacera_snowflake)]
    B -.OMNI enabled.-> F[(OMNI Metadata Service)]
    C & D & E -->|tags available for| G[Tag-based policies<br/>enforced by access connectors]
```
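
The compare-and-push steps above can be illustrated with a small sketch. It is not the connector's actual implementation: the snapshot shape (a dictionary keyed by a resource path and tag name, holding tag attributes), the function name `diff_snapshots`, and the sample resources and tags are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: (resource path, tag name) -> tag attributes.
MappingKey = Tuple[str, str]
Snapshot = Dict[MappingKey, Dict[str, str]]


def diff_snapshots(
    previous: Snapshot, current: Snapshot
) -> Tuple[Snapshot, Snapshot, List[MappingKey]]:
    """Return (added, updated, removed) between two snapshots."""
    # Mappings seen now that were absent from the last successful run.
    added = {k: v for k, v in current.items() if k not in previous}
    # Mappings present in both runs whose attributes changed.
    updated = {
        k: v for k, v in current.items() if k in previous and previous[k] != v
    }
    # Mappings that existed last time but were not found in this run.
    removed = sorted(k for k in previous if k not in current)
    return added, updated, removed


# Illustrative data only; these names do not come from a real deployment.
previous = {
    ("hive.sales.orders.email", "PII"): {"level": "high"},
    ("hive.sales.orders.id", "INTERNAL"): {},
}
current = {
    ("hive.sales.orders.email", "PII"): {"level": "medium"},
    ("hive.sales.orders.total", "FINANCIAL"): {},
}

added, updated, removed = diff_snapshots(previous, current)
print("added:", added)
print("updated:", updated)
print("removed:", removed)
```

Only the three result sets would need to be pushed; unchanged mappings produce no traffic. In this sketch, the current scan would replace the stored snapshot only after a successful run, matching the "last successful run" wording above.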

Multi-engine routing

A single connector instance routes tags to multiple Ranger services using two mapping properties:

  • collibra.connection.to.service.mapping — maps each Collibra connection name to its default engine, for example sbtConnection:trino.
  • collibra.ranger.service.mapping — maps each engine to its Ranger tag service, for example hive:privacera_hive,trino:privacera_trino,snowflake:privacera_snowflake.

A catalog whose name matches an engine key (for example a catalog literally named hive) is routed to that engine automatically; any other catalog uses the connection's default engine. This lets one physical Collibra connection produce tags for more than one Ranger service in a single run.
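
The routing rule can be sketched as follows, using the two example property values from this section. The helper names `parse_mapping` and `route`, the catalog name `analytics`, the assumption that multiple entries in both properties are comma-separated, and the case-sensitive name match are illustrative choices, not documented connector behavior.

```python
from typing import Dict


def parse_mapping(value: str) -> Dict[str, str]:
    """Parse a 'key:value,key:value' property string into a dict."""
    result: Dict[str, str] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas or blank entries
        key, _, val = entry.partition(":")
        result[key.strip()] = val.strip()
    return result


def route(
    connection: str,
    catalog: str,
    connection_to_engine: Dict[str, str],
    engine_to_service: Dict[str, str],
) -> str:
    """Return the Ranger tag service for a catalog under a connection."""
    # A catalog named after an engine key overrides the connection default.
    if catalog in engine_to_service:
        engine = catalog
    else:
        engine = connection_to_engine[connection]
    return engine_to_service[engine]


# Example values taken from the property descriptions above.
connection_to_engine = parse_mapping("sbtConnection:trino")
engine_to_service = parse_mapping(
    "hive:privacera_hive,trino:privacera_trino,snowflake:privacera_snowflake"
)

# Catalog literally named "hive" goes to the Hive tag service.
print(route("sbtConnection", "hive", connection_to_engine, engine_to_service))
# prints "privacera_hive"

# Any other catalog ("analytics" is hypothetical) uses the connection default.
print(route("sbtConnection", "analytics", connection_to_engine, engine_to_service))
# prints "privacera_trino"
```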

Where the tags end up

Tags are applied to Ranger tag services such as privacera_hive, and each tagged asset is resolved against the resource hierarchy of the engine it was routed to:

  • Hive — database / table / column
  • Trino — catalog / schema / table / column
  • Snowflake — database / schema / table / column
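
As an illustration of how an asset path lines up with each hierarchy, the sketch below pairs path segments with level names. The dictionary keys and the `to_resource` helper are hypothetical and are not the field names used by Ranger service definitions; the sample paths are made up.

```python
from typing import Dict, Sequence

# Level names per engine, in the order listed above.
HIERARCHY = {
    "hive": ("database", "table", "column"),
    "trino": ("catalog", "schema", "table", "column"),
    "snowflake": ("database", "schema", "table", "column"),
}


def to_resource(engine: str, parts: Sequence[str]) -> Dict[str, str]:
    """Pair each path segment with its level name for the given engine."""
    levels = HIERARCHY[engine]
    if not 1 <= len(parts) <= len(levels):
        raise ValueError(
            f"{engine} expects between 1 and {len(levels)} path segments, "
            f"got {len(parts)}"
        )
    # zip stops at the shorter input, so a table-level path has no column key.
    return dict(zip(levels, parts))


hive_column = to_resource("hive", ["sales", "orders", "email"])
trino_table = to_resource("trino", ["lake", "sales", "orders"])
print(hive_column)
print(trino_table)
```

The same three-segment path means different things per engine: database, table, and column in Hive, but catalog, schema, and table in Trino. This is why tags must be routed to the correct engine before they are resolved.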

After the tags are in Ranger, you create tag-based policies and your existing access connectors enforce them on the underlying data.