Tag Sync for Collibra

This document provides an overview of the tag sync feature supported by Privacera for Collibra.

Tag Sync

| Topic | Detail |
| --- | --- |
| Integration methodology | Privacera PolicySync |
| Connector type | Tag-only sync connector (no access, user, permission, discovery, or encryption sync) |
| Source of tags | Collibra REST API 2.0 (`/rest/2.0/...`), HTTP Basic authentication |
| Sync targets | Apache Ranger tag services (and OMNI Metadata Service when OMNI is enabled) |
| PolicySync service type | `collibra` |
| Supported engines | Hive, Trino, Snowflake (one connector instance can target several at once) |
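
The table above lists HTTP Basic authentication against the Collibra REST API 2.0 as the tag source. As a minimal sketch of what such a request looks like on the wire (this is not the connector's code; the function name `build_collibra_request` and its parameters are hypothetical, and the resource path is left to the caller because the source does not name specific endpoints), the following builds a GET request with a standard Basic `Authorization` header using only the Python standard library:

```python
import base64
import urllib.request


def build_collibra_request(base_url: str, path: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request with HTTP Basic authentication.

    Hypothetical helper for illustration; the caller supplies the resource path.
    """
    if not path.startswith("/rest/2.0/"):
        raise ValueError("expected a path under /rest/2.0/")
    # HTTP Basic: the literal "Basic " followed by base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request
```

Because Basic authentication sends credentials that are only base64-encoded, not encrypted, a setup like this would normally be used over HTTPS.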

Supported Tag Sync Features

| Feature | Supported |
| --- | --- |
| Tag sync from Collibra to Ranger | Yes |
| Tag sync to OMNI Metadata Service | Yes (when OMNI is enabled) |
| Multi-engine sync (Hive/Trino/Snowflake) from one connector | Yes |
| Incremental sync (only changes pushed) | Yes |
| Tag removal (resource mappings) | Yes |
| Tag definition removal in Ranger | Yes |
| Scan-failure safeguards (no spurious deletes) | Yes |
| User / group / role sync | No |
| Permission / access policy sync | No |

What it does

Tag sync runs on a timer. On each run the connector:

1. Reads tags from Collibra for the assets under the connections you configure (connection → catalog/database → schema → table → column).
2. Compares what it found against the snapshot it stored at the end of the last successful run.
3. Pushes only the changes (new and updated tag mappings, and removals) to the correct Ranger tag service for each engine, and to OMNI Metadata Service when OMNI is enabled.

flowchart LR
    A[(Collibra<br/>assets + tags)] -->|read via REST 2.0| B[Collibra connector]
    B -->|diff vs last snapshot| B
    B -->|push only changes| C[(Ranger: privacera_hive)]
    B --> D[(Ranger: privacera_trino)]
    B --> E[(Ranger: privacera_snowflake)]
    B -.OMNI enabled.-> F[(OMNI Metadata Service)]
    C & D & E -->|tags available for| G[Tag-based policies<br/>enforced by access connectors]
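
The compare-and-push step described above can be illustrated with a small snapshot diff. This is a sketch under stated assumptions, not the connector's implementation: the snapshot is modeled as a plain dictionary from a resource path to a set of tag names, and the names `Snapshot`, `TagDelta`, and `diff_snapshots` are hypothetical.

```python
from typing import Dict, FrozenSet, NamedTuple, Set, Tuple

# Assumed model: a resource path such as ("sales", "orders", "email")
# mapped to the set of tag names attached to it.
ResourcePath = Tuple[str, ...]
Snapshot = Dict[ResourcePath, FrozenSet[str]]


class TagDelta(NamedTuple):
    upserts: Snapshot            # resources whose tag set is new or changed
    removals: Set[ResourcePath]  # resources that had tags before and have none now


def diff_snapshots(previous: Snapshot, current: Snapshot) -> TagDelta:
    """Return only the changes between the last stored snapshot and this run."""
    upserts = {
        res: tags
        for res, tags in current.items()
        if tags and previous.get(res) != tags
    }
    removals = {
        res
        for res, tags in previous.items()
        if tags and not current.get(res)
    }
    return TagDelta(upserts, removals)
```

In this model an unchanged resource appears in neither `upserts` nor `removals`, so a run with no changes in Collibra produces an empty delta and nothing is pushed.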

Multi-engine routing

A single connector instance routes tags to multiple Ranger services using two mapping properties:

- `collibra.connection.to.service.mapping`: maps each Collibra connection name to its default engine, for example `sbtConnection:trino`.
- `collibra.ranger.service.mapping`: maps each engine to its Ranger tag service, for example `hive:privacera_hive,trino:privacera_trino,snowflake:privacera_snowflake`.

A catalog whose name matches an engine key (for example, a catalog literally named `hive`) is routed to that engine automatically; any other catalog uses the connection's default engine. This lets one physical Collibra connection produce tags for more than one Ranger service in a single run.
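
The routing rule can be sketched as follows. This is an illustration only: it assumes the two properties are comma-separated `key:value` lists as in the examples above, that "engine key" means a key of `collibra.ranger.service.mapping`, and that matching is exact and case-sensitive. The helper names `parse_mapping` and `resolve_ranger_service` are hypothetical.

```python
from typing import Dict


def parse_mapping(value: str) -> Dict[str, str]:
    """Parse 'key:value,key:value' into a dict, trimming whitespace."""
    result = {}
    for item in value.split(","):
        if not item.strip():
            continue  # tolerate empty segments such as a trailing comma
        key, _, target = item.partition(":")
        result[key.strip()] = target.strip()
    return result


def resolve_ranger_service(
    connection: str,
    catalog: str,
    connection_to_engine: Dict[str, str],
    engine_to_service: Dict[str, str],
) -> str:
    """Pick the Ranger tag service for one catalog under one connection."""
    # A catalog named after an engine key overrides the connection default.
    if catalog in engine_to_service:
        engine = catalog
    else:
        engine = connection_to_engine[connection]
    return engine_to_service[engine]


connection_to_engine = parse_mapping("sbtConnection:trino")
engine_to_service = parse_mapping(
    "hive:privacera_hive,trino:privacera_trino,snowflake:privacera_snowflake"
)
```

With these mappings, a catalog named `hive` under `sbtConnection` resolves to `privacera_hive`, while a catalog with any other name (say, a hypothetical `sales`) falls back to the connection's default engine and resolves to `privacera_trino`.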

Where the tags end up

Tags are applied to Ranger tag services such as `privacera_hive` and resolved against the resource hierarchy of the matching engine:

- Hive: database / table / column
- Trino: catalog / schema / table / column
- Snowflake: database / schema / table / column
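
To make the per-engine hierarchies concrete, the sketch below maps an ordered asset path onto the level names listed above. It is an assumption-laden illustration: `ENGINE_LEVELS` and `to_resource` are hypothetical names, and the dictionary keys mirror the list above rather than any confirmed Ranger resource definition.

```python
from typing import Dict, Sequence

# Level names taken from the hierarchy list above, outermost first.
ENGINE_LEVELS = {
    "hive": ("database", "table", "column"),
    "trino": ("catalog", "schema", "table", "column"),
    "snowflake": ("database", "schema", "table", "column"),
}


def to_resource(engine: str, path: Sequence[str]) -> Dict[str, str]:
    """Label each element of an asset path with its level for the given engine."""
    levels = ENGINE_LEVELS[engine]
    if not 1 <= len(path) <= len(levels):
        raise ValueError(
            f"{engine} expects 1 to {len(levels)} path elements, got {len(path)}"
        )
    # zip stops at the shorter input, so a table-level path omits "column".
    return dict(zip(levels, path))
```

For example, the same three-element path describes a column in Hive but only a table in Trino, because Trino has the extra catalog level at the top.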

After the tags are in Ranger, you create tag-based policies and your existing access connectors enforce them on the underlying data.
